DexPatcher / multidexlib2

Multi-dex extensions for dexlib2
https://dexpatcher.github.io/
GNU General Public License v3.0

MultiDexIO: consecutive writes cause previous dex files to be dropped #7

Closed (auermich93 closed this issue 3 years ago)

auermich93 commented 3 years ago

Hi,

I'd like to move the discussion here, since the issues are related to multidexlib2. I tried to use the library as follows:

    File apkFile = new File(apkPath);

    // decode the APK file using apktool d -s
    decodedAPKPath = Utility.decodeAPK(apkFile);

    MultiDexContainer<? extends DexBackedDexFile> apk =
            MultiDexIO.readMultiDexContainer(true, new File(decodedAPKPath),
                    new BasicDexFileNamer(), null, null);

    // instruments the dex files, e.g. adds additional methods, etc.
    apk.getDexEntryNames().forEach(dexFile -> {
        try {
            instrument(apk.getEntry(dexFile).getDexFile(), dexFile, exclusionPattern);
        } catch (IOException e) {
            LOGGER.warn("Failure loading dexFile");
        }
    });

    // add flags to manifest, insert entire smali class, etc

    // build the APK using apktool
    Utility.buildAPK(decodedAPKPath, outputAPKFile);

I originally tried to specify the APK file directly, but this failed with an exception in the ByteStreamsHack class. Since I need to decode the APK anyway, this doesn't really matter. The APK contains two dex files (classes.dex and classes2.dex); the former has 64693 method references and the latter only about 3k. My instrumentation then inserts, among other things, missing lifecycle methods into every activity/fragment, which pushes classes.dex over the method reference limit.

The instrumentation procedure is similar to https://gist.github.com/JesusFreke/6945806. I basically request the MutableMethodImplementation objects and call addInstruction(), replaceInstruction(), etc. on them. I also insert entire methods by creating such objects myself. At the end of each call to instrument(), my code executes the following:

    // filePath refers to the decoded APK directory (apktool d -s, see above)
    // classes lists all classes that belong in the dex file (collected during instrumentation)
    // opCode is the API level passed to Opcodes.forApi()
    public static void writeToDexFile(String filePath, List<ClassDef> classes, int opCode) throws IOException {

        DexFile dexFile = new DexFile() {
            @Nonnull
            @Override
            public Set<? extends ClassDef> getClasses() {
                return new AbstractSet<ClassDef>() {
                    @Nonnull
                    @Override
                    public Iterator<ClassDef> iterator() {
                        return classes.iterator();
                    }

                    @Override
                    public int size() {
                        return classes.size();
                    }
                };
            }

            @Nonnull
            @Override
            public Opcodes getOpcodes() {
                return Opcodes.forApi(opCode);
            }
        };
        MultiDexIO.writeDexFile(true, new File(filePath), new BasicDexFileNamer(),
                dexFile, DexIO.DEFAULT_MAX_DEX_POOL_SIZE, null);
    }

Here again I have to specify a directory. A file does not work, which seems reasonable: the original classes.dex has to be split into two dex files because the method reference limit is hit. Although no exception occurs, the resulting APK contains only a single classes.dex file, which corresponds to the classes2.dex file of the original APK.

I inspected the directory after the first instrumentation round, and it contained classes.dex and classes2.dex as expected. However, instrumenting the original classes2.dex then overwrites the newly created dex files; the second call to MultiDexIO.writeDexFile() is responsible for this. Moreover, the MultiDexContainer object created by MultiDexIO.readMultiDexContainer() is in an invalid state at this point: iterating over the dex entries, for example, fails because the underlying dex files are no longer present.

Any suggestions? Are two consecutive writes via MultiDexIO.writeDexFile() to the same directory not allowed? Do I need to specify a different DexFileNamer object in this case?
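
For reference, the headroom left in the original classes.dex can be worked out from the numbers above. A dex file indexes methods with 16 bits, so it can hold at most 65,536 method references. The sketch below is plain arithmetic; the class and method names are hypothetical and not part of dexlib2 or multidexlib2.

```java
/** Hypothetical helper illustrating the dex method-reference budget. */
final class MethodRefBudget {
    /** A dex file uses 16-bit method indices, so at most 65,536 references fit. */
    static final int MAX_METHOD_REFS = 1 << 16;

    /** Number of method references that can still be added before the limit is hit. */
    static int headroom(int currentRefs) {
        return MAX_METHOD_REFS - currentRefs;
    }

    /** True if adding the given number of new references no longer fits in one dex file. */
    static boolean overflows(int currentRefs, int addedRefs) {
        return currentRefs + addedRefs > MAX_METHOD_REFS;
    }

    public static void main(String[] args) {
        // 64693 is the reference count reported for the original classes.dex.
        System.out.println("headroom: " + headroom(64693));
        System.out.println("overflows with 1000 more: " + overflows(64693, 1000));
    }
}
```

With 64693 references already present, only a few hundred more fit, so inserting lifecycle methods into every activity and fragment can plausibly exceed the limit.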

Lanchon commented 3 years ago

> I originally tried to specify the APK file directly, but this failed with an exception in the ByteStreamsHack class.

too bad you didn't report it. i suspect it's a usage issue. anyway...

> apk.getDexEntryNames().forEach

no, the apk IS a DexFile in itself! it contains the merged content of all dex files. you operate on your code as if the source was a mono dex.

when it's time to output, you write the mdex container with a single call passing a single DexFile.
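
A small in-memory model shows why a single write matters. It assumes that each write treats the target directory as the complete multidex output: stale dex files are dropped and numbering restarts at classes.dex. That assumption is consistent with the symptom reported above, but it is not taken from the multidexlib2 source. The sketch does not call the library, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** In-memory model of a dex output directory; it does not call multidexlib2. */
final class DexDirectoryModel {
    // Names of the dex files currently "on disk".
    private final TreeSet<String> files = new TreeSet<>();

    /** Android's multidex naming scheme: classes.dex, classes2.dex, classes3.dex, ... */
    static String nameFor(int index) {
        return index == 0 ? "classes.dex" : "classes" + (index + 1) + ".dex";
    }

    /**
     * Assumed write semantics: the call owns the whole directory, so existing
     * dex files are removed and numbering restarts at classes.dex.
     */
    void write(int dexFileCount) {
        files.clear();
        for (int i = 0; i < dexFileCount; i++) {
            files.add(nameFor(i));
        }
    }

    /** Sorted snapshot of the file names currently in the directory. */
    List<String> list() {
        return new ArrayList<>(files);
    }
}
```

Under this model, a first write that splits into two files leaves classes.dex and classes2.dex, and a second write of a small dex file leaves only classes.dex, which matches what was observed.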

Lanchon commented 3 years ago

the mdexlib2 readme contains this line:

> Sample: DexPatcher's file processor is a simple yet production-quality client of multidexlib2.

go follow the link and see how to use this lib.

you read the mdex like this: https://github.com/DexPatcher/dexpatcher-tool/blob/e301fb823a857e684c1dd4503a92fee30f966785/tool/src/main/java/lanchon/dexpatcher/Processor.java#L299

and you write the mdex like this: https://github.com/DexPatcher/dexpatcher-tool/blob/e301fb823a857e684c1dd4503a92fee30f966785/tool/src/main/java/lanchon/dexpatcher/Processor.java#L317-L318

it's that simple: just forget you're doing multidex at all.

one caveat: the output MUST be run on android 5 or later. android 4 will not run it.

to support android 4, ask me.

auermich93 commented 3 years ago

> > I originally tried to specify the APK file directly, but this failed with an exception in the ByteStreamsHack class.
>
> too bad you didn't report it. i suspect it's a usage issue. anyway...

That's the stack trace if you want to look into it:

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.io.ByteStreams.toByteArray(Ljava/io/InputStream;J)[B
    at com.google.common.io.ByteStreamsHack.toByteArray(ByteStreamsHack.java:30)
    at lanchon.multidexlib2.RawDexIO.readRawDexFile(RawDexIO.java:54)
    at lanchon.multidexlib2.ZipFileDexContainer.<init>(ZipFileDexContainer.java:48)
    at lanchon.multidexlib2.MultiDexIO.readMultiDexContainer(MultiDexIO.java:61)
    at lanchon.multidexlib2.MultiDexIO.readMultiDexContainer(MultiDexIO.java:48)
    at lanchon.multidexlib2.MultiDexIO.readMultiDexContainer(MultiDexIO.java:39)

> > apk.getDexEntryNames().forEach
>
> no, the apk IS a DexFile in itself! it contains the merged content of all dex files. you operate on your code as if the source was a mono dex.
>
> when it's time to output, you write the mdex container with a single call passing a single DexFile.

So, you're telling me that the forEach loop returns only a single dex entry (a merged dex file)? However, this is not the case: my instrumentation method gets called twice.

auermich93 commented 3 years ago

> the mdexlib2 readme contains this line:
>
> > Sample: DexPatcher's file processor is a simple yet production-quality client of multidexlib2.
>
> go follow the link and see how to use this lib.
>
> you read the mdex like this: https://github.com/DexPatcher/dexpatcher-tool/blob/e301fb823a857e684c1dd4503a92fee30f966785/tool/src/main/java/lanchon/dexpatcher/Processor.java#L299
>
> and you write the mdex like this: https://github.com/DexPatcher/dexpatcher-tool/blob/e301fb823a857e684c1dd4503a92fee30f966785/tool/src/main/java/lanchon/dexpatcher/Processor.java#L317-L318
>
> it's that simple: just forget you're doing multidex at all.
>
> one caveat: the output MUST be run on android 5 or later. android 4 will not run it.
>
> to support android 4, ask me.

OK, I will have a look at it. I don't think that android 4 support is necessary.

Lanchon commented 3 years ago

> That's the stack trace if you want to look into it:

you are replacing guava with some other version. this is causing the error.

> So, you're telling me that the forEach loop returns only a single dex entry (a merged dex file)

no. i told you the variable 'apk' ALREADY IS a DexFile. you just use it as a DexFile. you do not loop on anything.

Lanchon commented 3 years ago

just forget multidex. multidexlib2 read gives you A SINGLE unified DexFile. and you pass multidexlib2 write A SINGLE unified DexFile. just code your app as if only a single DexFile per apk existed. mdexlib2 does everything behind the scenes to support multidex I/O. it couldn't be any simpler.
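
The idea can be pictured with a toy model: reading merges the classes of all dex files into one unified list, the client edits that list, and a single write splits it again. This is only a sketch under simplifying assumptions. Class names stand in for ClassDef objects, the split uses a plain cap on classes per file rather than the pool-size limit that multidexlib2 is given (DexIO.DEFAULT_MAX_DEX_POOL_SIZE in the code above), and every name in the block is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the unified view; class names stand in for ClassDef objects. */
final class UnifiedDexModel {
    /** Read side: merge the classes of every dex file into one list. */
    static List<String> merge(List<List<String>> dexFiles) {
        List<String> unified = new ArrayList<>();
        for (List<String> dex : dexFiles) {
            unified.addAll(dex);
        }
        return unified;
    }

    /** Write side: split the unified list into dex files of at most maxPerDex classes. */
    static List<List<String>> split(List<String> unified, int maxPerDex) {
        if (maxPerDex <= 0) {
            throw new IllegalArgumentException("maxPerDex must be positive");
        }
        List<List<String>> out = new ArrayList<>();
        for (int start = 0; start < unified.size(); start += maxPerDex) {
            int end = Math.min(start + maxPerDex, unified.size());
            // Copy the slice so each output file owns its own list.
            out.add(new ArrayList<>(unified.subList(start, end)));
        }
        return out;
    }
}
```

The client code in the middle never needs to know how many dex files were read or how many will be written.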

auermich93 commented 3 years ago

> > That's the stack trace if you want to look into it:
>
> you are replacing guava with some other version. this is causing the error.

I don't understand this. I simply changed the file argument of MultiDexIO.readMultiDexContainer(). When I specify a directory it works; when I specify an APK it does not. I don't use guava at all (nor any alternative); at least I couldn't find any guava import statement or guava package in the locally included JAR files. Apart from that, I include log4j2 and your library via gradle.

> > So, you're telling me that the forEach loop returns only a single dex entry (a merged dex file)
>
> no. i told you the variable 'apk' ALREADY IS a DexFile. you just use it as a DexFile. you do not loop on anything.

I got it, the MultiDexContainer is simply misleading. One shouldn't use it at all with multidexlib2. Just out of curiosity, android 4 is not supported because it doesn't use ART? In that sense, would it be possible to split the dex files arbitrarily, since AOT compilation happens from android 5 onwards and the original dex files are solely an intermediate representation?

Lanchon commented 3 years ago

> I don't understand this.

i strongly suspect that a newer guava is being pulled in as somebody's transitive dependency. whatever the case, this will be fixed in the next release.
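
One way to check this suspicion is to ask the running JVM whether the class it actually loads declares the method named in the stack trace. The probe below is a hypothetical diagnostic, not part of multidexlib2 or guava; the class name and the (InputStream, long) signature are taken from the NoSuchMethodError above.

```java
import java.io.InputStream;

/** Hypothetical diagnostic: does the class found on the classpath declare a given method? */
final class ClasspathProbe {
    static boolean declaresMethod(String className, String methodName, Class<?>... parameterTypes) {
        try {
            // Load without initializing; only the declared methods are inspected.
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            // getDeclaredMethod also finds non-public methods.
            cls.getDeclaredMethod(methodName, parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Descriptor from the stack trace: toByteArray(Ljava/io/InputStream;J)[B
        System.out.println(declaresMethod(
                "com.google.common.io.ByteStreams", "toByteArray", InputStream.class, long.class));
    }
}
```

If this prints false in the failing project, the guava on the classpath either is missing or does not contain that method, which would fit the transitive-dependency explanation.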

> MultiDexContainer is simply misleading

no it's not, it provides you a unified view of the multidex (the one you should use) and also a per-dex-file view (for stuff like android 4 support).

> android 4 is not supported because it doesn't use ART?

yes. ART supports multidex natively, while Dalvik VM requires app magic, dynamic code loading, and a multidex support library. the magic cannot be solved automatically, meaning that supporting android <5 with multidex requires effort on your part for each and every case. i strongly discourage you from wasting your time on this.
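
In API-level terms, "android 5 or later" means API level 21 and up. A minimal sketch of that check, with hypothetical names that belong neither to multidexlib2 nor to the Android SDK:

```java
/** Hypothetical helper; not part of multidexlib2 or the Android SDK. */
final class MultidexSupport {
    /** Android 5.0 (Lollipop) is API level 21, the first release where ART is the default runtime. */
    static final int FIRST_ART_DEFAULT_API_LEVEL = 21;

    /** True if a device at this API level can run multidex output without extra app-side work. */
    static boolean runsMultidexNatively(int apiLevel) {
        return apiLevel >= FIRST_ART_DEFAULT_API_LEVEL;
    }
}
```

A tool that rewrites APKs could use a check like this against the app's minimum SDK version to warn before producing output that older devices cannot run.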