ThexXTURBOXx / dex2jar

Tools to work with android .dex and java .class files
Apache License 2.0

UTF8 string too large on v53 #25

Closed aeongdesu closed 1 month ago

aeongdesu commented 2 years ago
ubuntu@fv48:~$ sh dex-tools-2.2-SNAPSHOT/d2j-dex2jar.sh pico.apk
dex2jar pico.apk -> ./pico-dex2jar.jar
GLITCH: 0004 L_a/a/_a;.<init>()V | not enough space for reading instruction
GLITCH: 000c L_/1;.<init>()V | not enough space for reading instruction
Applying workaround to method L_a/a/_a;#___a with original signature null by changing its types to java.lang.Object.
java.lang.IllegalArgumentException
        at org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:213)
        at org.objectweb.asm.ClassWriter.newUTF8(ClassWriter.java:1092)
        at org.objectweb.asm.MethodWriter.<init>(MethodWriter.java:469)
        at org.objectweb.asm.ClassWriter.visitMethod(ClassWriter.java:793)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:305)
        at org.objectweb.asm.commons.RemappingClassAdapter.visitMethod(RemappingClassAdapter.java:99)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:305)
        at com.googlecode.d2j.dex.Dex2Asm.collectBasicMethodInfo(Dex2Asm.java:285)
        at com.googlecode.d2j.dex.Dex2Asm.convertMethod(Dex2Asm.java:611)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:469)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:380)
        at com.googlecode.d2j.dex.Dex2Asm.convertDex(Dex2Asm.java:508)
        at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:180)
        at com.googlecode.d2j.dex.Dex2jar.to(Dex2jar.java:280)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.doCommandLine(Dex2jarCmd.java:112)
        at com.googlecode.dex2jar.tools.BaseCmd.doMain(BaseCmd.java:290)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.main(Dex2jarCmd.java:33)

ubuntu@fv48:~$ java --version
openjdk 11.0.16.1 2022-08-12
OpenJDK Runtime Environment Temurin-11.0.16.1+1 (build 11.0.16.1+1)
OpenJDK 64-Bit Server VM Temurin-11.0.16.1+1 (build 11.0.16.1+1, mixed mode)

apk file: https://cdn.discordapp.com/attachments/757985854588190731/1014526587410075708/vr-_01.20.00.apk

I tried to convert dex -> Java, but it showed the errors above. It didn't work with pxb1988's dex2jar either.

I don't know much about this, but is it possible to fix?

ThexXTURBOXx commented 2 years ago

Error reproducible on my end. However, I have a slightly different stacktrace (are you really using v53?):

dex2jar vr-_01.20.00.apk -> .\vr-_01.20.00-dex2jar.jar
GLITCH: 0004 L_a/a/_a;-><init>()V | not enough space for reading instruction
GLITCH: 000c L_/1;-><init>()V | not enough space for reading instruction
java.lang.IllegalArgumentException: UTF8 string too large
        at org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:255)
        at org.objectweb.asm.SymbolTable.addConstantUtf8(SymbolTable.java:774)
        at org.objectweb.asm.MethodWriter.<init>(MethodWriter.java:601)
        at org.objectweb.asm.ClassWriter.visitMethod(ClassWriter.java:468)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:365)
        at org.objectweb.asm.commons.ClassRemapper.visitMethod(ClassRemapper.java:187)
        at org.objectweb.asm.ClassVisitor.visitMethod(ClassVisitor.java:365)
        at com.googlecode.d2j.dex.Dex2Asm.collectBasicMethodInfo(Dex2Asm.java:352)
        at com.googlecode.d2j.dex.Dex2Asm.convertMethod(Dex2Asm.java:746)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:549)
        at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:450)
        at com.googlecode.d2j.dex.Dex2Asm.convertDex(Dex2Asm.java:615)
        at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:146)
        at com.googlecode.d2j.dex.Dex2jar.to(Dex2jar.java:246)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.doCommandLine(Dex2jarCmd.java:103)
        at com.googlecode.dex2jar.tools.BaseCmd.doMain(BaseCmd.java:297)
        at com.googlecode.dex2jar.tools.Dex2jarCmd.main(Dex2jarCmd.java:16)

I will see if I can do anything about that

aeongdesu commented 2 years ago

@ThexXTURBOXx oh well... i was confused with original snapshot version 😔 anyways thank you, I'll wait

stefan123t commented 1 year ago

@ThexXTURBOXx thanks for maintaining this toolchain! I found that your repo already has an open issue for the UTF8 string too large error, which is not the case upstream.

I used this APK com.hm.hemaiInstall1 v1.1.10: https://apkpure.com/s-miles-installer/com.hm.hemaiInstall1/versions

As the application I want to decompile is built in China and contains quite a few UTF-8 Unicode strings, I guess that the handling of UTF-8 Unicode is not yet working. Also, when further decompiling the resulting jar files with jd-gui, I do see quite a few Chinese characters, but they cannot be copied from the resulting source. Some non-space characters may be included in such strings.

$ sh d2j-dex2jar.sh -f s-miles.apk 
dex2jar s-miles.apk -> ./s-miles-dex2jar.jar
java.lang.IllegalArgumentException: UTF8 string too large
    at org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:255)
    at org.objectweb.asm.SymbolTable.addConstantUtf8(SymbolTable.java:774)
    at org.objectweb.asm.SymbolTable.addConstantUtf8Reference(SymbolTable.java:1007)
    at org.objectweb.asm.SymbolTable.addConstantString(SymbolTable.java:604)
    at org.objectweb.asm.SymbolTable.addConstant(SymbolTable.java:474)
    at org.objectweb.asm.MethodWriter.visitLdcInsn(MethodWriter.java:1280)
    at org.objectweb.asm.MethodVisitor.visitLdcInsn(MethodVisitor.java:562)
    at org.objectweb.asm.commons.MethodRemapper.visitLdcInsn(MethodRemapper.java:196)
    at org.objectweb.asm.tree.LdcInsnNode.accept(LdcInsnNode.java:75)
    at org.objectweb.asm.tree.InsnList.accept(InsnList.java:144)
    at org.objectweb.asm.tree.MethodNode.accept(MethodNode.java:749)
    at com.googlecode.d2j.dex.ExDex2Asm.convertCode(ExDex2Asm.java:36)
    at com.googlecode.d2j.dex.Dex2jar$2.convertCode(Dex2jar.java:126)
    at com.googlecode.d2j.dex.Dex2Asm.convertMethod(Dex2Asm.java:821)
    at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:567)
    at com.googlecode.d2j.dex.Dex2Asm.convertClass(Dex2Asm.java:468)
    at com.googlecode.d2j.dex.Dex2Asm.convertDex(Dex2Asm.java:633)
    at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:181)
    at com.googlecode.d2j.dex.Dex2jar.doTranslate(Dex2jar.java:53)
    at com.googlecode.d2j.dex.Dex2jar.to(Dex2jar.java:281)
    at com.googlecode.dex2jar.tools.Dex2jarCmd.doCommandLine(Dex2jarCmd.java:104)
    at com.googlecode.dex2jar.tools.BaseCmd.doMain(BaseCmd.java:297)
    at com.googlecode.dex2jar.tools.Dex2jarCmd.main(Dex2jarCmd.java:16)

ThexXTURBOXx commented 1 year ago

I have taken a closer look at this issue and, actually, string constants are limited to an encoded size of 65535 bytes: https://gitlab.ow2.org/asm/asm/-/blob/master/asm/src/main/java/org/objectweb/asm/ByteVector.java?ref_type=heads#L254 This is also specified in the Java Virtual Machine Specification, where the length field of a UTF-8 constant-pool entry is 16 bits wide. I have pushed a workaround which still gives proper output for the rest of the files, but skips all files which have problems.
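
To make the limit concrete, here is a minimal sketch, assuming the limit in question is the 16-bit length of a constant-pool UTF-8 entry, counted in modified UTF-8 bytes rather than characters. The names `Utf8Limit`, `modifiedUtf8Length`, `fitsInConstantPool` and `repeat` are hypothetical and exist neither in dex2jar nor in ASM.

```java
import java.util.Arrays;

// Hypothetical helper, not part of dex2jar or ASM.
public final class Utf8Limit {
    /** Largest byte length that fits into a 16-bit length field. */
    public static final int MAX_UTF8_BYTES = 65535;

    /** Byte length of s in the JVM's modified UTF-8 encoding. */
    public static int modifiedUtf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                bytes += 1;            // plain ASCII except NUL
            } else if (c <= 0x07FF) {
                bytes += 2;            // includes U+0000, which takes two bytes
            } else {
                bytes += 3;            // everything else; surrogates are encoded one by one
            }
        }
        return bytes;
    }

    /** True if s can be stored as a single string constant. */
    public static boolean fitsInConstantPool(String s) {
        return modifiedUtf8Length(s) <= MAX_UTF8_BYTES;
    }

    /** Builds a string of n copies of c (avoids String.repeat, which needs Java 11). */
    public static String repeat(char c, int n) {
        char[] buf = new char[n];
        Arrays.fill(buf, c);
        return new String(buf);
    }

    public static void main(String[] args) {
        // 65535 ASCII characters need 65535 bytes and still fit.
        System.out.println(fitsInConstantPool(repeat('a', 65535)));
        // 21846 CJK characters need 3 bytes each (65538 bytes) and do not fit.
        System.out.println(fitsInConstantPool(repeat('\u4e2d', 21846)));
    }
}
```

Under these assumptions, a string of mostly Chinese text can hit the limit at roughly a third of the character count of an ASCII string, which would fit the reports above.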

Rabbit0w0 commented 1 month ago

Please update the link. It's no longer working. PS: Anyone with a sample is welcome to send it to my email.

ThexXTURBOXx commented 1 month ago

@Rabbit0w0 Still works on my end: https://apkpure.com/s-miles-installer/com.hm.hemaiInstall1/downloading/V1.1.10

stefan123t commented 1 month ago

@ThexXTURBOXx @Rabbit0w0 can we maybe switch from String to Stream in order to fix the length limitation? I have no clue about the location in the source; it is just my thought on what to do when we expect such lengthy in-memory strings in Java.

ThexXTURBOXx commented 1 month ago

The core of the problem is that the final destination within the Java bytecode is affected by the length limitation. A real fix would need to somehow get around this. MAYBE, splitting a large string into multiple strings, each with length <= 65535, works. However, this might introduce even more errors regarding the limit of data within a single Java bytecode class file. The same is true for converting the strings to byte arrays first and converting these to strings at runtime. Something that definitely works is splitting strings across multiple classes and concatenating them at runtime, but this would require a major rewrite of the whole dex2jar system, which I do not have the time for, sadly. Maybe @Rabbit0w0 comes up with some better solution, though!

Rabbit0w0 commented 1 month ago

I see. There is a long constant string which asm cannot write to standard Java bytecode. There is no need to rewrite the whole project. We just need to add a preprocessor which splits the long string and then concatenates the pieces using StringBuilder (the way that Java 8 and lower does it). I will try to write a workaround for this. @ThexXTURBOXx Could you please add a submodule/package for preprocessors, as I am afraid that my coding style does not fit this project?
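
The split-and-concatenate idea can be sketched as follows. This is only an illustration under stated assumptions, not the actual workaround: it splits on a modified UTF-8 byte budget and rejoins the pieces with StringBuilder at runtime. `LongStringSplitter`, `split` and `concat` are hypothetical names, and the real preprocessor would have to emit the equivalent bytecode rather than run this code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of dex2jar.
public final class LongStringSplitter {
    /** Largest byte length of a single string constant. */
    public static final int MAX_UTF8_BYTES = 65535;

    /** Splits s into chunks whose modified UTF-8 length is at most maxBytes each. */
    public static List<String> split(String s, int maxBytes) {
        if (maxBytes < 3) {
            throw new IllegalArgumentException("maxBytes must be at least 3");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;   // index of the first char of the current chunk
        int bytes = 0;   // encoded size of the current chunk so far
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int width = (c >= 0x0001 && c <= 0x007F) ? 1 : (c <= 0x07FF ? 2 : 3);
            if (bytes + width > maxBytes) {
                // Current chunk is full: close it and start a new one at i.
                chunks.add(s.substring(start, i));
                start = i;
                bytes = 0;
            }
            bytes += width;
        }
        chunks.add(s.substring(start));
        return chunks;
    }

    /** Rebuilds the original string, as the generated code would do at runtime. */
    public static String concat(List<String> chunks) {
        StringBuilder sb = new StringBuilder();
        for (String chunk : chunks) {
            sb.append(chunk);
        }
        return sb.toString();
    }
}
```

A chunk boundary may fall between the two halves of a surrogate pair. That should be harmless here, because modified UTF-8 encodes each half separately and the concatenation restores the original char sequence. The sketch does not address the other class-file limits mentioned earlier in the thread.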

ThexXTURBOXx commented 1 month ago

I am a bit hesitant to add more submodules. In my opinion, it is fine if you just write the code, create a PR and I fix the style afterwards :)

Rabbit0w0 commented 1 month ago

Okay then



stefan123t commented 1 month ago

You are both awesome !