Storyyeller / Krakatau

Java decompiler, assembler, and disassembler
GNU General Public License v3.0

Re-assembled class fails to load (java/lang/ClassFormatError) #177

Closed: minexew closed this issue 4 years ago

minexew commented 4 years ago

I am working with a JAR file of a commercial J2ME game. With -roundtrip, everything works fine, but without it the reassembled classes fail to load.

My workflow is as follows:

python Krakatau/disassemble.py -out $OUT $JAR | tee disassemble.log
python Krakatau/assemble.py -out $NEWJAR -r $OUT | tee assemble.log

and then manually adding back non-class resources to $NEWJAR.

However, upon trying to load the MIDlet in $NEWJAR using a J2ME emulator (Sony Ericsson SDK 2.5.0.6, Sun Java 1.6.0_45 on Windows 7 32-bit), I get this error message:

ALERT: java/lang/ClassFormatError: Bad constant tag.

followed by a low-level dump of the VM stack and locals plus some statistics, but no indication of the class name or the exact "bad" tag value. Not surprisingly, the crash occurs in com/sun/midp/midlet/MIDletState.createMIDlet.

I would be grateful for any tips on how to debug this.

minexew commented 4 years ago

The bytecode is version 45.3 (the disassembly shows .version 45 3).

Storyyeller commented 4 years ago

You can try to narrow down which class is at issue by reassembling only half the classes at a time and doing a binary search. (Or do them one at a time instead, if that would be faster.)

Once you've figured out which classfile is causing problems, please upload it and I can see if anything seems amiss. Alternatively, if you're not able to upload the code in question, you could diff the reassembled classfile against the original and try to figure it out yourself that way.
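The binary search suggested above could be sketched roughly as follows. This is only an illustration: `find_failing_class` and the `loads_ok` callback are hypothetical names, not part of Krakatau. `loads_ok` stands in for whatever manual or scripted step builds a JAR in which only the given subset of classes comes from the reassembled output (the rest from the original JAR) and reports whether the emulator loads it. The sketch assumes at least one class is at fault and that each faulty class fails on its own.

```python
from typing import Callable, List, Sequence


def find_failing_class(
    classes: Sequence[str],
    loads_ok: Callable[[List[str]], bool],
) -> str:
    """Return one class whose reassembled version breaks loading.

    `loads_ok(subset)` is a hypothetical callback: it should return True
    if the JAR still loads when only `subset` is replaced by reassembled
    classes, and False otherwise.
    """
    candidates = list(classes)
    if not candidates:
        raise ValueError("no classes to search")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if loads_ok(first_half):
            # The first half is fine, so a culprit is in the second half.
            candidates = candidates[middle:]
        else:
            # The first half already fails, so keep narrowing it down.
            candidates = first_half
    return candidates[0]
```

With about 100 classes this needs roughly seven emulator runs instead of up to 100.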

minexew commented 4 years ago

disassemble.py outputs a lot of these, by the way:

processing target b'a.class', 100/101 remaining
Nonstandard attribute b'StackMap' 197
Nonstandard attribute b'StackMap' 110

So far that's my only clue. But I don't see how it could be related to a constant tag.

Storyyeller commented 4 years ago

It's a warning that the class contains attributes with unknown meaning and format, and thus that there's a risk of breaking things if you modify the class (including reassembly in non-roundtrip mode). You'll see similar things with Scala code. I'd recommend trying to figure out what system defines the StackMap attribute and what it is used for.

minexew commented 4 years ago

I'm onto something. StackMap seems to be specific to the CLDC profile, and I think it might be enough to preverify the code again after re-assembly.

minexew commented 4 years ago

Indeed, the problem is that the StackMap refers to classes by indexes in the constant pool, and these get messed up when re-assembling, even if the code itself stays the same.
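To see the index shift concretely, a small script can list which constant pool index each class reference occupies, so the original and the reassembled class file can be compared. The following is a minimal sketch, not Krakatau code: `class_entries` is a hypothetical helper, it reads only the constant pool, and it decodes names as plain UTF-8 even though class files use a modified UTF-8 encoding.

```python
import struct
from typing import Dict

# Payload size in bytes for each fixed-size constant pool tag.
_FIXED_SIZES = {
    3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4, 12: 4,
    15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2,
}


def class_entries(data: bytes) -> Dict[int, str]:
    """Map constant pool index -> class name for every CONSTANT_Class entry."""
    magic, _minor, _major, count = struct.unpack_from(">IHHH", data, 0)
    if magic != 0xCAFEBABE:
        raise ValueError("not a class file")
    pos = 10  # first constant pool entry follows the 10-byte header
    utf8 = {}
    class_name_index = {}
    index = 1  # constant pool indices start at 1
    while index < count:
        tag = data[pos]
        pos += 1
        if tag == 1:  # CONSTANT_Utf8: u2 length followed by the bytes
            (length,) = struct.unpack_from(">H", data, pos)
            raw = data[pos + 2:pos + 2 + length]
            # Assumption: plain UTF-8 is close enough for typical class names.
            utf8[index] = raw.decode("utf-8", "replace")
            pos += 2 + length
        elif tag in _FIXED_SIZES:
            if tag == 7:  # CONSTANT_Class: u2 index of the name's Utf8 entry
                (class_name_index[index],) = struct.unpack_from(">H", data, pos)
            pos += _FIXED_SIZES[tag]
            if tag in (5, 6):  # Long and Double occupy two pool slots
                index += 1
        else:
            raise ValueError("bad constant tag %d at index %d" % (tag, index))
        index += 1
    return {i: utf8[n] for i, n in class_name_index.items()}
```

Running this on the original and the reassembled version of the same class and comparing the two dictionaries shows whether a class name has moved to a different index. Any StackMap entry that still holds the old index would then point at the wrong constant.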

Clearly, it is unrealistic to expect Krakatau to do a re-verification after reassembly (just the description of the algorithm is dozens of pages), so to protect the users I would suggest the following behavior:

If you agree, I can open a PR.

Storyyeller commented 4 years ago

IMO, the current behavior makes more sense. Keep in mind that Krakatau isn't specifically designed for the CLDC. It has to work with any classfile, whether it is normal Java, CLDC, or comes from Scala or whatever.

In cases where it is relevant, the user has to decide how to handle custom attributes themselves, but Krakatau does print out a warning to alert them that this is the case.

minexew commented 4 years ago

As far as I understand, the "StackMap" attribute is unique to CLDC preverification.

The warning could be improved to at least convey the fact that an attempt at re-assembly will produce code broken in a non-obvious way. Otherwise, the claim that "this tool can roundtrip any class through assembly and back into an equivalent class" is a bit misleading :)

Storyyeller commented 4 years ago

It's true for the JVM, with some caveats about reflection-type APIs. But people can assign any semantics they want to non-standard attributes. You'd have the exact same issue if you were trying to modify compiled Scala code.

minexew commented 4 years ago

I guess it's a matter of point of view, but why do you say non-standard? It's defined in JSR 139, Appendix 1.

Storyyeller commented 4 years ago

I went ahead and placed a warning in the readme about this issue.