Storyyeller / Krakatau

Java decompiler, assembler, and disassembler
GNU General Public License v3.0

Krakatau has huge performance problems with certain obfuscation #133

Open Janmm14 opened 6 years ago

Janmm14 commented 6 years ago

Rename the samples from .zip to .jar.
Original sample: sample.zip
Preprocessed sample with the nonsense removed: sample-preprocessed.zip

I tried running with the original and with the preprocessed sample.

Command: pypy decompile -path spigot-1.8.8.jar;rt.jar;jce.jar -out out/ -nauto -skip -xmagicthrow

Output:

onEnable()V is in class FqMTHFBILjMwOWY4VD5EWV1E9MZLMUY8EN2A9PDCK6GAEDBCNC2DT2FZ6VPO91WQHEKJB4H7KXSYYRVDYF6X0NELJU06B9FSBJ4VPHJ3A0EESSQDA7A3YGCHN6F2MLJVb

Decompiling method onEnable ()V
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 14 blocks)
Duplicating 4 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 5 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 14 blocks)
Duplicating 4 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 14 blocks)
Duplicating 4 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 57 blocks)
Duplicating 7 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 14 blocks)
Duplicating 4 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 43 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 5 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 14 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 5 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 5 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 23 blocks)
Duplicating 7 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 7 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 222 blocks)
Duplicating 4 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 15 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 13 blocks)
Duplicating 6 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 5 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 17 blocks)
Duplicating 5 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 14 blocks)
Duplicating 7 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 3 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 6 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (3 entry points, 4 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 7 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 7 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 7 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 7 blocks)
Duplicating 3 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 92 blocks)
Duplicating 15 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 9 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 70 blocks)
Duplicating 50 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 50 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 4 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 48 blocks)
Duplicating 28 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 4 blocks)
Duplicating 2 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 26 blocks)
Duplicating 10 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 26 blocks)
Duplicating 17 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 25 blocks)
Duplicating 5 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 5 blocks)
Duplicating 1 nodes
Warning, multiple entry point loop detected. Generated code may be extremely large (2 entry points, 4 blocks)
Duplicating 2 nodes

Then nothing happens for more than 10 minutes (it's still running). Removing the -xmagicthrow flag does not help.

Janmm14 commented 6 years ago

Update: It just finished after ~1937 seconds aka ~33 minutes.

I would like to see some performance improvements here if possible.

By the way, the decompiled output of that one class is ~7.1 MB.

Storyyeller commented 6 years ago

I'll take a look when I have time to see if there are any obvious improvements, but I can't guarantee it. There are some things that are inherently difficult to decompile, and there are some known performance issues with Krakatau that can't be fixed without a major rewrite.

At any rate, it does at least print out warnings in this case that the decompilation may be slow.

Storyyeller commented 6 years ago

FYI, your preprocessing appears to have silently corrupted the jar. For example, the field names and types are incorrect in the preprocessed jar.

Storyyeller commented 6 years ago

I looked through the code you posted, and I think this is a mixture of WAI (working as intended) and WNF (will not fix).

The Krakatau decompiler only does local analysis. If the code contains stuff like if (false), that will get optimized away by the decompiler, but it deliberately does not do any higher level interprocedural analysis. A lot of obfuscators will insert fake control flow with "opaque predicates", where they do stuff like if (foo) and foo is a field that turns out to always be false at runtime. Unfortunately, there is no way to recognize these automatically; it requires human judgement and analysis, or at least deobfuscation tools specific to the obfuscator being used.

In the particular case of the code you posted, there is one major flag used for fake control flow. JVc.JVa is only incremented when JVb.JVi is true, and JVb.JVi is only toggled when JVc.JVa is nonzero. Therefore, neither can ever change: JVc.JVa is always zero and JVb.JVi is always false.
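The argument above can be checked mechanically with a small state-space search. This is only an illustrative sketch, not something from the thread: `counter` stands in for JVc.JVa and `toggle` for JVb.JVi, and it assumes both start at their Java default values (0 and false) and that the two guarded writes are the only writes to these fields.

```python
from collections import deque

def reachable_states(initial=(0, False)):
    """Return every (counter, toggle) state reachable from `initial`.

    counter models JVc.JVa and toggle models JVb.JVi (assumed names/roles).
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        counter, toggle = queue.popleft()
        successors = []
        if toggle:
            # The counter is only incremented while the toggle is true.
            # It is capped at 3 purely to keep the search finite.
            successors.append((min(counter + 1, 3), toggle))
        if counter != 0:
            # The toggle is only flipped while the counter is nonzero.
            successors.append((counter, not toggle))
        for state in successors:
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return seen

print(reachable_states())  # {(0, False)}
```

Starting from (0, False), neither guard can ever fire, so the initial state is the only reachable one. A decompiler that only looks at one method at a time cannot see this, because the proof depends on every write to both fields across the whole program.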

After I patched reads of JVc.JVa to return 0, the decompilation time went from 21 minutes to under 2 minutes. I attached the patched version as well as the decompilation. The nice thing about Krakatau's local optimizations is that it means that all you need to do to clean up the code is to replace the opaque predicates with their constant values, and the optimizer will take care of the rest automatically.
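As a rough analogy for why patching the flag is enough, here is a toy constant-branch folder built on Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is not Krakatau's optimizer and works on Python source rather than Java bytecode; the class name, the `known_values` map, and the sample snippet are all made up for illustration.

```python
import ast

class FoldConstantBranches(ast.NodeTransformer):
    """Drop `if` branches whose condition is a literal constant."""

    def __init__(self, known_values=None):
        # Hypothetical map from a variable name to a value supplied by a human
        # who has worked out that the variable never changes at runtime.
        self.known_values = known_values or {}

    def visit_Name(self, node):
        # Replace reads of a known variable with its constant value.
        if isinstance(node.ctx, ast.Load) and node.id in self.known_values:
            constant = ast.Constant(value=self.known_values[node.id], kind=None)
            return ast.copy_location(constant, node)
        return node

    def visit_If(self, node):
        self.generic_visit(node)  # substitute names and fold nested ifs first
        if isinstance(node.test, ast.Constant):
            taken = node.body if node.test.value else node.orelse
            return taken or [ast.Pass()]  # keep the enclosing block non-empty
        return node

def simplify(source, known_values=None):
    tree = FoldConstantBranches(known_values).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))

SOURCE = """
if False:
    dead_code()
if jva:
    fake_branch()
real_work()
"""

print(simplify(SOURCE))              # literal `if False` is gone, `if jva` remains
print(simplify(SOURCE, {"jva": 0}))  # with the flag patched, the fake branch is gone too
```

Purely local folding removes the literal `if False` but has to keep `if jva`, because nothing in the snippet itself says what `jva` is. Once a human supplies the constant, the same local pass removes the fake branch with no extra work.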

There is one more opaque branch left in onEnable, with the field JVf. Unfortunately, the value assigned to it comes from reflection, so I can't tell offhand which it should be, but I took the liberty of trying it both ways and including the results. The files JVF_is55.java and JVF_not55.java contain the simplified code assuming the flag is 55 or not 55 respectively. Of course, there's still a lot of reflection and string encryption in the code, but I hope this provides a good start.

Apart from that, there's still a massive amount of indentation. I changed the indentation to only 1 space so it is easier to read. The unnecessary nesting when decompiling try blocks is a known issue of Krakatau, but it can't be fixed without completely rewriting the structuring code, and that is highly unlikely to ever happen.

decompiled.zip

sample2.zip

Janmm14 commented 6 years ago

Thanks for the answer.

@Storyyeller wrote: "FYI, your preprocessing appears to have silently corrupted the jar. For example, the field names and types are incorrect in the preprocessed jar."

All I did was remove the invalid class signature and simplify some instructions; field names and types did not change!? (I just double-checked with the Krakatau disassembler.) In case you wonder why the preprocessed sample doesn't run anymore, it's because the sample has integrity protection.

Edit: wtf lol

Storyyeller commented 6 years ago

Look at the field JVa in class JVc. It should be a static int, but in the preprocessed version, it got changed.

Janmm14 commented 6 years ago

Can't imagine ASM5 having such problems, going to look at my code. Thanks for notifying me.

iocmet commented 2 months ago

I encountered the same problem. I have been waiting for ~1 hour and nothing happens.

Storyyeller commented 2 months ago

The basic problem here is that bytecode may have irregular control flow (i.e. loops with multiple entry points), whereas Java source code cannot express it. Therefore, in order to faithfully replicate it in Java, the decompiler has to duplicate code, which can lead to an exponential increase in code size when such loops are nested. The alternative is to just output pseudocode containing gotos.
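To make "loops with multiple entry points" concrete, here is a minimal sketch that finds them in a control flow graph. It is not Krakatau's structuring code: the graph encoding (a dict from block name to successor list), the function names, and the example graphs are assumptions for illustration. A loop is taken to be a strongly connected component, and an entry point is any block in it that has a predecessor outside it.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each block to a list of successors."""
    index, lowlink, on_stack = {}, {}, set()
    stack, components = [], []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components

def loop_entry_points(graph):
    """For each loop, return the set of blocks reachable from outside it."""
    result = []
    for component in strongly_connected_components(graph):
        has_cycle = len(component) > 1 or any(
            node in graph.get(node, []) for node in component)
        if not has_cycle:
            continue
        entries = {succ
                   for node, succs in graph.items() if node not in component
                   for succ in succs if succ in component}
        result.append(entries)
    return result

# An ordinary while loop: the only way in is through "head".
reducible = {"entry": ["head"], "head": ["body", "exit"],
             "body": ["head"], "exit": []}
# An irregular loop: "entry" can jump straight to either "a" or "b".
irreducible = {"entry": ["a", "b"], "a": ["b"], "b": ["a", "exit"], "exit": []}

print(loop_entry_points(reducible))    # [{'head'}]
print(loop_entry_points(irreducible))  # one loop with entry points 'a' and 'b'
```

A loop with a single entry maps directly onto a Java `while`. For the second graph, one way to get back to single-entry form is to copy one of the two blocks so that each copy has only one way in, which is presumably what the "Duplicating N nodes" lines in the log refer to. Each copy can itself contain further multi-entry loops, which is where the blow-up comes from.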