java-deobfuscator / deobfuscator

The real deal
https://javadeobfuscator.com
Apache License 2.0

Support for caesium #738

Open re-dude69 opened 3 years ago

re-dude69 commented 3 years ago

I recently came across a new obfuscation tool that I had not seen before.

So far I am not aware of any deobfuscators for this one. The project is located at https://github.com/sim0n/caesium.

I have tried to deobfuscate it, but it seems like the caesium classes it produces are not recognized by deobfuscator.

Output excerpt from running detection:

...
[main] ERROR com.javadeobfuscator.deobfuscator.Deobfuscator - Could not parse oshi/util/Util.class/caesium_26.class (is it a class file?)
java.lang.ArrayIndexOutOfBoundsException
...

Would you be interested in adding support for this, if it does not exist already?

Janmm14 commented 3 years ago

Run custom build of deobfuscator with https://github.com/java-deobfuscator/deobfuscator/blob/master/src/main/java/com/javadeobfuscator/deobfuscator/Deobfuscator.java#L89 set to true

If you don't add any transformers, it will output a jar without those fake classes and you can continue your analysis from there.

re-dude69 commented 3 years ago

Thank you Janmm14. That solved the initial problem.

Deobfuscator would not let me run without any transformers, so I did a new detection on the file to detect transformers to use.

user@host:~$ java -jar deobfuscator-paramorphic.jar --config detect.yml 
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Loading classpath
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Loading input
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Detecting known obfuscators
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - 
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - RuleSourceFileAttribute: Some obfuscators don't remove the SourceFile attribute by default. This information can be recovered, and is very useful
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator -    Found possible SourceFile attribute on oshi/driver/linux/Dmidecode: 
l
I
l
I
Il
I
IIIIIIlI
l
Ill
I
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Recommend transformers:
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - (Choose one transformer. If there are multiple, it's recommended to try the transformer listed first)
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator -    com.javadeobfuscator.deobfuscator.transformers.normalizer.SourceFileClassNormalizer

Based on the output I added com.javadeobfuscator.deobfuscator.transformers.normalizer.SourceFileClassNormalizer as the only transformer, and ran into an error using this transformer. The error I got requested me to open a ticket with the information:

user@host:~$ java -jar deobfuscator-paramorphic.jar --config classfixer.yml 
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Loading classpath
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Loading input
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Computing callers
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Transforming
[main] INFO com.javadeobfuscator.deobfuscator.Deobfuscator - Running com.javadeobfuscator.deobfuscator.transformers.normalizer.SourceFileClassNormalizer
[SourceFileClassNormalizer] Recovered 355 source filenames

Deobfuscation failed. Please open a ticket on GitHub and provide the following error:
java.lang.IllegalArgumentException
    at org.objectweb.asm.Type.getTypeInternal(Type.java:443)
    at org.objectweb.asm.Type.getType(Type.java:177)
    at org.objectweb.asm.commons.Remapper.mapDesc(Remapper.java:55)
    at org.objectweb.asm.commons.ClassRemapper.visitAnnotation(ClassRemapper.java:122)
    at org.objectweb.asm.tree.ClassNode.accept(ClassNode.java:412)
    at com.javadeobfuscator.deobfuscator.transformers.normalizer.AbstractNormalizer.lambda$transform$0(AbstractNormalizer.java:47)
    at java.base/java.util.HashMap$Values.forEach(HashMap.java:976)
    at com.javadeobfuscator.deobfuscator.transformers.normalizer.AbstractNormalizer.transform(AbstractNormalizer.java:42)
    at com.javadeobfuscator.deobfuscator.Deobfuscator.runFromConfig(Deobfuscator.java:435)
    at com.javadeobfuscator.deobfuscator.Deobfuscator.start(Deobfuscator.java:392)
    at com.javadeobfuscator.deobfuscator.DeobfuscatorMain.run(DeobfuscatorMain.java:120)
    at com.javadeobfuscator.deobfuscator.DeobfuscatorMain.run(DeobfuscatorMain.java:113)
    at com.javadeobfuscator.deobfuscator.DeobfuscatorMain.main(DeobfuscatorMain.java:50)

user@host:~$

Janmm14 commented 3 years ago

Do not use com.javadeobfuscator.deobfuscator.transformers.normalizer.SourceFileClassNormalizer as a transformer here; in this case it only makes the result worse or undecompilable. If you need a transformer that does no harm, use com.javadeobfuscator.deobfuscator.transformers.general.removers.SyntheticBridgeRemover

However, at first glance caesium seems to use other method signatures etc. for its deobfuscation routines, so you'd have to write your own transformer for it as well.

re-dude69 commented 3 years ago

Thanks. I simply patched out the part of java-deobfuscator that checks that the list of transformers is not null and contains at least one entry, which allowed me to run successfully without any transformers added.

By investigating the code, it primarily seems like I need to add transformers for the following:

I have never written any transformers before and have to understand fully how they work. I will look into this and post updates here if I manage to get something working.

CUSTOMTRANSFORMER.md seems to explain mostly what is required, so I just have to get familiar with how everything is put together.

Janmm14 commented 3 years ago

@brownbananas95 What I usually do is run the peephole transformers first, then look at the code (with Krakatau/pypy in Helios Decompiler) and the bytecode instructions at the same time. First I look for further possible simplifications of bytecode patterns that the existing peephole transformers were unable to handle. Then I improve those transformers or write a new one. Then I check that there are no calls left to side-effect-free methods with constant parameters; if there are, I improve or write a transformer to handle that.

What I want in the end is that the arguments of a call to a deobfuscation method sit directly before the method call, so the call can be removed easily and cleanly. Some transformers in this project do not use this simplification approach and use ASM Analyzers or ArgsAnalyzer instead. As I have never used those personally, I have no tips for them.

You can use the InstructionPattern, but I personally still prefer just using InsnList#iterator() in a while loop.

Then I write code to check whether iterator.next() matches the first bytecode instruction I'm looking for, and then check whether insn.getNext() fits. Or better, I check whether it doesn't fit and then call continue;. Usually, the first time I encounter some encryption instruction pattern in a class, I first have to execute the class's initialization method to set up the decryption routines. Then I execute the individual decrypt method calls and replace the instructions in the methods where I encountered them (using the iterator's set() and remove() methods) with a constant load of the result. Optionally you can attempt to remove the decryption methods and fields afterwards.
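The scan-and-replace loop described above can be sketched without any libraries. This is only a toy model: `ToyInsn` stands in for ASM's instruction nodes, `decrypt` is a made-up routine (a plain string reversal), and a `java.util.ListIterator` plays the role of `InsnList#iterator()`. None of these names come from the deobfuscator project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

/** Toy instruction: a hypothetical stand-in for an ASM instruction node. */
final class ToyInsn {
    final String opcode;   // e.g. "LDC" or "INVOKESTATIC"
    final String operand;  // constant value or method name, may be null

    ToyInsn(String opcode, String operand) {
        this.opcode = opcode;
        this.operand = operand;
    }

    @Override
    public String toString() {
        return operand == null ? opcode : opcode + " " + operand;
    }
}

final class IteratorScanSketch {
    /** Hypothetical decryption routine; here it only reverses the string. */
    static String decrypt(String encrypted) {
        return new StringBuilder(encrypted).reverse().toString();
    }

    /** Replaces each "LDC x, INVOKESTATIC decrypt" pair with "LDC decrypt(x)". */
    static int inlineDecryptCalls(List<ToyInsn> insns) {
        int replaced = 0;
        ListIterator<ToyInsn> it = insns.listIterator();
        while (it.hasNext()) {
            ToyInsn first = it.next();
            if (!first.opcode.equals("LDC") || !it.hasNext()) {
                continue; // first instruction does not fit
            }
            ToyInsn second = insns.get(it.nextIndex()); // peek, like insn.getNext()
            if (!second.opcode.equals("INVOKESTATIC") || !"decrypt".equals(second.operand)) {
                continue; // following instruction does not fit
            }
            it.remove(); // drop the LDC holding the encrypted constant
            it.next();   // step onto the call instruction
            it.set(new ToyInsn("LDC", decrypt(first.operand))); // constant load of the result
            replaced++;
        }
        return replaced;
    }

    public static void main(String[] args) {
        List<ToyInsn> insns = new ArrayList<>(Arrays.asList(
                new ToyInsn("LDC", "olleh"),
                new ToyInsn("INVOKESTATIC", "decrypt"),
                new ToyInsn("ARETURN", null)));
        System.out.println(inlineDecryptCalls(insns) + " replaced: " + insns);
    }
}
```

With ASM the same shape should carry over, with the peek done through `getNext()` on the node and the decryption result coming from actually executing the obfuscator's own method rather than a local helper.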

Renaming of method names obviously has to come last, the deobfuscator already has normalizers for that.

I consider cleaning up try-catch blocks, like the removal of decryption routines, to be polishing. Usually I don't need it, as Krakatau can still show Java code with complex try-catch patterns; the code might just look more complicated than it actually is.

re-dude69 commented 3 years ago

Thank you for the detailed explanation on how to do this. I spent some hours looking at this over the weekend, but it seems like I underestimated the amount of hours required to get it done.

I will continue looking at it when I have free time.

Perhaps this information could get added to CUSTOMTRANSFORMER.md so it can be helpful for others too?

Janmm14 commented 3 years ago

Do you have a sample jar which you can share (see #653)?

re-dude69 commented 3 years ago

Sure. I have uploaded the sample here. It comes from an open repository and is not copyrighted, and not shaded with copyrighted materials.

Dependencies are however copyright, so I have not uploaded those. I guess #653 considers dependencies a resource, but if not I can provide those too if that helps.

Janmm14 commented 3 years ago

@brownbananas95

I've redone some basic stuff from my private fork here, but with slightly cleaner code: https://github.com/Janmm14/deobfuscator/tree/caesium This might be another sample of how you could code some transformers.

A basic tip for handling Java bytecode is to always have https://en.wikipedia.org/wiki/Java_bytecode_instruction_listings open somewhere and to get really familiar with how a stack works.

That is, however, just cleaning up the bytecode a little. I don't want to recode everything and I don't want to publish my whole work, for three reasons: I don't want to give obfuscator authors hints on how to harden their obfuscation, I don't want to reveal how I distinguish this-is-obfuscation-code from this-is-real-program-logic, and the code I use is ugly, often based on just one sample, and might work wrongly or not at all on other obfuscated samples.

I've used that code with the transformers

Then I used my private fork to deobfuscate the rest.

I've edited stuff in there for this project so that I don't need those non-free libs. I just added the rt, jce and jna jars as libs. When you get an error that there's no provider for some method, you'll have to add the missing method mapping to the jvmmethodprovider until it works.

I don't know if it is required for this obfuscation, but I've implemented invokedynamic in the methodexecutor. If you get an ExecutionException with an UnsupportedOperationException as its cause, you'll have to implement invokedynamic in there or try the JavaVM execution instead.

My private fork had these two transformers from some time ago as part of its PeepholeTransformer, which I ran first: https://gist.github.com/Janmm14/fef55378560e0e447f3ee883c5ab5907 (slightly modified). These simplified another couple of things.

Then I ran transformers for "...".length(), Integer.reverse(int) and Long.reverse(long) (they're part of my peephole). Then I replaced ACONST_NULL followed by IFNULL with a single GOTO. DeadCodeRemover found another bunch of removable instructions, and then there was round two of all the peephole transformers.
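To illustrate the three peephole rewrites named here, the following is a small library-free sketch. The `Op` class and the method-descriptor strings are illustrative stand-ins for ASM nodes, not code from the project, and the sketch assumes that no jump target points between the two instructions of a matched pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy instruction (hypothetical): an opcode plus an optional constant, method or label. */
final class Op {
    final String opcode;
    final Object arg;

    Op(String opcode, Object arg) {
        this.opcode = opcode;
        this.arg = arg;
    }

    @Override
    public String toString() {
        return arg == null ? opcode : opcode + " " + arg;
    }
}

final class PeepholeSketch {
    /** One peephole pass over the list; returns a new, simplified list. */
    static List<Op> simplify(List<Op> in) {
        List<Op> out = new ArrayList<>();
        for (int i = 0; i < in.size(); i++) {
            Op a = in.get(i);
            Op b = i + 1 < in.size() ? in.get(i + 1) : null;
            if (b != null && a.opcode.equals("LDC") && a.arg instanceof String
                    && b.opcode.equals("INVOKEVIRTUAL")
                    && "java/lang/String.length()I".equals(b.arg)) {
                // "...".length() on a constant string folds to an int constant
                out.add(new Op("LDC", ((String) a.arg).length()));
                i++;
            } else if (b != null && a.opcode.equals("LDC") && a.arg instanceof Integer
                    && b.opcode.equals("INVOKESTATIC")
                    && "java/lang/Integer.reverse(I)I".equals(b.arg)) {
                // Integer.reverse on a constant int folds to an int constant
                out.add(new Op("LDC", Integer.reverse((Integer) a.arg)));
                i++;
            } else if (b != null && a.opcode.equals("ACONST_NULL") && b.opcode.equals("IFNULL")) {
                // null is always null, so the conditional jump is always taken
                out.add(new Op("GOTO", b.arg));
                i++;
            } else {
                out.add(a);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Op> code = Arrays.asList(
                new Op("LDC", "abcd"),
                new Op("INVOKEVIRTUAL", "java/lang/String.length()I"),
                new Op("ACONST_NULL", null),
                new Op("IFNULL", "L1"));
        System.out.println(simplify(code));
    }
}
```

A single pass can expose new matches (for example, a folded length feeding another constant call), which is one reason a second round of the same transformers can pay off.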

Now the bytecode should be reasonably ordered, and the arguments to the deobfuscation calls should sit right in front of the call instruction. We just have to search for some simple patterns, trim the clinit for simulated execution and replace what we want with our result.

For that, start with a basic DelegatingProvider and initialize it as usual (JVMMethodProvider, JVMComparisonProvider, MappedMethodProvider, MappedFieldProvider).

Create a Set to remember initialized classes. Then loop through all classes.

Search for the static initializer method (clinit) and create a new, identical MethodNode. Create a new InsnList for it and copy instructions from the original clinit until you encounter invokedynamic or new; at that point add a RETURN to the InsnList instead and break;. Then execute your own clinit via the methodexecutor. Now the decryption fields etc. are initialized.
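The trimming step can be sketched in isolation as follows. Instructions are modelled as plain strings whose first token is the opcode, which is an assumption made for illustration only; in the real transformer you would copy ASM nodes into a fresh InsnList. The `shouldInitialize` helper is a hypothetical name for the "remember initialized classes" set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ClinitTrimSketch {
    /**
     * Copies instructions up to (not including) the first INVOKEDYNAMIC or NEW,
     * then terminates the copy with RETURN so it can be executed safely.
     */
    static List<String> trim(List<String> clinit) {
        List<String> trimmed = new ArrayList<>();
        for (String insn : clinit) {
            String opcode = insn.split(" ")[0];
            if (opcode.equals("INVOKEDYNAMIC") || opcode.equals("NEW") || opcode.equals("RETURN")) {
                break; // stop before real program logic (or at the original end)
            }
            trimmed.add(insn);
        }
        trimmed.add("RETURN");
        return trimmed;
    }

    /** Returns true only the first time a class name is seen. */
    static boolean shouldInitialize(Set<String> initialized, String className) {
        return initialized.add(className);
    }

    public static void main(String[] args) {
        List<String> clinit = Arrays.asList(
                "ICONST_5", "NEWARRAY int", "PUTSTATIC a",
                "NEW java/lang/Object", "DUP", "RETURN");
        Set<String> initialized = new HashSet<>();
        if (shouldInitialize(initialized, "oshi/util/Util")) {
            System.out.println(trim(clinit));
        }
    }
}
```

Comparing the whole opcode token rather than a prefix matters here, since NEWARRAY and ANEWARRAY are part of the array setup that should be kept.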

Now we can actually start deobfuscating. First look for invokedynamic calls. If we see one, we make sure that the bsm owner is the same class. If it is not a fitting invokedynamic instruction, just continue;

Then execute the bsm (bootstrap method) of the invokedynamic instruction. In the methodexecutor the first argument can be null, the second is insn.name, and the third can be null as well. The resulting JavaMethodHandle then just needs to be converted into the correct MethodInsnNode or FieldInsnNode, and you can call iterator.set.

Then you can look for the pattern GETSTATIC String[] - GETSTATIC int[] - constant integer load - IALOAD - AALOAD, use the provider you created earlier to grab the values behind the GETSTATIC instructions, and replace those 5 instructions with one LDC of the final string.
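As a rough sketch of that five-instruction replacement: the `Ins` class and the `fields` map below are toy stand-ins. The map plays the role of the field provider after the trimmed clinit has been executed; neither name is from the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy instruction (hypothetical): an opcode plus an optional field name or constant. */
final class Ins {
    final String opcode;
    final Object arg;

    Ins(String opcode, Object arg) {
        this.opcode = opcode;
        this.arg = arg;
    }

    @Override
    public String toString() {
        return arg == null ? opcode : opcode + " " + arg;
    }
}

final class StringTableSketch {
    /** Replaces GETSTATIC String[], GETSTATIC int[], const, IALOAD, AALOAD with one LDC. */
    static int inlineStrings(List<Ins> insns, Map<String, Object> fields) {
        int replaced = 0;
        for (int i = 0; i + 4 < insns.size(); i++) {
            Ins a = insns.get(i), b = insns.get(i + 1), c = insns.get(i + 2);
            Ins d = insns.get(i + 3), e = insns.get(i + 4);
            if (!a.opcode.equals("GETSTATIC") || !(fields.get(a.arg) instanceof String[])) continue;
            if (!b.opcode.equals("GETSTATIC") || !(fields.get(b.arg) instanceof int[])) continue;
            if (!c.opcode.equals("LDC") || !(c.arg instanceof Integer)) continue;
            if (!d.opcode.equals("IALOAD") || !e.opcode.equals("AALOAD")) continue;

            String[] strings = (String[]) fields.get(a.arg);
            int[] indices = (int[]) fields.get(b.arg);
            // Same evaluation order as the bytecode: IALOAD first, then AALOAD.
            String value = strings[indices[(Integer) c.arg]];

            insns.subList(i, i + 5).clear();
            insns.add(i, new Ins("LDC", value));
            replaced++;
        }
        return replaced;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("strs", new String[] {"a", "hello"});
        fields.put("idx", new int[] {1, 0});
        List<Ins> insns = new ArrayList<>(Arrays.asList(
                new Ins("GETSTATIC", "strs"), new Ins("GETSTATIC", "idx"),
                new Ins("LDC", 0), new Ins("IALOAD", null), new Ins("AALOAD", null),
                new Ins("ARETURN", null)));
        System.out.println(inlineStrings(insns, fields) + " replaced: " + insns);
    }
}
```

The integer variant works the same way with a three-instruction window and no String[] lookup.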

Do something similar for integers (there the pattern is just GETSTATIC int[] - constant integer load - IALOAD).

If you were successful, you should get 3434 decrypted strings, 8620 decrypted numbers and 43978 replaced invokedynamic instructions.

The next step is to replace static method calls with the signature (IJ)Ljava/lang/String; that are followed by an invokevirtual to String.length() with the computed value. Simply invoke the invokestatic with the methodexecutor, get the length of the resulting string and replace the two instructions with a constant int load instruction. After this you want to run the peephole thing and uselessoperationtransformer again.

Furthermore, all the ifs are method calls. But if you got this far, you should be experienced enough to detect those short methods and learn how to inline and/or simplify them.

Also, at some point during your deobfuscation you can switch from Krakatau to CFR, as CFR output is a little less verbose than Krakatau's and CFR can also display lambdas and method references correctly.

This is, in short, how I deobfuscated that sample. I could reuse a bunch of code because, as I noticed while looking through the code, I already had another caesium sample.

I hope this helps you see how to tackle such tasks. I don't know how experienced you already are with Java programming and with bytecode. If this is your first contact with bytecode, do not expect this rough guideline to get you to a finished result within just a couple of hours.

Also, I noticed that people really like to copy from the superblaubeere obfuscator.

re-dude69 commented 3 years ago

Wow. This is way more help than I expected to get. I really appreciate the time you have taken to investigate this.

Currently I do not understand half of what is being explained - which makes a perfect opportunity for me to learn this, along with a guide explaining how to do this. That is great :smile:

Hopefully after successfully doing this I will be able to take on new obfuscators I come across in a similar way.

I will start working on it this week and possibly post some updates here on the status, and if I find something that can be helpful to others looking to do the same :+1:

Thanks!