Closed by ceres-c 3 years ago
Hi @ceres-c, I'm not too familiar with de4js, I stay mostly within the JS ecosystem. Can you provide the settings or the deobfuscated output for each in a gist?
The online deobfuscator doesn't deobfuscate the first JS resource either, on the default settings anyway.
I'm very sorry, but apparently, when I first posted this issue in the other repo, it was one of those "It's 3 AM and I've been awake way too long" moments. Tired me probably mixed up the JS files and thought one of them was correctly deobfuscated while, in fact, it was not; I probably just loaded the wrong one in de4js. So, I withdraw my statement: the scripts are not deobfuscated. Also, in the second script, the first function is only minified, not obfuscated, so... Yeah, I pretty much got everything wrong.
How would you go about deobfuscating something like this? I'm not looking for step-by-step instructions, of course, just some pointers. I have seen some examples of control flow reconstruction accomplished through symbolic execution ( https://blog.quarkslab.com/deobfuscation-recovering-an-ollvm-protected-program.html ), but I don't know whether similar techniques could be applied to JS as well.
Updating this after quite some time (had to deal with uni exams). I ended up lifting the AST to a CFG, then basically had to implement a decompiler (based on angr, with some substantial changes) in order to reconstruct branching and looping blocks, then output another AST and, finally, code. Now I have readable and working code; it's no longer a huge state machine with goto-like constructs. I'll publish everything in a couple of months, when I'm done writing my thesis.
Now I'm facing another deobfuscation task: simplifying string declarations and, possibly, variable assignments. I have constructs such as
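For readers unfamiliar with the pattern, here is a small hand-written illustration of what "a huge state machine with goto-like constructs" can look like, next to the structured code a decompiler would aim to recover. It is not taken from the AliExpress scripts or from the tool described above; the function names `sumFlattened` and `sumStructured` and the state numbering are made up for the example.

```javascript
// Flattened form: a dispatcher loop plus a state variable hides the real
// control flow. Each case behaves like a basic block ending in a "goto".
function sumFlattened(n) {
  let state = 0;
  let i;
  let total;
  while (true) {
    switch (state) {
      case 0: // entry block: initialise, then jump to the loop condition
        i = 0;
        total = 0;
        state = 2;
        break;
      case 2: // loop condition: pick the body or the exit block
        state = i < n ? 1 : 3;
        break;
      case 1: // loop body, then jump back to the condition
        total += i;
        i++;
        state = 2;
        break;
      case 3: // exit block
        return total;
    }
  }
}

// Structured equivalent: the same blocks with the loop made explicit.
function sumStructured(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i;
  }
  return total;
}
```

Treating each `case` as a CFG node and each assignment to `state` as an edge gives the graph from which the `for` loop can be rebuilt.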
var o = "o";
o += "na";
o += "udiopr";
o += "oc";
o += "e";
o += "s";
o += "s";
So far I have written a simple reducer (based on LazyCloneReducer) that analyzes the file line by line and detects reads, writes and assignments. It keeps track of variables (literals only) in a map structure, updating them line by line until they're actually read with their final value.
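The idea of tracking literals in a map can be sketched without any AST library. The following is only an illustration under strong assumptions, not the reducer described above: it works on plain source lines with regular expressions, handles only double-quoted literals without escapes, and the helper name `foldStringAppends` is hypothetical. Any line it does not recognise is treated as a possible read, so all tracked values are written out before it.

```javascript
// Hypothetical helper: folds `var x = "a"; x += "b";` chains in a list of
// source lines. Assumes one statement per line and no escapes in literals.
function foldStringAppends(lines) {
  const known = new Map(); // variable name -> literal value built so far
  const out = [];
  const decl = /^\s*var\s+([A-Za-z_$][\w$]*)\s*=\s*"([^"\\]*)";\s*$/;
  const append = /^\s*([A-Za-z_$][\w$]*)\s*\+=\s*"([^"\\]*)";\s*$/;

  // Emit one declaration per tracked variable, then forget them all.
  const flush = () => {
    for (const [name, value] of known) {
      out.push(`var ${name} = ${JSON.stringify(value)};`);
    }
    known.clear();
  };

  for (const line of lines) {
    let m;
    if ((m = decl.exec(line))) {
      known.set(m[1], m[2]); // start tracking (or overwrite) the variable
    } else if ((m = append.exec(line)) && known.has(m[1])) {
      known.set(m[1], known.get(m[1]) + m[2]); // extend the tracked literal
    } else {
      flush(); // unknown statement: it may read any tracked variable
      out.push(line);
    }
  }
  flush();
  return out;
}
```

Applied to the snippet above followed by a hypothetical `console.log(o);` line, this yields `var o = "onaudioprocess";` and then the `console.log(o);` line unchanged.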
The problem is that this approach isn't really sound. I have issues with ComputedMemberAssignmentTarget (Ei[67] = r) and StaticMemberAssignmentTarget (Qe.type = xo), since I don't have a real heap storing data and have no knowledge of the data types during analysis. I'm not sure I want to get into symbolic execution, to be honest.
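A tiny illustration of why member assignments are hard to handle with a name-to-literal map: two names can refer to the same object, so a write through one is visible through the other. The variable names below echo the `Qe.type = xo` example but the snippet itself is made up, and `aliasingDemo` is a hypothetical name.

```javascript
// A map keyed by variable name cannot see that `alias` and `Qe` share
// one object on the heap.
function aliasingDemo() {
  const Qe = { type: "before" };
  const alias = Qe;     // no copy is made: both names point to the same object
  const xo = "after";
  alias.type = xo;      // a write through the alias...
  return Qe.type;       // ...changes what is read through `Qe`
}
```

A tracker that recorded `Qe.type` as `"before"` would now be wrong, which is why a conservative analysis has to either model the heap or forget everything it knows about object properties after such a write.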
Do you have any suggestions, @jsoverson? Maybe I'm missing some obvious, industry-standard solution to this kind of issue.
I've finally published the code, in case anyone is interested: https://github.com/ceres-c/bulldozer
Hello, I'm reverse engineering some scripts from AliExpress. They use an obfuscator unknown to me, and I've been redirected here from https://github.com/lelinhtinh/de4js/. The obfuscator I'm facing is partially supported by de4js: some JS files are successfully deobfuscated, some only partially, and others not at all. I'm very sorry to open such an uninformative issue plainly asking for help, but I'm not really into JS and obfuscation, so all of this is new to me. Would you mind having a look?
Javascript which can be fully deobfuscated
Javascript which can be partially deobfuscated (only the first function)
Javascript which can't be deobfuscated
From an external point of view, these files look pretty much the same regarding the techniques applied (string chunking and rotation, control flow flattening via giant switch cases...), but the deobfuscator does not like some of them. Sorry to bother you.