Closed werame closed 4 months ago
The verify error is usually caused moreso by the exporting process being unable to compute the common types of items, and thus falling back onto java/lang/Object
as a default. Variable debug information doesn't actually play a part in that step, its purely visual and for convenience.
As for the renaming, yeah, the 2X assembler has some problems with variable scoping so the current solution is to make each variable unique. I've not seen it re-scramble names like this though in the samples I use for testing. But generally speaking the 2X assembler does have some problems which are seemingly mostly addressed in the 3X WIP branch. 3X still makes each variable unique but does so in a more reliable and reproducible fashion.
This code I'm dealing with alas uses invokedynamic
so (apparently) I have to deal with stack maps.
You always have to deal with stack frames, they've been required since Java 7.
ASM, the bytecode library Recaf uses, automatically generates them for modified classes on export. Like I mentioned above, Recaf needs to be able to compute the common types of items (List/Set --> Collection) and it it doesn't it will try to use Object
.
This is typically avoided by the usage of JPhantom which Recaf runs in the background to fill in missing type information. It sometimes fails on larger inputs though and can struggle with certain forms of obfuscation.
To give a more detailed explanation of what's happening with Type 'java/lang/Object' (current frame, stack[3]) is not assignable to 'byK'
and where the stack computation breaks down I'd have to have a sample, which I understand you cannot provide. My obfuscated samples have not made this problem occur in my usage.
It was alas one large class (>100Kb bytecode) that was causing problems for Recaf/ASM. I got it instrumented correctly with Krakatau. I had to do some manual maxlolcals and maxstack adjustments though, but I'm not sure if that's the problem with ASM. The trouble with it doing full rewrites is that's almost impossible to debug bugs in ASM itself.
Alas this big file is proprietary... and not malware. But it had malware hooks in it. The more sophisticated malware itself that was in (much) smaller classes. Recaf/ASM had no actual problem instrumenting those. I could share these... but I'm not seeing the point as Recalf didn't have issue with them.
Closing as 4.X is the only branch we're working on going forward. The semantics of the assembler are a bit different now so this is not really relevant in the new version.
I consider this almost a bug from a code comprehensibility prospective, but since the (re)assembled class seems correct, I file this as an enhancement request.
Basically, if the
var
-named variables aren't sequential in the original disassembly after some modifications, e.g. there's avar18
that appears after avar6
, the Recaf re-assembles these by renumbering them, e.g. it turnedvar18
intovar7
and similarly for quite a number of others. This makes it hard to follow if the changes are correct because it introduced numerous gratuitous name changes.I'm pretty sure that if someone submits a PR for one of your programs, you wouldn't accept it if half the variables were gratuitously renamed... Same thing here with the assembler renaming stuff.
Also, it widened the declared types of those variables it moved around to Ljava/lang/Object, which is generally ok, but I think was incorrect in one case where it upgraded a byte[] to Object. Jogging my memory on how Java stores the primitive arrays (by ref), that's still correct, but fairly counterintuitive that the Recaf assembler did that.
And the java.lang.Objs are alas not harmless. I'm getting something like
which I'm pretty sure is caused by that liberty Recaf/ASM takes. I guess it's back to using krakatau, despite it being not really maintained anymore.