Open pinwhell opened 1 year ago
Have you seen the "Instruction Pattern Search" feature? From the Code Browser, Search -> For Instruction Patterns.
Yes, I am aware of it, but I only see a way to search for such well-masked patterns; I didn't see a way to automatically create one, perfectly masked. Is there already a feature for creating a perfectly masked pattern? Am I missing something?
You might take a look at YaraGhidraGUIScript.java, but I don't think that's what you're asking for.
If I understand correctly, you would like to have better control over the masking, to be able to do things like accept a set of registers or a range of addresses for a given operand, instead of either masking out an operand completely or fixing all of the bits. It's not a bad idea, and it might fit with some of our planned work, so I'll put the "Future" tag on this ticket.
More precisely, the concept I was describing in my original request involves an automated way of generating patterns, allowing for different levels of control:
Fine-grained Control: This would involve masking or wild-carding at the bit level. For instance, a pattern like "AA B? CC ?D" or "AA 00 | 10101010 10101010" could be created. This method provides more precision but can be complex.
Coarse-grained Control: This would entail masking or wild-carding at the byte level. For example, a pattern like "AA ?? CC ??" could be generated. This approach is simpler but provides less granularity.
Both of these wild-carding methods serve a purpose. Method 1 is highly precise and allows for intricate pattern specifications, while Method 2 is more straightforward but offers less detailed control.
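To make the two levels concrete, here is a minimal Python sketch of how such patterns could be matched. It is not Ghidra code: `parse_pattern` and `find_all` are hypothetical helper names, the sample buffer is made up, and only the nibble notation ("B?", "??") is handled, not the "value | bitmask" form.

```python
def parse_pattern(text):
    """Turn a pattern such as 'AA B? CC ?D' into (value, mask) byte pairs."""
    pairs = []
    for token in text.split():
        if len(token) != 2:
            raise ValueError(f"expected two hex digits or '?', got {token!r}")
        value = mask = 0
        for ch in token:
            value <<= 4
            mask <<= 4
            if ch != "?":
                value |= int(ch, 16)  # fixed nibble: remember its value
                mask |= 0xF           # and mark its four bits as significant
        pairs.append((value, mask))
    return pairs


def find_all(data, pairs):
    """Return every offset in data where the masked pattern matches."""
    hits = []
    for off in range(len(data) - len(pairs) + 1):
        if all((data[off + i] & m) == v for i, (v, m) in enumerate(pairs)):
            hits.append(off)
    return hits


# Made-up sample buffer, for illustration only.
data = bytes.fromhex("00 AA B7 CC 3D 11 AA 17 CC 2D")
fine = parse_pattern("AA B? CC ?D")    # Method 1: nibble-level wildcards
coarse = parse_pattern("AA ?? CC ??")  # Method 2: byte-level wildcards
print(find_all(data, fine))
print(find_all(data, coarse))
```

On this buffer the fine pattern matches only at offset 1, while the coarse pattern also matches at offset 6, which shows the precision trade-off between the two methods.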
To illustrate the kind of feature I'm proposing, I've prepared an example and a proof of concept from a tool I've developed:
Consider the following byte array:
9C 00 9F E5 00 10 A0 E3 00 00 9F E7 04 10 8D E5 00 00 90 E5 74 10 90 E5 00 00 51 E3
As you can see, treating this raw array as a pattern isn't robust, because offsets can change due to relocations, image updates, or other factors. The tool automatically recognizes each instruction and applies wildcards to all the applicable instruction bytes:
0x118e368: ldr r0, [pc, #0x9c] {1,1,0,0}
0x118e36c: mov r1, #0 {1,1,1,0}
0x118e370: ldr r0, [pc, r0]
0x118e374: str r1, [sp, #4] {1,1,0,0}
0x118e378: ldr r0, [r0]
0x118e37c: ldr r1, [r0, #0x74] {1,1,0,0}
0x118e380: cmp r1, #0 {1,1,0,0}
? ? 9F E5 ? ? ? E3 00 00 9F E7 ? ? 8D E5 00 00 90 E5 ? ? 90 E5 ? ? 51 E3
resulting in a relatively more robust pattern....
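The masking step itself can be sketched in a few lines of Python. This is only an illustration of the listing above, not my tool's actual code: the masks are hard-coded instead of coming from a disassembler, `apply_masks` is a hypothetical name, a 1 in a mask is assumed to mean "wildcard this byte", and instructions shown without a mask are assumed to be kept as-is.

```python
# Raw bytes from the example above, four bytes per ARM instruction.
raw = bytes.fromhex(
    "9C009FE5 0010A0E3 00009FE7 04108DE5 000090E5 741090E5 000051E3"
)

# One mask per instruction; 1 = wildcard the byte, 0 = keep it.
masks = [
    (1, 1, 0, 0),  # ldr r0, [pc, #0x9c]
    (1, 1, 1, 0),  # mov r1, #0
    (0, 0, 0, 0),  # ldr r0, [pc, r0]    (no mask in the listing: assumed fixed)
    (1, 1, 0, 0),  # str r1, [sp, #4]
    (0, 0, 0, 0),  # ldr r0, [r0]        (no mask in the listing: assumed fixed)
    (1, 1, 0, 0),  # ldr r1, [r0, #0x74]
    (1, 1, 0, 0),  # cmp r1, #0
]


def apply_masks(raw, masks, width=4):
    """Replace every masked byte with '?' and return the pattern string."""
    if len(raw) != width * len(masks):
        raise ValueError("need exactly one mask per instruction")
    tokens = []
    for i, mask in enumerate(masks):
        insn = raw[i * width:(i + 1) * width]
        for byte, wild in zip(insn, mask):
            tokens.append("?" if wild else f"{byte:02X}")
    return " ".join(tokens)


pattern = apply_masks(raw, masks)
print(pattern)
```

With these masks the output is the same pattern as shown above.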
But the main thing is that this pattern was generated automatically from that set of rules. This is the kind of feature I am proposing to include in Ghidra, maybe done at a more core level with micro-instructions and then translated to real instruction sets like x86, ARM ...
A lot of time would be saved when making code signature patterns that work!
It would be very cool to have a feature that automatically makes such patterns, with the ability to wildcard the opcode bytes related to IMM/MEMDISP operands, or even whole instructions, making the pattern very robust and effective.
I have written my own tool using Capstone, but it would be even more amazing to have something like this within Ghidra that works for each architecture.
I see this feature as fairly complex; I see a simple way of doing it and a complex way of doing it.
Let's consider this instruction: 00 00 90 E5 => ldr r0, [r0, #0x0]
Now let's raise the memory displacement to a very high value:
FF 0F 90 E5 => ldr r0, [r0, #0xFFF]
By simple, I mean not computing a precise bitmask wildcard, but simply wildcarding the bytes that change. We can clearly see that bytes 0 and 1 changed, so in the simple mode the feature would be expected to output
? ? 90 E5
for this instruction. As you can see, this simple way lacks microscopic precision, because:
On the other hand, there could be surgically precise wildcarding at the bit level, which gives more control over what we want to wildcard, for example just memory displacements, immediates, or even registers.
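Both ways can be sketched in Python by comparing the two encodings of the ldr instruction above. This is a rough illustration only: `byte_level`, `bit_level` and `show_nibbles` are hypothetical names, and diffing two samples only reveals the bits that happen to differ between them. A real implementation would take the operand bit fields from the processor's instruction description instead.

```python
def byte_level(a, b):
    """Simple way: wildcard every byte that differs between the encodings."""
    return " ".join("?" if x != y else f"{x:02X}" for x, y in zip(a, b))


def bit_level(a, b):
    """Precise way: (value, mask) pairs; a mask bit is 1 where both agree."""
    pairs = []
    for x, y in zip(a, b):
        mask = ~(x ^ y) & 0xFF  # XOR marks differing bits; invert to keep the rest
        pairs.append((x & mask, mask))
    return pairs


def show_nibbles(pairs):
    """Render (value, mask) pairs; a nibble that is not fully fixed becomes '?'."""
    out = []
    for value, mask in pairs:
        hi = f"{value >> 4:X}" if mask & 0xF0 == 0xF0 else "?"
        lo = f"{value & 0xF:X}" if mask & 0x0F == 0x0F else "?"
        out.append(hi + lo)
    return " ".join(out)


a = bytes.fromhex("000090E5")  # ldr r0, [r0, #0x0]
b = bytes.fromhex("FF0F90E5")  # ldr r0, [r0, #0xFFF]
print(byte_level(a, b))               # ? ? 90 E5
print(show_nibbles(bit_level(a, b)))  # ?? 0? 90 E5
```

The bit-level result keeps the high nibble of the second byte fixed. In this ARM encoding that nibble holds the destination register, not the displacement, so the byte-level pattern would also match loads into other registers while the bit-level one would not.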
Thanks guys!