pfalcon / ScratchABit

Easily retargetable and hackable interactive disassembler with IDAPython-compatible plugin API
GNU General Public License v3.0

Approach to deal with multiple entry point functions #17

Open pfalcon opened 7 years ago

pfalcon commented 7 years ago

This essentially means that different functions share some basic blocks, which gives a good hint on how to deal with that: there should be multiple functions, but with overlapping ranges. SAB already exports functions based on ranges, so that part is handled. So far, I just manually patch ranges in the database, but I need to think about how to do that automatically. (It would probably be enough to just union all ranges for all functions; ScratchABlock, the expected consumer of this stuff, will clean up unneeded basic blocks from each function.)
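As a rough illustration of the "union all ranges" idea, here is a minimal sketch. It assumes each function is described by a list of inclusive `(start, end)` address ranges; the names `merge_ranges`, `overlaps` and `union_shared_ranges`, the dict layout, and the sample addresses are all hypothetical and not ScratchABit's actual database format or API.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent inclusive (start, end) ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlaps(ranges_a, ranges_b):
    """True if any range of one function intersects any range of the other."""
    return any(s1 <= e2 and s2 <= e1
               for s1, e1 in ranges_a
               for s2, e2 in ranges_b)


def union_shared_ranges(funcs):
    """Give every function in a group of code-sharing functions the union
    of the whole group's ranges.

    funcs: dict mapping a function name to its list of (start, end) ranges
    (a hypothetical layout, chosen only for this sketch).
    """
    names = list(funcs)
    parent = {n: n for n in names}  # tiny union-find over function names

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    # Link every pair of functions whose ranges intersect.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if overlaps(funcs[a], funcs[b]):
                parent[find(a)] = find(b)

    # Collect all ranges per group, then hand the merged union to each member.
    combined = {}
    for n in names:
        combined.setdefault(find(n), []).extend(funcs[n])
    return {n: merge_ranges(combined[find(n)]) for n in names}


# Hypothetical example: two functions sharing code, one unrelated function.
example = {
    "f_add": [(0x100, 0x13F)],
    "f_sub": [(0x120, 0x15F)],
    "other": [(0x200, 0x21F)],
}
result = union_shared_ranges(example)
```

In this sketch both `f_add` and `f_sub` end up with the single range `(0x100, 0x15F)`, while `other` is left alone; pruning the basic blocks a given entry point cannot reach is left to the consumer, as suggested above.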

thesourcerer8 commented 7 years ago

What I have seen in practice are 2 sharing modes. Most of the time, functions share only the last few statements, sometimes only the return() statement. The other case I have seen is memset+bzero, where bzero is just a few additional statements at the beginning of the memset function (or the bzero function offers a memset entry point in the middle, depending on the viewpoint). I don't remember having seen code sharing in the middle of functions. But I guess that the malware market has more advanced code sharing concepts.
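To make the two modes concrete, here is a small sketch that tells them apart. It assumes each function is given as an entry address plus the set of start addresses of its basic blocks; `classify_sharing`, the returned labels, and the sample addresses are hypothetical. Without a control-flow graph it cannot distinguish a shared tail from sharing in the middle, so those two are reported together.

```python
def classify_sharing(entry_a, blocks_a, entry_b, blocks_b):
    """Roughly classify how two functions share basic blocks.

    entry_*:  address of the function's entry block
    blocks_*: iterable of basic-block start addresses of the function
    """
    blocks_a, blocks_b = set(blocks_a), set(blocks_b)
    if not (blocks_a & blocks_b):
        return "none"
    # One function's entry lies inside the other: the memset+bzero case,
    # i.e. an extra entry point in the middle of a larger function.
    if entry_b in blocks_a or entry_a in blocks_b:
        return "extra entry point"
    # Separate entries, common blocks further down (typically the epilogue).
    return "shared tail or body"


# Hypothetical addresses: bzero falls through into memset.
bzero_like = (0x100, {0x100, 0x108, 0x110})
memset_like = (0x108, {0x108, 0x110})

# Hypothetical addresses: two functions jumping to a common return block.
f = (0x200, {0x200, 0x300})
g = (0x240, {0x240, 0x300})

mode_entry = classify_sharing(*bzero_like, *memset_like)
mode_tail = classify_sharing(*f, *g)
```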

pfalcon commented 7 years ago

> But I guess that the malware market has more advanced code sharing concepts.

So, ScratchABit is explicitly not targeted at the security/malware research areas, which is actually a problem, because 90% (I could easily say 99%) of reverse engineering appears to be related to those areas, and not targeting them means a strong popularity hit. But not targeting them is also the reason ScratchABit exists at all: existing tools involve a steep learning curve, whereas SAB is intended to work well in the following mode: "spend 10 minutes a day, and in a year you'll have made huge progress with reversing a particular subject which interests you".

> What I have seen in practice are 2 sharing modes

Well, this ticket comes from practical experience with reversing SAB's default use case, ESP8266 firmware, with the project hosted here: https://github.com/pfalcon/xtensa-subjects/tree/master/2.0.0-p20160809

libgcc, specifically its floating-point support functions, poses a problem: they are originally coded in assembly, and one module may host 2 related functions which share a good deal of code (well, more specifically, one function tail-calls into the other, and vice versa). Here's a quick google-up of the original source: https://searchcode.com/codesearch/view/32527096/

So, this ticket is about how ScratchABit should deal with such functions to produce a suitable listing for analysis in ScratchABlock (as that's the high-level RE pipeline).