nccgroup / Cartographer

Code Coverage Exploration Plugin for Ghidra
Apache License 2.0

Use SimpleBlockModel instead of BasicBlockModel for coverage blocks #9

Closed krisi0903 closed 1 month ago

krisi0903 commented 2 months ago

The BasicBlockModel does not split basic blocks on instructions that branch outside of the function (i.e. calls). This can lead to instructions being highlighted even though they were never executed, typically when a called function never returns. Ghidra offers the SimpleBlockModel, which is a drop-in replacement (it also implements the CodeBlockModel interface) and splits blocks on all flow-breaking instructions. Below is an example where I loaded a coverage file for an ARM binary in which only the basic block at 36c, of length 1, is covered.

Current state using BasicBlockModel: coverage is extended beyond the memset call. [screenshot: cartographer_bbm]

Coverage using SimpleBlockModel: coverage is not extended beyond the memset call. [screenshot: cartographer_sbm]
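To make the difference concrete, here is a minimal, self-contained sketch of the two splitting strategies. It is a toy model and does not use Ghidra's API: the class `BlockSplitSketch`, the `Flow` enum, the helper names, and every address other than 36c are hypothetical and chosen only for illustration. It also assumes that the whole block containing a covered address gets highlighted, which may not match Cartographer's logic exactly.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of block splitting. NOT Ghidra's API; all names are hypothetical.
final class BlockSplitSketch {
    enum Flow { FALL_THROUGH, CALL, JUMP, RETURN }

    static final class Insn {
        final long addr;
        final Flow flow;
        Insn(long addr, Flow flow) { this.addr = addr; this.flow = flow; }
    }

    // Splits a linear instruction list into blocks of addresses.
    // splitOnCalls = false mimics the behaviour described for BasicBlockModel,
    // splitOnCalls = true mimics the behaviour described for SimpleBlockModel.
    static List<List<Long>> split(List<Insn> insns, boolean splitOnCalls) {
        List<List<Long>> blocks = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (Insn insn : insns) {
            current.add(insn.addr);
            boolean endsBlock = insn.flow == Flow.JUMP
                    || insn.flow == Flow.RETURN
                    || (splitOnCalls && insn.flow == Flow.CALL);
            if (endsBlock) {
                blocks.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            blocks.add(current);
        }
        return blocks;
    }

    // Assumption: the whole block containing the covered address is highlighted.
    static List<Long> highlighted(List<List<Long>> blocks, long coveredAddr) {
        for (List<Long> block : blocks) {
            if (block.contains(coveredAddr)) {
                return block;
            }
        }
        return new ArrayList<>();
    }

    // Hypothetical ARM function: the call at 0x374 stands in for the memset call.
    static List<Insn> exampleFunction() {
        List<Insn> fn = new ArrayList<>();
        fn.add(new Insn(0x36cL, Flow.FALL_THROUGH));
        fn.add(new Insn(0x370L, Flow.FALL_THROUGH));
        fn.add(new Insn(0x374L, Flow.CALL));
        fn.add(new Insn(0x378L, Flow.FALL_THROUGH));
        fn.add(new Insn(0x37cL, Flow.RETURN));
        return fn;
    }

    public static void main(String[] args) {
        List<Insn> fn = exampleFunction();
        // Without splitting on calls, the instructions after the call are highlighted too.
        System.out.println(highlighted(split(fn, false), 0x36cL).size()); // prints 5
        // Splitting on calls stops the highlight at the call instruction.
        System.out.println(highlighted(split(fn, true), 0x36cL).size());  // prints 3
    }
}
```

In Ghidra itself the corresponding classes are `BasicBlockModel` and `SimpleBlockModel` in `ghidra.program.model.block`; the sketch only mirrors the splitting behaviour described above.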

datalocaltmp commented 2 months ago

Nice catch! To clarify - the program you've collected coverage for as an illustration throws a segfault at the memset and terminates - but Cartographer continues to highlight the following instructions because it's highlighting on a BasicBlockModel rather than the SimpleBlockModel?

I'm not with nccgroup but I'll merge this into my fork if my understanding is correct - thanks!

krisi0903 commented 2 months ago

Yes, this is precisely what happens. In general, the coverage will be wrong whenever a called function doesn't return.

We are using Cartographer to analyze coverage for firmware fuzzing, so we often crash on a memory access. For the program in the screenshot, the memset might actually be fine; it was just the first example program I had at hand, and I created the coverage manually to include the first basic block of the function.