draperlaboratory / cbat_tools

Program analysis tools developed at Draper on the CBAT project.
MIT License

Initialize the memory pool with its "real" values #335

codyroux closed this issue 3 years ago

codyroux commented 3 years ago

Use BAP to read memory from various constant pools, e.g. .rodata (and possibly others) in order to avoid some false positives.

It seems that the best way to do this is to use Bap.Std.Image, along with the Ogre.Query.select functionality over Scheme.section (or is it Scheme.named_region?), to actually collect all those bytes and assert the relevant memory constraints.
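As a rough illustration of the "assert the relevant memory constraints" step, here is a minimal sketch that uses only the OCaml standard library. It does not call BAP or Ogre: the `section` record, the `constraints_of_section` function, and the `mem` array name are all hypothetical stand-ins, and the SMT-LIB output assumes a memory array with 64-bit addresses and 8-bit values.

```ocaml
(* Hypothetical model of a read-only section: a base address plus raw bytes.
   In CBAT these would come from BAP's image data, not from a literal. *)
type section = { name : string; base : int64; data : string }

(* One SMT-LIB assertion per byte: mem[base + i] = data[i].
   [mem] is the assumed name of the solver's memory array. *)
let constraints_of_section ?(mem = "mem") (s : section) : string list =
  List.init (String.length s.data) (fun i ->
      let addr = Int64.add s.base (Int64.of_int i) in
      Printf.sprintf "(assert (= (select %s #x%016Lx) #x%02x))"
        mem addr (Char.code s.data.[i]))

let () =
  (* Made-up address and contents, for illustration only. *)
  let rodata = { name = ".rodata"; base = 0x4006e0L; data = "hi\000" } in
  Printf.printf "; constraints for %s\n" rodata.name;
  List.iter print_endline (constraints_of_section rodata)
```

Emitting one constraint per byte keeps the sketch simple; a real implementation would probably want to group bytes or restrict itself to the addresses the analyzed code can actually reach.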

Note that this probably conflicts with some of our existing options (notably rewrite-addresses), so we need to put this functionality behind a flag.

I'm also not sure what to do if the data differs between the binaries in the comparative mode.

codyroux commented 3 years ago

Tagging @nickroess for interest.

codyroux commented 3 years ago

First remark: The only way to implement this (it seems) without doing horrible hacks is to ingest a Project.t instead of what we currently ingest, i.e. a pair of program term * Target.t.

This requires a rewrite to the front end of CBAT.

In addition, the reason we work with programs rather than projects is that loading the project multiple times in our large-scale analyses is very slow.

We propose the following solution: if the whole project info is not needed (init-rodata flag not set), we create a dummy project with only the program and target filled in, and cache that if needed.

This could be a bit finicky, since there's no easy way to know if we're loading a cached "fat" project or a cached "thin" one without trying to read some fields. Not sure what the right behavior is (probably check that we are reading from an "expected" cache).
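One way to sidestep the "try to read some fields" problem would be to store an explicit tag next to the cached data and check it on load. The sketch below is only an assumption about how that could look: `kind`, `entry`, `save`, and `load` are hypothetical names, the payload is a plain string rather than a real project, and `Marshal` is used purely for brevity (it is not type-safe, so the caller has to know the payload type).

```ocaml
(* Hypothetical cache entry: which kind of project was stored, plus a payload.
   ['a] stands in for whatever CBAT would actually serialize. *)
type kind = Thin | Fat

type 'a entry = { kind : kind; payload : 'a }

let string_of_kind = function Thin -> "thin" | Fat -> "fat"

let save (path : string) (e : 'a entry) : unit =
  let oc = open_out_bin path in
  Fun.protect ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc e [])

(* Load an entry and reject it when its kind is not the expected one. *)
let load ~(expected : kind) (path : string) : ('a, string) result =
  let ic = open_in_bin path in
  let (e : 'a entry) =
    Fun.protect ~finally:(fun () -> close_in ic)
      (fun () -> Marshal.from_channel ic)
  in
  if e.kind = expected then Ok e.payload
  else
    Error
      (Printf.sprintf "expected a %s project in %s, found a %s one"
         (string_of_kind expected) path (string_of_kind e.kind))

let () =
  let path = Filename.temp_file "cbat_cache" ".bin" in
  save path { kind = Thin; payload = "program+target" };
  (* Asking for a fat project from a thin cache is reported, not guessed at. *)
  (match load ~expected:Fat path with
   | Ok (_ : string) -> print_endline "loaded"
   | Error msg -> print_endline msg);
  Sys.remove path
```

The check here is strict equality on the tag. Whether a "fat" cache should also satisfy a request for a "thin" project is a separate design decision that this sketch does not settle.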