Closed woodruffw closed 5 years ago
(A good middle ground between just the main module and instrumenting every module would probably be just instrumenting those modules that correspond to functions selected by the user.)
https://github.com/trailofbits/sienna-locomotive/commit/22bc85359b4b334531747fcabffe65b1707b8ae4#diff-532f2cabe3dad1a76c49c9ab7974b983R71 should work for the middle-ground solution if we comment out the first check in the if statement.
This is not, unfortunately, as simple as the commit linked.
The first issue is that we want the callers, but we're copying the callees. I think the right way to do this is to copy all the module_data_t pointers into one set and keep a separate set with pointers to the ones that call targeted functions. When a function hook fires, we find the module it lives in and add it to the second set if the function is targeted.
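The two-set bookkeeping described above could be sketched roughly as follows. This is a language-agnostic illustration in Python, not the client's actual code: `Module` is a hypothetical stand-in for a copied module_data_t, `ModuleTracker` and its method names are made up for the example, and it assumes the hook can see the caller's address (for instance a return address) so the calling module can be looked up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    # Hypothetical stand-in for a copied module_data_t: name plus [start, end) range.
    name: str
    start: int
    end: int

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


class ModuleTracker:
    def __init__(self, targeted_functions):
        self.targeted = set(targeted_functions)
        self.all_modules = set()      # first set: every module seen at load time
        self.calling_modules = set()  # second set: modules that call a targeted function

    def on_module_load(self, module: Module) -> None:
        self.all_modules.add(module)

    def on_function_hook(self, function_name: str, caller_addr: int) -> None:
        # Assumption: caller_addr is an address inside the calling module.
        if function_name not in self.targeted:
            return
        for module in self.all_modules:
            if module.contains(caller_addr):
                self.calling_modules.add(module)
                return

    def should_instrument(self, addr: int) -> bool:
        # Only blocks inside modules that called a targeted function are measured.
        return any(m.contains(addr) for m in self.calling_modules)
```

A linear scan over the module set is enough for a sketch; a real client would more likely use whatever address-to-module lookup the instrumentation framework already provides.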
The second problem is that our arenas now depend on the set of functions selected instead of just the target binary and arguments. If you change the targeted functions, and it changes which modules we measure, you'll (rightfully) get a wildly different coverage score.
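One way to make that dependency explicit would be to fold the targeted functions into whatever identifies an arena. The sketch below is only an assumption about how that could look, not how arenas are currently keyed: `arena_id` and its parameters are hypothetical names, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib


def arena_id(target_path, arguments, targeted_functions):
    """Derive a hypothetical arena identifier from the target, args, and targeted functions."""
    h = hashlib.sha256()
    # Sort and deduplicate the functions so selection order does not change the ID.
    groups = ([target_path], list(arguments), sorted(set(targeted_functions)))
    for group in groups:
        # Length-prefix each group and item so different splits cannot collide.
        h.update(len(group).to_bytes(4, "little"))
        for item in group:
            data = item.encode("utf-8")
            h.update(len(data).to_bytes(4, "little"))
            h.update(data)
    return h.hexdigest()
```

With a key like this, changing the targeted functions would start a fresh arena instead of comparing coverage scores measured over different sets of modules.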
Right now, we only instrument basic blocks in the main module. That means that if test.exe links to foo.dll and bar.dll, we'll only record a change in coverage caused by a new code path in bar.dll if it also results in a new code path in test.exe.
That assumption has held most of the time, but SumatraPDF provides a counterexample: pretty much all document parsing happens in libmupdf.dll, with the main module (sumatrapdf.exe) only providing the rendering thread, so we're probably missing important coverage changes.
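A toy illustration of the blind spot, with made-up address ranges and block addresses (none of these values come from a real run): two executions differ only by a block inside a DLL, and a main-module-only filter records identical coverage for both.

```python
def recorded_blocks(executed_blocks, instrumented_ranges):
    """Keep only the executed block addresses that fall inside an instrumented range."""
    return {
        addr
        for addr in executed_blocks
        if any(start <= addr < end for start, end in instrumented_ranges)
    }


# Hypothetical [start, end) ranges for the main module and one DLL.
MAIN = (0x1000, 0x2000)  # test.exe
BAR = (0x5000, 0x6000)   # bar.dll

run_a = {0x1010, 0x1020, 0x5010}
run_b = {0x1010, 0x1020, 0x5010, 0x5040}  # new path exists only inside bar.dll

# Main module only: the two runs look the same, so the new path is missed.
missed = recorded_blocks(run_a, [MAIN]) == recorded_blocks(run_b, [MAIN])

# Main module plus bar.dll: the new block shows up as a coverage change.
seen = recorded_blocks(run_a, [MAIN, BAR]) != recorded_blocks(run_b, [MAIN, BAR])
```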