DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

drmemtrace scheduler should synthesize headers for cores that start idle? #6703

Open derekbruening opened 6 months ago

derekbruening commented 6 months ago

The drmemtrace framework stores key information in header records at the start of each trace file, so that parallel worker threads in analysis tools and simulators can access it. However, when there are more output streams than inputs, some outputs start idle and so see no headers at all. (These outputs might later host inputs, so they will not necessarily remain idle.) This causes problems for analyzers that need the version, filetype, cache line size, chunk size, or similar header values in every shard. One solution could be for the scheduler to always read ahead to the first timestamp, store global values of the common header records, and synthesize headers in outputs that start idle.
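
A minimal sketch of that proposal, not the scheduler's actual code: every type and function name below (`RecordKind`, `Record`, `collect_header`, `synthesize_idle_headers`) is hypothetical and stands in for the real drmemtrace records. It assumes that header records are everything before an input's first timestamp, that all inputs agree on the common header values, and that inputs are initially assigned to the lowest-numbered outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record kinds; these are not the real drmemtrace enums.
enum class RecordKind { Version, Filetype, CacheLineSize, ChunkSize, Timestamp, Instr };

struct Record {
    RecordKind kind;
    uint64_t value;
};

// Global copy of the common header values, keyed by record kind.
using GlobalHeader = std::map<RecordKind, uint64_t>;

// Reads ahead in one input up to its first timestamp (or first instruction)
// and stores each header value seen along the way.
void collect_header(const std::vector<Record> &input, GlobalHeader &global)
{
    for (const Record &rec : input) {
        if (rec.kind == RecordKind::Timestamp || rec.kind == RecordKind::Instr)
            break;
        auto result = global.emplace(rec.kind, rec.value);
        // Assumption: every input carries the same common header values.
        assert(result.second || result.first->second == rec.value);
        (void)result;
    }
}

// Returns one record list per output. Outputs that receive an input at the
// start get an empty list (they see the real headers); outputs that start
// idle get a synthesized copy of the global header.
std::vector<std::vector<Record>>
synthesize_idle_headers(const std::vector<std::vector<Record>> &inputs,
                        std::size_t output_count)
{
    GlobalHeader global;
    for (const auto &input : inputs)
        collect_header(input, global);

    std::vector<std::vector<Record>> outputs(output_count);
    for (std::size_t i = inputs.size(); i < output_count; ++i) {
        for (const auto &entry : global)
            outputs[i].push_back(Record{ entry.first, entry.second });
    }
    return outputs;
}
```

With two inputs and four outputs, outputs 2 and 3 would start idle and receive the synthesized records, so an analyzer shard attached to either could still query the version or cache line size before any input is scheduled there.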

Even this is not enough for record_filter to operate core-sharded in #6635, as it also needs the input filename extension to know how to compress its outputs. That particular detail seems reasonable to leave as a burden for record_filter.
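
To illustrate the kind of work that would be left to record_filter, here is a rough sketch of carrying an input's filename extension over to a per-core output name. The helper names (`final_extension`, `shard_output_name`) and the `core.N` naming scheme are hypothetical and are not taken from record_filter itself.

```cpp
#include <string>

// Returns the final extension of input_path, including the dot, or an empty
// string if the file name component has no dot.
std::string final_extension(const std::string &input_path)
{
    std::string::size_type slash = input_path.find_last_of("/\\");
    std::string::size_type name_start =
        (slash == std::string::npos) ? 0 : slash + 1;
    std::string::size_type dot = input_path.rfind('.');
    if (dot == std::string::npos || dot < name_start)
        return "";
    return input_path.substr(dot);
}

// Builds a hypothetical per-core output name that reuses the input's
// extension, so the output can be compressed the same way as the input.
std::string shard_output_name(const std::string &input_path, int core)
{
    return "core." + std::to_string(core) + final_extension(input_path);
}
```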