DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

Add first-class support for analyzing core-sharded-on-disk traces #6685

Closed derekbruening closed 4 months ago

derekbruening commented 4 months ago

Split from #6635, where we'll be producing core-sharded trace files from thread-sharded originals and storing them on disk (as opposed to just analyzing the originals under a core-sharded schedule, #5694).

Adding a new type of trace file adds complexity, and we had originally hoped to always re-schedule the original traces, but there are multiple use cases where having the core schedule permanently on disk is useful. One is feeding the traces to simulators that do not use our dynamic scheduler.

The plan is to tap into the existing dynamic core-sharded analysis from #5694. We would add a new filetype marking the trace files as core-sharded. The scheduler will have to read ahead in each input to get the filetype, which may cause headaches with input ordinals.

We'll want to update the invariant checker so we can use it on such traces: #6684.


Pasting from https://github.com/DynamoRIO/dynamorio/issues/6635#issuecomment-1967420976

For analyzing core-sharded-on-disk files: my approach is to add a new filetype indicating core-sharded, and have the scheduler read every input to find its filetype record so the analyzer can check the filetype at init time.
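The read-ahead described above can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual code: the `ReadAheadInput` class, its method names, and the string record kinds are all hypothetical. It buffers whatever it reads while looking for the filetype so those records can still be delivered later, and it keeps separate counts of records read and records delivered, since the gap between the two is where the ordinal headaches come from.

```python
from collections import deque

# Hypothetical record kind used for illustration; not the real trace encoding.
FILETYPE = "filetype"


class ReadAheadInput:
    """Wraps a record iterator and can peek ahead for the filetype record."""

    def __init__(self, records):
        self._source = iter(records)
        self._queue = deque()       # records read ahead but not yet delivered
        self.read_ordinal = 0       # records pulled from the underlying input
        self.delivered_ordinal = 0  # records handed to the consumer

    def find_filetype(self):
        """Read ahead until a filetype record; return its value, or None."""
        # Check records that an earlier read-ahead already buffered.
        for kind, value in self._queue:
            if kind == FILETYPE:
                return value
        for kind, value in self._source:
            self.read_ordinal += 1
            self._queue.append((kind, value))
            if kind == FILETYPE:
                return value
        return None  # input exhausted without a filetype record

    def next_record(self):
        """Deliver buffered records first, then continue with the input.

        Raises StopIteration once the input is exhausted.
        """
        if self._queue:
            record = self._queue.popleft()
        else:
            record = next(self._source)
            self.read_ordinal += 1
        self.delivered_ordinal += 1
        return record


# Made-up input: (kind, value) pairs with the filetype as the fourth record.
inp = ReadAheadInput(
    [("header", 1), ("thread", 7), ("pid", 7), ("filetype", 0x40), ("timestamp", 123)]
)
filetype = inp.find_filetype()
first = inp.next_record()
print(filetype, first, inp.read_ordinal, inp.delivered_ordinal)
```

After the read-ahead, the input has been read four records deep while only one record has been delivered, so anything that reports the input's own record ordinal sees a position ahead of what the consumer has actually received.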

That all works fine for typical runs, but the read-ahead breaks many specific modes:

Breaks online analysis: scheduler init just blocks! => new option read_inputs_in_init, turned off for IPC readers.

Breaks invariant checks because the input record ordinal is too far ahead => using the output ordinal until the first instruction.

Breaks unit tests that have no filetype records => stopping the read-ahead at the page size marker.

Breaks replay-as-traced because the input record ordinal is too far ahead => disabling read-ahead for as-traced.

Breaks the record filter legacy null check: the legacy trace has no filetype record, so the read-ahead goes on to read the timestamp. The stream's last timestamp is then greater than the stop timestamp, and so the resulting output trace has the filter-end marker as its very first record!

$ zcat ../src/clients/drcachesim/tests/drmemtrace.legacy-for-record-filter.x64.tracedir/drmemtrace.threadsig.10506.7343.trace.gz | od -A x -t x2 -w12 | awk '{printf "%s | %s %s %s%s%s%s\n", $1, $2, $3, $7, $6, $5, $4}' | head
000000 | 0019 0000 0000000000000001
00000c | 0016 0004 000000000000290a
000018 | 0018 0004 000000000000290a
000024 | 001c 0002 002ede0c00c77676
000030 | 001c 0003 0000000000000001

=>

$ zcat record_filter_tests_tmp_output/null_filter/drmemtrace.threadsig.10511.8961.trace.gz | od -A x -t x2 -w12 | awk '{printf "%s | %s %s %s%s%s%s\n", $1, $2, $3, $7, $6, $5, $4}' | head
000000 | 001c 0017 0000000000000000
00000c | 0019 0000 0000000000000001
000018 | 0016 0004 000000000000290f
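For readers who prefer not to parse the od/awk pipeline, here is a small Python sketch that produces the same view. It assumes what the pipeline itself implies: each record is 12 little-endian bytes, split into a 2-byte field, another 2-byte field, and an 8-byte value. The names `rec_type`, `rec_size`, `value`, and the `dump_records` helper are labels chosen for this example, not taken from the DynamoRIO sources.

```python
import struct

# Assumed layout, inferred from "od -t x2 -w12" plus the awk field order:
# little-endian 2-byte type, 2-byte size, 8-byte address or value.
RECORD = struct.Struct("<HHQ")


def dump_records(data, limit=10):
    """Return lines formatted like the od/awk pipeline output."""
    lines = []
    # Ignore any trailing partial record.
    usable = len(data) - len(data) % RECORD.size
    for offset in range(0, usable, RECORD.size):
        if len(lines) >= limit:  # mirrors the trailing "| head"
            break
        rec_type, rec_size, value = RECORD.unpack_from(data, offset)
        lines.append(f"{offset:06x} | {rec_type:04x} {rec_size:04x} {value:016x}")
    return lines


# The first three records of the legacy trace dump above, re-encoded by hand.
sample = b"".join(
    RECORD.pack(t, s, v)
    for t, s, v in [(0x19, 0, 1), (0x16, 4, 0x290A), (0x18, 4, 0x290A)]
)
for line in dump_records(sample):
    print(line)
```

To run it on a real trace file, pass in the bytes from `gzip.open(path, "rb").read()`. In the second dump the record at offset 0 is `001c 0017`, which is presumably the filter-end marker that the comment says now comes first.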