DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform

Pretty printer for #trace_entry_t-format traces #6751

Open edeiana opened 3 months ago

edeiana commented 3 months ago

Traces on disk are a sequence of 12-byte entries of type #trace_entry_t. Currently we can only inspect such traces as raw binary dumps, as described here: https://dynamorio.org/page_debug_memtrace.html#autotoc_md136.
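To make the "sequence of 12-byte entries" concrete, here is a minimal sketch of walking such a buffer and printing one line per entry. The field layout is an assumption, not something stated in this thread: a packed little-endian record with a 2-byte type, a 2-byte size, and an 8-byte address or value (2 + 2 + 8 = 12 bytes on a 64-bit build). The names `RawEntry`, `iter_entries`, and `format_entry` are hypothetical, and the sketch expects already-decompressed bytes. It prints raw numbers only and does not map type values to names.

```python
import struct
from typing import Iterator, NamedTuple

# ASSUMPTION: each on-disk entry is a packed little-endian record of
# (uint16 type, uint16 size, uint64 addr). Only the 12-byte total comes
# from the issue text; the field split is assumed for illustration.
ENTRY_FORMAT = "<HHQ"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 12 bytes under this assumption


class RawEntry(NamedTuple):
    """Hypothetical container for one raw 12-byte entry."""
    type: int
    size: int
    addr: int


def iter_entries(data: bytes) -> Iterator[RawEntry]:
    """Yield one RawEntry per complete 12-byte chunk, ignoring a trailing partial chunk."""
    usable = len(data) - (len(data) % ENTRY_SIZE)
    for fields in struct.iter_unpack(ENTRY_FORMAT, data[:usable]):
        yield RawEntry(*fields)


def format_entry(index: int, entry: RawEntry) -> str:
    """Render one entry as a single line of text with raw numeric fields."""
    return (f"{index:>8}: type={entry.type:#06x} "
            f"size={entry.size:>5} addr={entry.addr:#018x}")


if __name__ == "__main__":
    # Two synthetic entries; the numeric values are arbitrary placeholders.
    sample = struct.pack(ENTRY_FORMAT, 1, 4, 0x1000) + struct.pack(ENTRY_FORMAT, 2, 8, 0x2000)
    for i, e in enumerate(iter_entries(sample)):
        print(format_entry(i, e))
```

A real tool would also need the type-to-name mapping and the interpretation of the address field per entry type, which is exactly the detail the on-disk format leaves to the reader.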

We want to implement a tool similar to view (in clients/drcachesim/tools/view.cpp) to "pretty print" traces in the on-disk format. We cannot use the view tool itself because it operates on a higher-level representation of traces (#memref_t, described here: https://dynamorio.org/sec_drcachesim_format.html) and not on #trace_entry_t.

Since tools can operate on both the #memref_t and #trace_entry_t representations of traces (https://dynamorio.org/sec_drcachesim_newtool.html), we believe a visualization tool for #trace_entry_t traces will be helpful for developers working on tools that manipulate #trace_entry_t.

derekbruening commented 2 months ago

Note that trace_entry_t was originally not meant to be a public interface at all. Even now, with the record_filter_t tool operating on it, nearly all uses should be at the memref_t level. We want to discourage new tools, such as converters from our format to other formats, from using trace_entry_t: they will end up duplicating the caching of encodings, v2p, and other non-repeated items, or they will make mistakes. That is one reason we never wanted it to be well-documented, publicized, or supported externally.

derekbruening commented 2 months ago

Pasting from https://github.com/DynamoRIO/dynamorio/pull/6771#discussion_r1571715972

trace_entry_t is not meant to be a public interface with guaranteed stability. The public interface is the memref_t record and reader library and the libraries on top of those like the analysis tools. trace_entry_t is supposed to be treated more like a black box by users. Its individual fields are not doxygen-commented.

Hence, providing a pretty printer seems to send the wrong message, encouraging users to peer inside the black box. We would prefer things like converters to other simulator formats to use memref_t + reader_t whose abstraction layer hides a number of messy details.

The record_filter tool can be used while keeping trace_entry_t a black box, as basically all of its filters operate above the memref_t abstraction level: remove marker types; trim to timestamp values; apply cache filters. So it is a little different.

derekbruening commented 2 months ago

Further observations: