accel-sim / accel-sim-framework

This is the top-level repository for the Accel-Sim framework.
https://accel-sim.github.io

Google ProtoBuf for Accel-Sim trace file #280

Open William-An opened 7 months ago

William-An commented 7 months ago

The current Accel-Sim trace format is text-based and consumes an unnecessarily large amount of storage. Although the compression trick in accel-sim#265 can cut the trace size to 1/10, a text-based format has another problem: it is complex to consume. A parser has to be built to convert the text traces into binary data structures for Accel-Sim or other analysis tools. Since traces are mostly used to drive the simulator rather than read by humans, it is more convenient to store them in binary form and build a decoder that converts them to text when needed.

Therefore, I propose using Google ProtoBuf as a substitute for the text trace format. It has the following benefits over the current format:

  1. Compact in size and compatible with general-purpose compression, so the compression PR still applies.
  2. Multi-language support (C++, C#, Python, Java, etc.), which removes the need to write parsers for analysis tools designed to work with our traces.
  3. Supports trace streaming, so the simulator does not need to read the whole file into memory, reducing runtime memory usage.
  4. Backward compatibility: if some rules are followed, old code can read new trace files just fine, so we only need to keep one copy of each trace file for all past simulator versions.
  5. Proven in existing projects: used by the gem5 simulator (Trace CPU) and the Nsight Compute profiler for trace collection.
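
To make points 1 and 3 concrete, here is a minimal sketch of streaming length-prefixed binary records in plain Python. It does not use the ProtoBuf library. It only mimics the common pattern of prefixing each serialized message with a varint length so a reader can pull one record at a time. The record layout (`pc`, active mask, opcode id) and all function names are assumptions made up for illustration, not the Accel-Sim trace format.

```python
import io
import struct
from typing import BinaryIO, Iterable, Iterator, Optional, Tuple

# Hypothetical record layout (NOT the real Accel-Sim format):
# pc (u32), active mask (u32), opcode id (u16), little-endian, no padding.
RECORD = struct.Struct("<IIH")
Record = Tuple[int, int, int]


def write_varint(out: BinaryIO, value: int) -> None:
    """Write a non-negative int as a base-128 varint (7 bits per byte)."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varint(inp: BinaryIO) -> Optional[int]:
    """Read one varint; return None on a clean end of stream."""
    result, shift = 0, 0
    while True:
        raw = inp.read(1)
        if not raw:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        result |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            return result
        shift += 7


def write_records(out: BinaryIO, records: Iterable[Record]) -> None:
    """Write each record as <varint length><payload>."""
    for rec in records:
        payload = RECORD.pack(*rec)
        write_varint(out, len(payload))
        out.write(payload)


def stream_records(inp: BinaryIO) -> Iterator[Record]:
    """Yield records one at a time without loading the whole file."""
    while True:
        size = read_varint(inp)
        if size is None:
            return
        payload = inp.read(size)
        if len(payload) != size:
            raise EOFError("truncated record")
        yield RECORD.unpack(payload)
```

Because `stream_records` is a generator, a consumer holds only one record in memory at a time, which is the property point 3 relies on. With a real ProtoBuf schema, the payload would be a serialized message instead of a fixed `struct`.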

tgrogers commented 7 months ago

I like the idea. I have not used protobufs before. Are they stored in binary format on disk? If so, are there utilities to parse and print the trace so we can still do manual text inspection on demand? It is quite convenient sometimes to just look at the traces and see what they are doing.

William-An commented 7 months ago

  1. Yes, protobufs are stored in binary format on disk.
  2. I don't think they provide an official tool for that, but it should be easy to build one ourselves: basically just add a bunch of prints to the generated parser.
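
As a rough idea of how small such a dump tool could be, here is a sketch in plain Python. It assumes a hypothetical fixed-size binary record (`pc`, active mask, opcode id) and a made-up opcode table. Neither is the real Accel-Sim trace layout or a ProtoBuf-generated parser. The point is only that decode-then-print is a few lines once the binary layout is known.

```python
import io
import struct
from typing import BinaryIO, Iterator

# Hypothetical fixed-size record (NOT the real Accel-Sim format):
# pc (u32), active mask (u32), opcode id (u16), little-endian.
RECORD = struct.Struct("<IIH")

# Hypothetical opcode table, used only to make the text output readable.
OPCODES = {0: "NOP", 1: "LDG", 2: "FADD"}


def dump_text(inp: BinaryIO) -> Iterator[str]:
    """Decode binary records and yield one human-readable line per record."""
    while True:
        chunk = inp.read(RECORD.size)
        if not chunk:
            return  # clean end of file
        if len(chunk) != RECORD.size:
            raise ValueError("truncated record at end of trace")
        pc, mask, op = RECORD.unpack(chunk)
        name = OPCODES.get(op, "OP" + str(op))  # fall back to the raw id
        yield f"{pc:04x} {mask:08x} {name}"
```

Usage would be along the lines of `for line in dump_text(open(path, "rb")): print(line)`, so manual inspection stays a one-liner even when the on-disk format is binary.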