tud-zih-energy / lo2s

Linux OTF2 Sampling - A Lightweight Node-Level Performance Monitoring Tool
https://tu-dresden.de/zih/forschung/projekte/lo2s?set_language=en
GNU General Public License v3.0

Ring-buffer Inter-Process Interface #302

Open cvonelm opened 10 months ago

cvonelm commented 10 months ago

CUPTI PC Sampling (see #294) can only be done from within the program that executes the CUDA kernels.

This means that implementing CUPTI support in lo2s is only possible by creating a separate CUPTI sampling support library and using LD_PRELOAD to inject it into the application under measurement.

This of course needs some mechanism for the injected library to communicate with lo2s itself, most likely a ring buffer in shared memory.

Since such a foreign interface might be useful beyond CUPTI, I think this inter-process interface warrants its own discussion.

There are two direct questions:

  1. What should the technical solution look like? shm_open + mmap + our own ring buffer implementation, or is there already a turnkey solution for it?
  2. How much genericity should we bake into the design?
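
To make the first option more concrete, here is a minimal sketch of what a shm_open + mmap + hand-rolled ring buffer could look like. It is not existing lo2s code. `RingHeader`, `ShmRing`, and the single-producer/single-consumer byte-stream design are all assumptions made for illustration, as is the convention that both sides agree on the segment name and capacity out of band. It also ignores the startup race where the injected library attaches before lo2s has initialized the header.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <stdexcept>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical header placed at the start of the shared segment.
struct RingHeader
{
    std::atomic<uint64_t> head; // total bytes written, advanced by the producer only
    std::atomic<uint64_t> tail; // total bytes read, advanced by the consumer only
    uint64_t capacity;          // size of the data area in bytes
};
static_assert(std::atomic<uint64_t>::is_always_lock_free,
              "atomics must be lock-free to be usable across processes");

// Single-producer/single-consumer byte ring over POSIX shared memory (sketch).
class ShmRing
{
public:
    // create=true: the owning side (e.g. lo2s); create=false: the attaching side.
    ShmRing(const char* name, uint64_t capacity, bool create)
    : size_(sizeof(RingHeader) + capacity)
    {
        assert(capacity > 0);
        int fd = shm_open(name, create ? (O_CREAT | O_RDWR) : O_RDWR, 0600);
        if (fd == -1) throw std::runtime_error("shm_open failed");
        if (create && ftruncate(fd, static_cast<off_t>(size_)) == -1)
        {
            close(fd);
            throw std::runtime_error("ftruncate failed");
        }
        void* mem = mmap(nullptr, size_, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd); // the mapping stays valid after the descriptor is closed
        if (mem == MAP_FAILED) throw std::runtime_error("mmap failed");
        // Only the creating side constructs the header; head and tail start at 0.
        hdr_ = create ? new (mem) RingHeader{} : static_cast<RingHeader*>(mem);
        if (create) hdr_->capacity = capacity;
        data_ = static_cast<unsigned char*>(mem) + sizeof(RingHeader);
    }
    ShmRing(const ShmRing&) = delete;
    ShmRing& operator=(const ShmRing&) = delete;
    ~ShmRing() { munmap(hdr_, size_); }

    // Producer side: copies len bytes in, or returns false if they do not fit.
    bool write(const void* src, uint64_t len)
    {
        uint64_t head = hdr_->head.load(std::memory_order_relaxed);
        uint64_t tail = hdr_->tail.load(std::memory_order_acquire);
        if (len > hdr_->capacity - (head - tail)) return false;
        uint64_t off = head % hdr_->capacity;
        uint64_t first = std::min(len, hdr_->capacity - off);
        std::memcpy(data_ + off, src, first);
        std::memcpy(data_, static_cast<const unsigned char*>(src) + first, len - first);
        hdr_->head.store(head + len, std::memory_order_release); // publish the data
        return true;
    }

    // Consumer side: copies len bytes out, or returns false if fewer are available.
    bool read(void* dst, uint64_t len)
    {
        uint64_t tail = hdr_->tail.load(std::memory_order_relaxed);
        uint64_t head = hdr_->head.load(std::memory_order_acquire);
        if (len > head - tail) return false;
        uint64_t off = tail % hdr_->capacity;
        uint64_t first = std::min(len, hdr_->capacity - off);
        std::memcpy(dst, data_ + off, first);
        std::memcpy(static_cast<unsigned char*>(dst) + first, data_, len - first);
        hdr_->tail.store(tail + len, std::memory_order_release); // free the space
        return true;
    }

private:
    std::size_t size_;
    RingHeader* hdr_ = nullptr;
    unsigned char* data_ = nullptr;
};
```

In this sketch lo2s would construct `ShmRing(name, capacity, true)` and the injected library `ShmRing(name, capacity, false)`, with the segment removed via `shm_unlink` once the measurement ends. The question of genericity would then mostly come down to what gets layered on top of the raw byte stream, for example a small record header carrying a type and a length. On older glibc versions, linking may additionally require `-lrt`.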