Open skyreflectedinmirrors opened 1 year ago
Use cases:
- There is often significant run-to-run variation in an application due to inherent randomness, e.g., in Monte Carlo simulations.
Well, realistically, a Monte Carlo application (or really any stochastic simulation) should have a way to explicitly specify the seeds for its RNG; otherwise, its developers basically wouldn't be able to do any validation.
Why do we even need to rely on rocprof to do application replay? Whole-application replay is trivial to implement without forking: an LD_PRELOAD library with a wrapper around __libc_start_main, plus one env variable specifying the total number of replays and another specifying the current replay count. If current < total, increment the current replay count env variable (and anything else) and recursively call execvpe.
Basically, you'd just build a library with something like main.c in omnitrace and implement that logic after the call to main_real.
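For illustration, here is a minimal sketch of that idea. It is not omnitrace's main.c; the env variable names REPLAY_TOTAL and REPLAY_CURRENT and the helper next_replay are made up for this example, and it assumes glibc on Linux (on older glibc you would link with -ldl).

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

extern char **environ;

typedef int (*main_fn)(int, char **, char **);

/* The application's real main(), captured by the hook below. */
static main_fn main_real = NULL;

/* Hypothetical helper: returns the next replay index (current + 1) if
 * another replay is due, or -1 otherwise. A missing variable counts as 1. */
static long next_replay(const char *total_str, const char *current_str)
{
    long total   = total_str ? strtol(total_str, NULL, 10) : 1;
    long current = current_str ? strtol(current_str, NULL, 10) : 1;
    if (current < 1) current = 1;
    return (current < total) ? current + 1 : -1;
}

/* Runs the real main(), then re-executes the program if replays remain. */
static int main_wrapper(int argc, char **argv, char **envp)
{
    int ret = main_real(argc, argv, envp);

    /* REPLAY_TOTAL / REPLAY_CURRENT are assumed names, not an existing API. */
    long next = next_replay(getenv("REPLAY_TOTAL"), getenv("REPLAY_CURRENT"));
    if (next > 0) {
        char buf[32];
        snprintf(buf, sizeof buf, "%ld", next);
        setenv("REPLAY_CURRENT", buf, 1);
        fflush(NULL);                    /* exec does not flush stdio buffers */
        execvpe(argv[0], argv, environ); /* only returns on failure */
        perror("execvpe");
    }
    return ret;
}

/* Interposed via LD_PRELOAD: swap the application's main() for main_wrapper()
 * and forward everything else to the real __libc_start_main. */
int __libc_start_main(main_fn app_main, int argc, char **argv,
                      main_fn init, void (*fini)(void),
                      void (*rtld_fini)(void), void *stack_end)
{
    typedef int (*start_fn)(main_fn, int, char **, main_fn,
                            void (*)(void), void (*)(void), void *);
    start_fn real_start = (start_fn) dlsym(RTLD_NEXT, "__libc_start_main");

    main_real = app_main;
    return real_start(main_wrapper, argc, argv, init, fini, rtld_fini, stack_end);
}
```

Built with something like `gcc -shared -fPIC replay.c -o libreplay.so` and run as `REPLAY_TOTAL=3 LD_PRELOAD=./libreplay.so ./app` (file names are placeholders), the application would execute three times. Note that this sketch only triggers when main returns; an application that calls exit() directly skips the replay logic, and atexit handlers do not run before the exec.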
That's an interesting thought. One does wonder what the heck rocprof would make of multiple runs inside the same process with different counter sets, as it's the one who's actually cycling through various sets of counters. It seems like that would work well with a rocprofiler tool wrapper where we control the collected counters, though.
execve basically replaces the current program with a new program:
> execve() executes the program referred to by pathname. This causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments.
> as it's the one who's actually cycling through various sets of counters.
This doesn't sound particularly complicated to me once you figure out the number of HW counter slots available. And it would theoretically allow us to create a scheme similar to how omnitrace uses the PID to tag output file names and support multiprocess collection.
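As a rough sketch of what that could look like, the helpers below split a flat list of counters into per-replay sets and build an output name tagged with the PID and replay index. Everything here is hypothetical: the function names, the file name pattern, and especially the assumption of a single flat slot count (real hardware usually limits counters per block, which this ignores).

```c
#include <stddef.h>
#include <stdio.h>

/* Number of replays needed to cover n_counters with `slots` counters
 * per run, i.e. ceil(n_counters / slots). Returns 0 if slots is 0. */
static size_t replays_needed(size_t n_counters, size_t slots)
{
    return slots ? (n_counters + slots - 1) / slots : 0;
}

/* Half-open index range [*begin, *end) of the counters collected in the
 * given 1-based replay. The range is empty once all counters are covered. */
static void counter_range(size_t n_counters, size_t slots, size_t replay,
                          size_t *begin, size_t *end)
{
    if (replay == 0) replay = 1; /* treat 0 as the first replay */
    size_t b = (replay - 1) * slots;
    if (b > n_counters) b = n_counters;
    size_t e = b + slots;
    if (e > n_counters) e = n_counters;
    *begin = b;
    *end = e;
}

/* Builds a name like "<stem>-<pid>-r<replay>.txt"; the pattern is an
 * assumption made for this example, not omnitrace's actual naming scheme. */
static int tagged_output_name(char *buf, size_t len, const char *stem,
                              long pid, size_t replay)
{
    return snprintf(buf, len, "%s-%ld-r%zu.txt", stem, pid, replay);
}
```

With 10 counters and 4 slots this gives 3 replays, the last of which collects only counters 8 and 9. In a real tool the replay index would come from whatever mechanism drives the replays (for example the current replay count env variable) and the PID from getpid().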
Some possible short-term solutions: