Open hammad45 opened 12 months ago
Hi Hammad, this looks really nice. I'll try to create some logs with this new mode for DXT in MPI and POSIX as well, but could you share one of your logs for testing too?
Also, since this appears to change the log format, it should bump the log format versions, for example the DXT_*_VER values for the affected modules in darshan-dxt-log-format.h.
I'll try to run some tests and get back with additional feedback.
It looks like this regresses for old Darshan logs. It should not be a big deal to support both formats, but as is, old logs error out in darshan-parser and darshan-dxt-parser, as well as in pydarshan:
Error: failed to read darshan log file header.
Error: darshan_log_open failed to read darshan log file header: Success.
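To illustrate what supporting both formats could look like, here is a minimal sketch of a reader that branches on the module version stored in the log. Everything in it is an assumption made for illustration: the `SKETCH_*` version constants, the `sketch_segment` layout, and `sketch_read_segment` are hypothetical names, not the actual Darshan structures or the layout used in this PR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version numbers; the real values are the DXT_*_VER
 * definitions in darshan-dxt-log-format.h. */
#define SKETCH_DXT_VER_NO_STACKS   1
#define SKETCH_DXT_VER_WITH_STACKS 2

/* Hypothetical in-memory view of one traced segment. */
struct sketch_segment {
    int64_t offset;
    int64_t length;
    int32_t stack_depth; /* stays 0 for logs that predate stack traces */
};

/* Parse one segment from buf according to the version found in the log.
 * Returns the number of bytes consumed, or -1 for an unknown version
 * or a buffer that is too short for that version's layout. */
static long sketch_read_segment(int ver, const unsigned char *buf,
                                size_t len, struct sketch_segment *out)
{
    size_t need;

    if (ver == SKETCH_DXT_VER_NO_STACKS)
        need = 2 * sizeof(int64_t);
    else if (ver == SKETCH_DXT_VER_WITH_STACKS)
        need = 2 * sizeof(int64_t) + sizeof(int32_t);
    else
        return -1;

    if (len < need)
        return -1;

    /* Fields shared by both layouts. */
    memcpy(&out->offset, buf, sizeof(int64_t));
    memcpy(&out->length, buf + sizeof(int64_t), sizeof(int64_t));

    /* The extra field only exists in the newer layout. */
    out->stack_depth = 0;
    if (ver == SKETCH_DXT_VER_WITH_STACKS)
        memcpy(&out->stack_depth, buf + 2 * sizeof(int64_t), sizeof(int32_t));

    return (long)need;
}
```

The point is only that the old layout keeps parsing when the version is lower, and the new fields are read when the version says they are present. Byte swapping and the real record layout are left out.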
I guess a small paragraph for the documentation might be helpful as well. Something along the lines of:
??
export DXT_ENABLE_STACK_TRACE=1
Maybe some other noteworthy remarks from your experience when implementing this :)
Hi Hammad,
Thanks for submitting this PR!
Could you provide some detailed comments/discussion on how exactly the stack traces are collected with this code? I think it would take me some time to grok all the code changes, but it will be easier if I'm able to better understand how this process is intended to be carried out. From a relatively quick first scan, it seems:
Any more elaborations there would be very welcome.
Without understanding the full changes yet, I do have a couple of higher level concerns:
Ultimately storing this stack data in the Darshan log header is almost certainly not what we want to do
DXT_STACKS
) that stores this info
The shutdown process seems pretty inefficient. It looks like the DXT module on each process writes out its own file at module shutdown time, but then, as Darshan is shutting down and writing its log file, rank 0 has to read each of these per-rank files serially.