ER: Continuous run mode

stan-moravec commented 3 months ago

It would be useful to have a mode of operation where we would run forever and keep only the last X seconds of data. That would allow us to stop collecting on some rare event and have traces that capture the time around it. (A secondary enhancement might be to watch dmesg for a given message and stop the collection, but this can be done externally for a start.) Maybe just log to memory circularly and write only on an exit signal (the span could be limited to a single buffer, but that could be enough...), or write regularly plus on the exit signal and keep only 2 trace files. Whatever.

Just an idea for consideration: a feature that could be great for some scenarios (think of a cluster daemon not being scheduled once a day, for example). And, btw, THANKS for all your work on this tool.

kachini commented 3 months ago

heh.. sounds like old HP-UX KI feature.

Glad to see you, Stan :)

MarkCRay commented 3 months ago

Thanks for the submission. I now have a working prototype, so hopefully this will make it into the 7.10 release. Due to the size of the kernel ring buffer, only the last 1 second or less may be logged, but it depends on the activity on each CPU. Filtering with the runki script or kiinfo may allow a longer span of time to be logged.
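
To illustrate why the retained span depends on per-CPU activity, here is a rough back-of-the-envelope sketch. The buffer size, event rates, and record size below are made-up numbers for illustration only, not LinuxKI's actual values, and `retention_ms` is a hypothetical helper:

```shell
#!/bin/sh
# Rough estimate of how much history a fixed-size per-CPU ring buffer holds.
# All inputs are hypothetical; they are NOT LinuxKI's real buffer or record sizes.

# retention_ms BUFFER_BYTES EVENTS_PER_SEC BYTES_PER_EVENT
# Prints the approximate number of milliseconds of trace data that fit
# in the buffer before the oldest records are overwritten.
retention_ms() {
    echo $(( $1 * 1000 / ($2 * $3) ))
}

retention_ms 1048576 50000 100   # busy CPU:        prints 209
retention_ms 1048576 2000 100    # mostly idle CPU: prints 5242
```

With the same buffer, a busy CPU keeps roughly a fifth of a second while a quiet one keeps several seconds, which is why filtering out events can stretch the window.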

MarkCRay commented 2 weeks ago

This feature is included in the 7.10 release (Oct 14, 2024). To begin continuous logging:

$ runki -d 0

To stop continuous logging and have the runki script complete the KI dump and collect the data:

$ kiinfo -likiend

Due to the size of the in-memory kernel ring buffers, you may get a few seconds of data or 200 milliseconds of data (or less).
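
The external trigger mentioned in the original request (watch dmesg for a message, then stop the collection) could be sketched roughly as follows. This is only an assumption about how one might wire it up, not part of LinuxKI; `stop_on_match` is a hypothetical helper, and the message text in the usage comment is a placeholder:

```shell
#!/bin/sh
# Sketch: run a stop command as soon as a given text shows up on stdin.
# stop_on_match is a hypothetical helper, not something shipped with LinuxKI.

# stop_on_match PATTERN COMMAND [ARGS...]
# Reads lines from stdin; on the first line containing PATTERN (literal
# substring match), runs COMMAND and returns 0. Returns 1 if stdin ends
# without a match.
stop_on_match() {
    pattern=$1
    shift
    while IFS= read -r line; do
        case $line in
            *"$pattern"*)
                "$@"
                return 0
                ;;
        esac
    done
    return 1
}

# Possible usage, with "runki -d 0" already running in another terminal
# (the message text is a placeholder):
#   dmesg --follow | stop_on_match "some rare message" kiinfo -likiend
```

Note that `dmesg --follow` first prints what is already in the kernel log, so a matching message from earlier would trigger the stop immediately.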

stan-moravec commented 2 weeks ago

Thanks!

HewlettPackard / LinuxKI

ER: Continuous run mode #70