Ekumen-OS / lambkin

Apache License 2.0

Truncated timemory jsons #111

Open glpuga opened 3 weeks ago

glpuga commented 3 weeks ago

Bug description

Running 24-hour bagfiles with timemory profiling enabled produces broken records, because the timemory output JSON files get truncated and can no longer be loaded.

Manually fixing them by emptying (not removing) the "history" section makes them usable again.

An untested solution may be to increase the SIGTERM and SIGKILL timeouts in ros2 launch.
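
For context, ros2 launch stops a process by escalating signals: SIGINT first, then SIGTERM after one timeout, then SIGKILL after a second one. The sketch below reproduces that pattern with the standard library only, to show why longer timeouts would give the profiler more time to finish writing. It is not the launch implementation; `stop_with_escalation` and the default timeout values are hypothetical.

```python
import signal
import subprocess


def stop_with_escalation(proc, sigterm_timeout=5.0, sigkill_timeout=5.0):
    """Stop `proc` with SIGINT, escalating to SIGTERM and then SIGKILL.

    Returns the name of the last signal that had to be sent.
    (Hypothetical helper, POSIX only.)
    """
    # Ask politely first: a well-behaved process flushes its output and exits.
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=sigterm_timeout)
        return "SIGINT"
    except subprocess.TimeoutExpired:
        pass

    # Still running after sigterm_timeout seconds: escalate to SIGTERM.
    proc.terminate()
    try:
        proc.wait(timeout=sigkill_timeout)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        pass

    # Last resort: SIGKILL cannot be handled, so pending writes are lost.
    proc.kill()
    proc.wait()
    return "SIGKILL"
```

If the process needs longer than `sigterm_timeout` to write its output, it is interrupted mid-write, which would be consistent with the truncated files described above.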

hidmic commented 1 week ago

> An untested solution may be to increase the SIGTERM and SIGKILL timeouts in ros2 launch.

Yeah. It's brittle but it's an option. The actual solution would be for the profiler to do incremental writes to storage. Considering https://github.com/NERSC/timemory has been recently archived, perhaps there are other profiling tools we can use.
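
To illustrate what incremental writes could look like, here is a minimal sketch that appends one JSON record per line and forces each record to disk, so an abrupt kill loses at most the last partial line. This is not how timemory works or an API it provides; `IncrementalRecorder`, `load_records`, and the record fields used with them are hypothetical.

```python
import json
import os


class IncrementalRecorder:
    """Appends one JSON record per line and forces each one to disk."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def write(self, record):
        # One self-contained JSON document per line (JSON Lines layout).
        self._file.write(json.dumps(record) + "\n")
        # Push the data out of Python's buffer and ask the OS to persist it.
        self._file.flush()
        os.fsync(self._file.fileno())

    def close(self):
        self._file.close()


def load_records(path):
    """Load every complete record, stopping at a truncated final line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                break  # partial trailing line left by an abrupt kill
    return records
```

The trade-off is an `fsync` per record, which may be too costly at high sampling rates; batching several records per sync would be one way to soften that.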

glpuga commented 1 week ago

The information we want from the JSON is actually already printed by timemory on standard output. Can we get it from there instead?

hidmic commented 1 week ago

> The information we want from the JSON is actually already printed by timemory on standard output. Can we get it from there instead?

Memory might be failing me, but I think we did not do that because it made it hard to separate timem output from that of the underlying process.