StanfordLegion / legion

The Legion Parallel Programming System
https://legion.stanford.edu
Apache License 2.0

legion_prof improvements #1551

Open syamajala opened 1 year ago

syamajala commented 1 year ago

Here are some improvements that would be nice to have for the new legion_prof:

  1. Color instances using a heat map based on size so we can easily tell which ones are large and which are small. From my understanding, the coloring is currently random. Ideally they would also be sorted vertically by size, but I guess that's not possible unless they are created at the same nanosecond.
  2. The web viewer needs error messages when it can't load a profile, whether because of an invalid URL or bad permissions.
  3. Fix permissions for profiles generated on Perlmutter. I guess they are not world-readable by default, which is very annoying.
  4. A mode for dumping a profile to SQLite so you can scrape profiles, write post-processing scripts, and generate additional plots.
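
To make item 4 concrete, here is a minimal sketch of what a SQLite dump plus a post-processing query could look like. Everything in it is an assumption for illustration: legion_prof has no such mode today, and the `instances` table, its columns, and the sample rows are hypothetical. It only uses Python's standard `sqlite3` module.

```python
import sqlite3

# Hypothetical instance records; a real dump would come from the profiler's
# parsed logs. Columns: (inst_id, memory, size_bytes, create_ns, destroy_ns).
INSTANCES = [
    (1, "FB_MEM n0 g0", 512 * 1024**2, 1_000, 9_000),
    (2, "SYS_MEM n0", 4 * 1024, 2_000, 3_000),
    (3, "FB_MEM n0 g1", 64 * 1024**2, 2_500, 8_000),
]


def dump_instances(conn, rows):
    """Write instance records into a (hypothetical) `instances` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS instances ("
        "inst_id INTEGER PRIMARY KEY, memory TEXT, size_bytes INTEGER, "
        "create_ns INTEGER, destroy_ns INTEGER)"
    )
    conn.executemany("INSERT INTO instances VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


def largest_instances(conn, limit):
    """Example post-processing query: the `limit` largest instances."""
    cur = conn.execute(
        "SELECT inst_id, memory, size_bytes FROM instances "
        "ORDER BY size_bytes DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


# An in-memory database keeps the sketch self-contained; a real dump would
# pass a file path instead.
conn = sqlite3.connect(":memory:")
dump_instances(conn, INSTANCES)
top = largest_instances(conn, 2)
```

With the data in a table like this, sorting or bucketing instances by size (as in item 1) becomes a one-line query rather than a viewer feature.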

syamajala commented 1 year ago

It would be nice if https://legion.stanford.edu/prof-viewer/ were a landing page where you could type in the URL of the profile you want to load.

syamajala commented 1 year ago

It would be nice if you could just export plots as PNG from legion_prof, maybe like how Plotly has a little camera in the top right corner of every plot.

suranap commented 11 months ago

I'll add some ideas here.

  1. Some kind of title and/or memo field so people know what this profile is about.
  2. A one-line summary of the hardware config: e.g. 8 nodes, 4 GPUs each, etc.
  3. Some aggregation/query features, e.g. what's the average time spent in task A during this window of time?
  4. A way to output structured log files (e.g. JSON) so users can feed them into other log analysis tools if needed.
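
As a rough illustration of items 3 and 4, the sketch below computes the average time spent in one task within a time window and writes the same records out as JSON lines. The record layout `(task name, start_ns, stop_ns)`, the field names, and both helper functions are hypothetical; nothing here reflects an existing legion_prof output format. One assumption worth flagging: executions that straddle the window boundary are clipped to the window rather than dropped.

```python
import json

# Hypothetical task records: (task name, start_ns, stop_ns).
TASKS = [
    ("A", 0, 100),
    ("A", 150, 400),
    ("B", 50, 120),
    ("A", 900, 1000),
]


def average_time_in_window(tasks, name, win_start, win_stop):
    """Mean time (ns) per execution of `name`, counting only the portion of
    each execution that falls inside [win_start, win_stop]."""
    clipped = []
    for task_name, start, stop in tasks:
        if task_name != name:
            continue
        overlap = min(stop, win_stop) - max(start, win_start)
        if overlap > 0:  # skip executions entirely outside the window
            clipped.append(overlap)
    return sum(clipped) / len(clipped) if clipped else 0.0


def to_json_lines(tasks):
    """Serialize records as one JSON object per line for external tools."""
    return "\n".join(
        json.dumps({"task": n, "start_ns": s, "stop_ns": e})
        for n, s, e in tasks
    )


avg = average_time_in_window(TASKS, "A", 50, 300)
lines = to_json_lines(TASKS)
```

One JSON object per line is only one possible choice; it is used here because most log analysis tools can ingest it without a custom parser.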

lightsighter commented 11 months ago

I think (1) was already handled: https://github.com/StanfordLegion/prof-viewer/pull/28

In general, I think we should start moving feature requests like this over to the profiler repo. That applies at least to the visualization-based ones; requests for other output formats can stay here.