Store pytorch-lightning profiling output

GeorgePearse commented 2 years ago

🚀 Feature

pytorch-lightning has a nice profiling tool. I'd like to store its output alongside the rest of my experiment tracking; I haven't seen many tools do that well. https://pytorch-lightning.readthedocs.io/en/1.6.1/advanced/profiler.html

Ideally it would log the profiler output after the first few epochs rather than waiting until the training run is complete.

Motivation

This is the sort of thing where you want to look back over many runs and see which feature caused a particular slowdown.

It would also show how the same training compares across different machines.

Pitch

This seems like a feature that fits very well in the experiment tracking API, but it's often missing from other providers. It's not easy to write yourself, because pytorch-lightning only outputs the profiling summary after fit completes.

Additional context

It might make sense to have a generic table output and build this on top of that, if one doesn't already exist. TensorBoard has a similar implementation; I'll go looking for it, link it here, and try it out.
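
As a rough sketch of what the rows of such a generic table could look like: this assumes the timings are available as a mapping from action name to a list of durations in seconds (I believe that is roughly what the simple profiler keeps internally, but I haven't verified it). The function name and column names are made up for illustration.

```python
from statistics import mean


def summarize_durations(recorded):
    """Turn {action_name: [seconds, ...]} into table rows, slowest total first.

    `recorded` is an assumed input shape, not a confirmed pytorch-lightning API.
    """
    rows = []
    for action, durations in recorded.items():
        if not durations:
            continue  # skip actions that were registered but never timed
        rows.append(
            {
                "action": action,
                "calls": len(durations),
                "mean_s": mean(durations),
                "total_s": sum(durations),
            }
        )
    # Sort so the most expensive actions appear at the top of the table.
    rows.sort(key=lambda row: row["total_s"], reverse=True)
    return rows
```

Rows in this shape could be stored per run and then compared across runs or machines.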

GeorgePearse commented 2 years ago

I think an "Aim" profiler could be written. It could either be fed the logger or be given the host directly. The first feels cleaner?
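
A minimal sketch of the "feed it the logger" option, not tied to any real class: `LoggerFedProfiler` and its `track` argument are hypothetical names. It mirrors the `start(action_name)` / `stop(action_name)` pair that Lightning profilers expose, and forwards each duration to the tracker as soon as the action finishes, so nothing waits until fit ends. I'd expect something like Aim's `Run.track` to fit the `track(value, name=..., step=...)` shape, but that is an assumption to check against the Aim docs.

```python
import time
from collections import defaultdict


class LoggerFedProfiler:
    """Hypothetical profiler that forwards each timing to a tracking callable."""

    def __init__(self, track, clock=time.perf_counter):
        self._track = track  # any callable accepting (value, name=..., step=...)
        self._clock = clock  # injectable so the timing can be tested
        self._started = {}
        self._counts = defaultdict(int)

    def start(self, action_name):
        self._started[action_name] = self._clock()

    def stop(self, action_name):
        began = self._started.pop(action_name, None)
        if began is None:
            return  # stop() without a matching start(); nothing to report
        duration = self._clock() - began
        step = self._counts[action_name]
        self._counts[action_name] += 1
        # Report immediately instead of collecting a summary for the end of fit.
        self._track(duration, name=action_name, step=step)
```

To actually plug this into a Trainer it would need to subclass Lightning's profiler base class; the sketch only shows the data flow from profiler to logger.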

GeorgePearse commented 2 years ago

Here's the TensorBoard version: https://pytorch.org/tutorials/intermediate/tensorboard_profiler_tutorial.html

aimhubio / aim