ML4GW / aframev2

Detecting binary black hole mergers in LIGO with neural networks
MIT License

Support for `TorchScript` exporting of `aframe` model #282

Closed EthanMarx closed 1 week ago

EthanMarx commented 2 weeks ago
  1. Makes export platform configurable on the law side
  2. Updates the export project to be compatible with TorchScript
  3. Removes hermes as submodule
  4. Binds AFRAME_ env vars corresponding to directories so that users can point to one another's files
  5. Fixes issue in recently added WandbSaveConfig callback
  6. Updates TensorRT in export to 8.5.2.2 to be compatible with Triton 23.01
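
As a rough illustration of item 4, here is a minimal sketch of collecting directory locations from `AFRAME_`-prefixed environment variables. The helper name `aframe_dirs` and the variable names in the usage example are hypothetical and are not taken from this PR. Only the `AFRAME_` prefix comes from the description above.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

PREFIX = "AFRAME_"  # prefix named in the PR description


def aframe_dirs(env: Optional[Mapping[str, str]] = None) -> Dict[str, Path]:
    """Map AFRAME_-prefixed variables to paths, keyed by the lowercased suffix.

    Hypothetical helper: the real project may bind these variables differently.
    """
    env = os.environ if env is None else env
    return {
        name[len(PREFIX):].lower(): Path(value)
        for name, value in env.items()
        # skip unrelated variables and ones that are set but empty
        if name.startswith(PREFIX) and value
    }


# Example with made-up variable names: one user pointing at another's data.
example_env = {
    "AFRAME_TRAIN_DATA_DIR": "/data/alice/train",
    "HOME": "/home/bob",
}
print(aframe_dirs(example_env))
```

Reading from a mapping rather than directly from `os.environ` keeps the helper easy to exercise without touching the real environment.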

There is a nasty memory leak in older Triton containers (22.XX) when making multi-GPU inference requests to a model exported with TorchScript and hosted with the libtorch backend. The leak went away in 23.01.

@wbenoit26 Ready to look over. I snuck in a couple of other fixes here; happy to strip them out into separate PRs if that's easier.

EthanMarx commented 2 weeks ago

@wbenoit26 this is ready for review. Before it's merged, I want to do a small test run directly comparing the TensorRT and TorchScript exports.

wbenoit26 commented 2 weeks ago

And you can also remove hermes from the known_third_party list

EthanMarx commented 1 week ago

@wbenoit26 Added `sample_rate` and `kernel_length` as arguments to all models, since the S4 model requires them before runtime. These are linked from the data module via the CLI.
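
To make the reasoning concrete: a model that must know its input length at construction time can derive it from these two arguments. This is a minimal sketch, assuming `sample_rate` is in Hz and `kernel_length` is in seconds (the comment above does not state units). The `KernelSpec` class is hypothetical and not part of the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelSpec:
    """Hypothetical container for the two arguments mentioned above."""

    sample_rate: float    # assumed to be in Hz
    kernel_length: float  # assumed to be in seconds

    @property
    def num_samples(self) -> int:
        """Number of samples per input kernel, known before any data is seen."""
        n = self.sample_rate * self.kernel_length
        if n <= 0 or abs(n - round(n)) > 1e-9:
            raise ValueError(
                "sample_rate * kernel_length must be a positive whole number"
            )
        return int(round(n))


spec = KernelSpec(sample_rate=2048, kernel_length=1.5)
print(spec.num_samples)  # 3072
```

Rejecting non-integer products up front surfaces a mismatched configuration at construction time rather than as a shape error during the first forward pass.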