sammlapp opened this issue 2 years ago
Because ONNX requires a numeric tensor input, the input will logically be either (a) raw audio samples or (b) a pre-processed 2D representation of the audio such as a spectrogram, potentially with multiple channels. The advantage of passing the audio sample vector (wav) is that all preprocessing parameters are baked into the model; the only thing the user has to get right is the audio sampling rate. Option (b) gives more flexibility because pre-processing does not need to be packaged into the model, but it creates more opportunity for pre-processing operations and parameters to be lost or mis-implemented when the model changes hands.
PyTorch support for STFT in ONNX export is still a work in progress
Apparently "torch.onnx.dynamo_export" will add some ONNX operators. We could also apparently do some custom handling to map onto existing ONNX functions, but I don't fully understand how (see https://github.com/Alexey-Kamenev/tensorrt-dft-plugins/blob/main/tests/test_dft.py#L35 and https://github.com/pytorch/pytorch/issues/81075#issuecomment-1530713416)
Modulus has done something similar https://github.com/NVIDIA/modulus/blob/main/modulus/models/afno/afno.py#L140
If/when we implement something like this, we will need all preprocessing steps (for inference in the exported ONNX model) to be part of the pytorch model, i.e. layers with forward methods. This raises the question of whether we will end up entirely changing from the use of librosa and scipy to directly using the torchaudio API.
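To illustrate "preprocessing as layers with forward methods", here is a minimal sketch using plain `torch.stft` (torchaudio's `Spectrogram`/`MelSpectrogram` transforms would be the drop-in equivalents); the `n_fft`/`hop_length` values are arbitrary placeholders, not our actual preprocessing parameters:

```python
import torch

# Sketch: spectrogram preprocessing as an nn.Module, so it can live
# inside the model and be carried along in an export
class SpectrogramLayer(torch.nn.Module):
    def __init__(self, n_fft=512, hop_length=256):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        # register the window as a buffer so it travels with the model
        self.register_buffer("window", torch.hann_window(n_fft))

    def forward(self, samples):  # samples: (batch, n_samples)
        spec = torch.stft(
            samples,
            self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=True,
        )
        return spec.abs()  # magnitude spectrogram: (batch, freq, time)

layer = SpectrogramLayer()
spec = layer(torch.randn(2, 32000))
print(spec.shape)  # (batch, n_fft // 2 + 1, n_frames)
```

Whether this layer itself survives ONNX export is exactly the stft-operator-support question tracked in the issues below.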
Apparently this should work now!
there's a new torch issue to follow for fft export to ONNX: https://github.com/pytorch/pytorch/issues/113067
stft + onnx still seems not to be ready (https://github.com/pytorch/pytorch/issues/113067#issuecomment-1892530038)
not sure how stable this library is, but it provides stft, spectrogram, and melspectrogram exportable to ONNX and CoreML https://github.com/adobe-research/convmelspec/tree/main
now follow this: https://github.com/pytorch/pytorch/issues/135087
[ ] We should be able to export a model to ONNX so that someone could run predictions using the model without opensoundscape.
[ ] Second, we should be able to load an ONNX model and generate predictions. If this is best done with a simple torch script rather than by implementing something in opensoundscape, that's fine - we can just add documentation of how to do this
[ ] Third, we should be able to load an ONNX model into opensoundscape such that we could re-train (e.g. warm-starting) within opensoundscape
It may be necessary or at least logical to use torchaudio to incorporate preprocessing steps into the model, as mentioned in #337
Be aware of the numpy & built-in types caveats for the torch.onnx module
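A minimal illustration of that caveat, using `torch.jit.trace` (which shares the same tracing behavior as the TorchScript-based ONNX exporter): Python numbers are evaluated once at trace time, so data-dependent control flow gets frozen into the graph.

```python
import torch

def f(x):
    # .item() yields a Python number: the comparison is evaluated once,
    # at trace time, and the chosen branch is baked into the graph
    if x.sum().item() > 0:
        return x * 2
    return x * -1

# traced with a positive example, so the "* 2" branch is recorded
traced = torch.jit.trace(f, torch.ones(3))
print(traced(-torch.ones(3)))  # still multiplies by 2: tensor([-2., -2., -2.])
```

The same applies to numpy arrays used inside `forward`: they are captured as constants rather than graph inputs.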