kitzeslab / opensoundscape

Open source, scalable software for the analysis of bioacoustic recordings
http://opensoundscape.org
MIT License

save version-agnostic preprocessing for CNN spectrogram inference #963

Closed: sammlapp closed this 1 month ago

sammlapp commented 8 months ago

Although CNN training can involve arbitrarily complicated steps, inference on spectrograms depends on a finite set of parameters. Let's define a JSON format that lets us save and load preprocessing, assuming that you're loading audio files, making a spectrogram, then converting it to a torch.Tensor, and not doing anything outside of the norm.

The API will be something like:

m = CNN(...)
m.save_inference_package("/folder/")
# saves /folder/weights.pt and /folder/model.json,
# which holds the architecture and preprocessor info
m = CNN.load_inference_package("/folder/")
# expects exactly one .pt and one .json file in the folder
# creates the architecture, loads the weights, and regenerates the preprocessing operations

@louisfh has requested this in the past, and I was hesitant because of the unbounded number of possible additions. In reality, though, there is a standard set of preprocessing operations for inference that we can support I/O for.
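To make the idea concrete, here is a minimal sketch of what the model.json half of such a package could look like. Everything in it is an assumption for illustration: the `SpectrogramPreprocessing` fields, the `format_version` key, and the `save_model_json` / `load_model_json` helpers are hypothetical names, not existing opensoundscape API. It uses only the standard library and leaves the weights file out.

```python
import json
from dataclasses import asdict, dataclass, fields
from pathlib import Path

FORMAT_VERSION = 1  # hypothetical schema version stored alongside the parameters


@dataclass
class SpectrogramPreprocessing:
    """Hypothetical, deliberately small set of inference-time parameters."""
    sample_rate: int = 22050
    clip_duration_s: float = 3.0
    window_samples: int = 512
    overlap_fraction: float = 0.5
    min_frequency_hz: float = 0.0
    max_frequency_hz: float = 11025.0
    output_height: int = 224
    output_width: int = 224


def save_model_json(folder, architecture, classes, preprocessing):
    """Write model.json containing plain JSON types only (no pickled objects)."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    payload = {
        "format_version": FORMAT_VERSION,
        "architecture": architecture,
        "classes": list(classes),
        "preprocessing": asdict(preprocessing),
    }
    path = folder / "model.json"
    path.write_text(json.dumps(payload, indent=2))
    return path


def load_model_json(folder):
    """Read the single .json file in the folder and rebuild the parameters."""
    json_files = sorted(Path(folder).glob("*.json"))
    if len(json_files) != 1:
        raise ValueError(f"expected exactly 1 .json file, found {len(json_files)}")
    payload = json.loads(json_files[0].read_text())
    # Keys this version does not know are ignored, and keys that are missing
    # fall back to the dataclass defaults, so files from other versions load.
    known = {f.name for f in fields(SpectrogramPreprocessing)}
    params = {k: v for k, v in payload["preprocessing"].items() if k in known}
    return payload["architecture"], payload["classes"], SpectrogramPreprocessing(**params)
```

Because the file stores only numbers, strings, and lists, loading it does not depend on the class layout of any particular release, which is the point of keeping it separate from the weights.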

If moving preprocessing into the actual torch model becomes supported, this request may be obsolete.

louisfh commented 8 months ago

This would be great! I think the potential problem of people having, e.g., custom preprocessing actions that their model depends on is an edge case. There will probably be lots of people who want to develop a model and then be able to call model.predict with that same model object in a newer opso environment.