nateraw opened this issue 1 week ago
I don't remember. There may have been an ops-level issue at the time. If it works fine, please PR a change.
How would you like me to do it? The easiest way would be to add "CUDAExecutionProvider" as the first item in the providers list here:
Or we could add a new parameter that lets you override the providers, if you want the default CPU-only behavior to remain the same.
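For illustration, a minimal sketch of what that override could look like (the `load_model` name and signature here are hypothetical, not the existing basic-pitch API):

```python
import onnxruntime as ort

# Hypothetical sketch only: `load_model` and its signature are made up for
# illustration; only the onnxruntime calls are real.
def load_model(model_path: str, providers=None) -> ort.InferenceSession:
    if providers is None:
        # Default: prefer CUDA, fall back to CPU.
        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ort.InferenceSession(model_path, providers=providers)
```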
I'll leave it up to @rabitt since I'm not at Spotify anymore (nor do I have maintainer privileges on this repo).
I think using CUDAExecutionProvider would be better, assuming you have a way to detect whether or not a user has a CUDA-capable GPU installed.
We can check whether CUDAExecutionProvider is listed in ort.get_available_providers() and, if so, insert it at index 0 in the providers list; otherwise leave it out, if we want to be explicit.
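Something along these lines (a rough sketch; the model path is a placeholder):

```python
import onnxruntime as ort

providers = ["CPUExecutionProvider"]
# Prepend the CUDA provider only when the installed onnxruntime build exposes it
# (i.e. onnxruntime-gpu is installed and a CUDA-capable GPU is present).
if "CUDAExecutionProvider" in ort.get_available_providers():
    providers.insert(0, "CUDAExecutionProvider")

session = ort.InferenceSession("path/to/model.onnx", providers=providers)
```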
Or, I think it just works the way I described above and falls back to CPU silently... I just tried it in Colab with the CPU-only version of onnxruntime.
GPU inference will only work if onnxruntime-gpu is installed + you have a GPU.
Hi there, it appears it's not possible to do GPU inference with the ONNX provider. It's hard-coded to run on CPU. I noticed that if I just initialize the ONNX model myself and specify the GPU provider, everything works fine and is a lot faster. Is there any reason there isn't an option to use the GPU with the default inference.py reference code? Would you welcome a PR for this, or is it intentionally disabled?
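For reference, a rough sketch of the workaround I'm using (the model path is a placeholder):

```python
import onnxruntime as ort

# Build the session manually and request the GPU provider explicitly;
# onnxruntime falls back to CPU if CUDA isn't available.
session = ort.InferenceSession(
    "path/to/model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # shows which provider was actually selected
```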
Thanks again for your work here, folks! ❤️