Doesn't use GPU when hosted on Google Colab with GPU acceleration

thammegowda / nllb-serve

Meta's "No Language Left Behind" models served as web app and REST API

http://rtg.isi.edu/nllb/


Closed saitanay closed 1 year ago

saitanay commented 1 year ago

I have this set up on Google Colab with a GPU-accelerated runtime, but it doesn't seem to use the GPU when handling POST requests.

Is my understanding correct that it uses CUDA/GPU only in batch mode, when run from the console?

thammegowda commented 1 year ago

@saitanay You were right! It was only using the CPU backend. I had created it as a demo and never really intended to deploy it anywhere. Anyway, considering how much interest it has triggered lately, I fixed it: https://github.com/thammegowda/nllb-serve/commit/846541b022ff2573e4fcdaeaae3d695d829389aa

Please check whether the latest code works, and close this issue if it is fixed or let me know if there are more issues.
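
For readers who hit the same symptom in their own serving code, here is a minimal sketch of the usual PyTorch device-selection pattern. It is not the content of the commit linked above. The helper names `pick_device` and `to_device` are hypothetical, and it assumes the model and its tokenized inputs both expose a `.to(device)` method, as PyTorch modules and tensors do.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so fall back to CPU
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def to_device(model, batch: dict, device: str):
    """Move a model and its tokenized inputs to the same device.

    Hypothetical helper: `model` and every value in `batch` are assumed
    to provide `.to(device)`.
    """
    model = model.to(device)
    batch = {name: tensor.to(device) for name, tensor in batch.items()}
    return model, batch


if __name__ == "__main__":
    print("inference device:", pick_device())
```

The point of the sketch is that the request handler has to place both the model and each incoming batch on the GPU. If either stays on the CPU, inference either runs on the CPU or fails with a device-mismatch error.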

saitanay commented 1 year ago

Yes, working as expected, and leveraging GPU for REST API as well.

And the API is super fast compared to earlier. Love it!