saitanay closed this issue 1 year ago
@saitanay You were right! The REST API was only using the CPU backend. I had created it as a demo and never really intended to deploy it anywhere. Anyway, considering how much interest it has triggered lately, I fixed it: https://github.com/thammegowda/nllb-serve/commit/846541b022ff2573e4fcdaeaae3d695d829389aa
Please check whether the latest code works, and close this issue if it is fixed or let me know if there are more problems.
Yes, it is working as expected and now leverages the GPU for the REST API as well.
And the API is super fast compared to before. Love it!
I have this set up on Google Colab with a GPU-accelerated runtime, but it doesn't seem to leverage the GPU when handling a POST request.
Is my understanding correct that it uses CUDA/GPU only in batch mode, when run from the console?
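
For anyone checking the same thing on their own runtime, here is a minimal sketch of how the inference device is typically chosen, assuming the server runs the model with PyTorch. This is not the code from the commit linked in this thread; `pick_device` is a hypothetical helper name used only for illustration.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and can see a GPU, else "cpu".

    Hypothetical helper for illustration; not part of nllb-serve.
    """
    # Fall back to CPU when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"inference device: {device}")

# With PyTorch, both the model and its input tensors must be moved to the
# chosen device, otherwise inference silently stays on the CPU, e.g.:
#   model = model.to(device)
#   inputs = {name: tensor.to(device) for name, tensor in inputs.items()}
```

If this prints `cpu` on a Colab GPU runtime, the runtime itself is not exposing the GPU. If it prints `cuda` but requests are still slow, the model or its inputs were probably never moved to the device in the request path.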