pytorch / serve

Serve, optimize and scale PyTorch models in production
https://pytorch.org/serve/
Apache License 2.0

When I deploy the model with TorchServe, why is the GPU memory usage much larger than when loading the model directly with PyTorch? The workers and batch size have both been set to 1. #1992

Closed · Git-TengSun closed this issue 2 years ago

msaroufim commented 2 years ago

What's the exact overhead you're seeing? The overhead might be attributed to metric_collector.py, which frequently queries the GPU.
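
One way to put a number on the overhead is to compare the memory PyTorch itself has allocated and reserved against the total the driver reports through nvidia-smi; the gap covers the CUDA context plus anything other processes (such as the TorchServe frontend and metrics workers) are holding. Below is a minimal sketch along those lines, assuming a single-GPU machine with nvidia-smi on the PATH; the helper names are illustrative, not part of TorchServe.

```python
import subprocess

import torch


def pytorch_gpu_memory_mb(device: int = 0):
    """Memory PyTorch has allocated and reserved on `device`, in MB."""
    allocated = torch.cuda.memory_allocated(device) / 1024**2
    reserved = torch.cuda.memory_reserved(device) / 1024**2
    return allocated, reserved


def nvidia_smi_used_mb(device: int = 0) -> float:
    """Total memory the driver reports as used on `device`, in MB."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            f"--id={device}",
            "--query-gpu=memory.used",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return float(out.strip())


if __name__ == "__main__":
    allocated, reserved = pytorch_gpu_memory_mb()
    driver_used = nvidia_smi_used_mb()
    print(f"PyTorch allocated: {allocated:.1f} MB")
    print(f"PyTorch reserved:  {reserved:.1f} MB")
    print(f"Driver reported:   {driver_used:.1f} MB")
    # Anything not reserved by this process is CUDA context overhead
    # plus memory held by other processes on the same GPU.
    print(f"Unaccounted: {driver_used - reserved:.1f} MB")
```

Running the same check inside the TorchServe handler and in a standalone PyTorch script makes it easier to see whether the difference comes from the model itself or from the serving process.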