rhasspy / piper

A fast, local neural text to speech system
https://rhasspy.github.io/piper-samples/
MIT License
4.38k stars, 297 forks

High inference memory usage #484

Open siddhatiwari opened 2 weeks ago

siddhatiwari commented 2 weeks ago

If a piper HTTP server comes under heavy load, GPU memory usage can spike by multiple GB and stay high until the server is stopped. Requests sometimes fail with out-of-memory (OOM) errors when memory usage grows too much.

I'm not sure if these are bugs or expected behaviors:

To reproduce, start the http_server and send it a high rate of requests per second:

    python3 -m piper.http_server -m en_US-lessac-medium --cuda --port 6000
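For anyone trying to reproduce this, here is a rough load-generation sketch using only the Python standard library. It assumes the server started with the command above is reachable at `http://localhost:6000/` and accepts the text to synthesize as the body of a POST request to `/`. The names `synthesize` and `run_load`, the concurrency of 32, and the 1000 test sentences are my own illustrative choices, not anything from piper itself.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: the server listens on localhost:6000 and takes the text
# to synthesize as a POST body at "/". Adjust if your setup differs.
URL = "http://localhost:6000/"


def synthesize(text, url=URL, timeout=60.0):
    """POST one text to the server; return the response size in bytes."""
    req = urllib.request.Request(url, data=text.encode("utf-8"), method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return len(resp.read())


def run_load(texts, concurrency=32, send=synthesize):
    """Send every text concurrently; return (successes, failures)."""

    def attempt(text):
        try:
            send(text)
            return True
        except Exception:  # e.g. an HTTP error if the server runs out of memory
            return False

    # Hypothetical concurrency level; raise it to put more pressure on the GPU.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(attempt, texts))
    ok = sum(results)
    return ok, len(results) - ok


if __name__ == "__main__":
    texts = [f"This is load test sentence number {i}." for i in range(1000)]
    ok, failed = run_load(texts)
    print(f"succeeded={ok} failed={failed}")
```

Running `nvidia-smi` in another terminal while this script runs should show whether GPU memory climbs and whether it drops again once the requests finish.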