coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

Why does XTTS v2 inference use double (or even 3x) the RAM compared to its GPU VRAM? #3976

Open saiful9379 opened 3 months ago

saiful9379 commented 3 months ago

[screenshot: xtts_issue]

Describe the bug

When the model is loaded, it requires close to 5 GB of RAM while using only 2.1 GB of VRAM. How can I reduce the RAM usage when loading the model for inference? Basically, I am trying to figure out what causes the extra RAM consumption. I found that initializing the GPT block alone uses close to 5 GB of RAM, and this is system RAM, not GPU memory.
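
For reference, a minimal way to attribute RAM to each loading step (a sketch assuming psutil; RSS deltas are approximate, and the wrapped stage name below is a placeholder, not the real TTS call):

```python
import os

import psutil

_proc = psutil.Process(os.getpid())

def ram_delta(label, fn, *args, **kwargs):
    """Run fn and print how much resident memory (MB) it added."""
    before = _proc.memory_info().rss
    result = fn(*args, **kwargs)
    after = _proc.memory_info().rss
    print(f"{label}: +{(after - before) / 1024 ** 2:.1f} MB RAM")
    return result

# Wrap each stage of model loading to see which one accounts for the jump;
# build_gpt_block is a placeholder for the real construction call:
# gpt = ram_delta("init GPT block", build_gpt_block)
```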

To Reproduce

Inference RAM usage: 4634.79 MB (~4.5 GB)
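
A minimal script that reproduces the measurement (a sketch assuming the standard `TTS.api` entry point and psutil; `reference.wav` is a placeholder speaker clip):

```python
import os

import psutil
import torch
from TTS.api import TTS

proc = psutil.Process(os.getpid())

def rss_mb():
    """Current resident set size (system RAM) of this process, in MB."""
    return proc.memory_info().rss / 1024 ** 2

print(f"RAM before load: {rss_mb():.1f} MB")

# Loading XTTS v2 is where system RAM jumps to ~5 GB
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to("cuda")

print(f"RAM after load: {rss_mb():.1f} MB")
print(f"VRAM allocated: {torch.cuda.memory_allocated() / 1024 ** 2:.1f} MB")

tts.tts_to_file(
    text="Hello world.",
    speaker_wav="reference.wav",  # placeholder speaker reference clip
    language="en",
    file_path="out.wav",
)

print(f"RAM after inference: {rss_mb():.1f} MB")
```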

Expected behavior

Lower RAM usage during inference, ideally close to the model's VRAM footprint.
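
One pattern that often explains a roughly 2x RAM footprint is that `torch.load` deserializes the checkpoint to CPU by default, so the state dict and the freshly initialized module briefly coexist in system RAM. A hedged sketch of the general PyTorch workaround (not the TTS loading internals; `build_model` and the checkpoint path are placeholders):

```python
import gc

import torch
import torch.nn as nn

def build_model() -> nn.Module:
    # Placeholder for the real model constructor (e.g. the XTTS GPT block)
    return nn.Linear(1024, 1024)

# Deserialize the checkpoint straight onto the GPU instead of into CPU RAM
state_dict = torch.load("model.pth", map_location="cuda")  # path is a placeholder

model = build_model().to("cuda")
model.load_state_dict(state_dict)

del state_dict          # drop the duplicate copy of the weights
gc.collect()
torch.cuda.empty_cache()
```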

Logs

No response

Environment

- python==3.10
- torch                     2.2.1+cu121              pypi_0    pypi
- torchaudio                2.2.1+cu121              pypi_0    pypi
- deepspeed                 0.10.3                   pypi_0    pypi

Additional context

No response

stale[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.