Closed: aedocw closed this 9 months ago
How do I tell if DeepSpeed is being properly utilized? Do I infer it from the estimated time?
There will be some additional output before text is sent to be read; it should look something like this:
Loading model: /home/doc/.local/share/tts/adamwhite
> tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
> Using model: xtts
[2024-01-03 07:15:48,400] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-03 07:15:48,644] [INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.12.6, git-hash=unknown, git-branch=unknown
[2024-01-03 07:15:48,645] [WARNING] [config_utils.py:69:_process_deprecated_field] Config parameter replace_method is deprecated. This parameter is no longer needed, please remove from your call to DeepSpeed-inference
Then, right before the first reading block, you will see this:
Computing speaker latents...
Reading from 4 to 23
0%| | 0/5 [00:00<?, ?it/s]
------------------------------------------------------
Free memory : 3.794922 (GigaBytes)
Total memory: 7.999512 (GigaBytes)
Requested memory: 0.335938 (GigaBytes)
Setting maximum total tokens (input + output) to 1024
WorkSpace: 0x7d3400000
------------------------------------------------------
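If you want to check programmatically rather than by eye, one option is to capture the program's output and scan it for DeepSpeed's startup messages. A minimal sketch (the helper name and marker strings are illustrative, taken from the log excerpts above):

```python
# Hypothetical helper: scan captured stdout/stderr text for DeepSpeed's
# startup messages to confirm it was actually initialized.
def deepspeed_was_initialized(log_text: str) -> bool:
    markers = (
        "Setting ds_accelerator to cuda",  # accelerator auto-detect line
        "DeepSpeed info: version=",        # version banner from log_dist
    )
    return any(marker in log_text for marker in markers)

sample = "[INFO] [logging.py:96:log_dist] [Rank -1] DeepSpeed info: version=0.12.6"
print(deepspeed_was_initialized(sample))  # → True
```

If neither marker appears in the output, DeepSpeed was almost certainly not loaded and inference is running on the plain PyTorch path.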
Since November '23, the Coqui XTTS model has supported DeepSpeed. It would be great to add it; it can give a 3x-4x speed improvement!
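For reference, enabling DeepSpeed in Coqui TTS comes down to a single flag on the XTTS checkpoint loader. A minimal sketch, assuming the `TTS` package is installed with a CUDA GPU available (the file paths are placeholders):

```python
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts

# Load the XTTS v2 config and checkpoint, enabling DeepSpeed inference.
config = XttsConfig()
config.load_json("/path/to/xtts/config.json")  # placeholder path
model = Xtts.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", use_deepspeed=True)
model.cuda()

# Compute speaker latents from a reference clip, then synthesize.
gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(
    audio_path=["/path/to/reference.wav"]
)
out = model.inference(
    "Hello world.",
    "en",
    gpt_cond_latent,
    speaker_embedding,
)
```

With `use_deepspeed=True`, the DeepSpeed startup banners shown above should appear during model loading; without the flag, they do not.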