jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
https://jaywalnut310.github.io/vits-demo/index.html
MIT License

Training time too long #184

Open newton2149 opened 10 months ago

newton2149 commented 10 months ago
Loading train data:   6%|█▊                          | 25/395 [01:08<08:03,  1.31s/it]
INFO:ljs_base:Train Epoch: 6 [6%]
INFO:ljs_base:[2.675187349319458, 1.905822992324829, 2.1355035305023193, 27.08325958251953, 1.6539193391799927, 1.324233889579773, 2000, 0.0001997751124671936]
DEBUG:matplotlib:matplotlib data path: /opt/conda/envs/vits/lib/python3.11/site-packages/matplotlib/mpl-data
DEBUG:matplotlib:CONFIGDIR=/home/ubuntu/.config/matplotlib
DEBUG:matplotlib:interactive is False
DEBUG:matplotlib:platform is linux
INFO:ljs_base:Saving model and optimizer state at iteration 6 to ./logs/ljs_base/G_2000.pth
INFO:ljs_base:Saving model and optimizer state at iteration 6 to ./logs/ljs_base/D_2000.pth
Loading train data:  26%|███████                    | 103/395 [02:50<04:48,  1.01it/s]

I am trying to train on the LJ Speech dataset on an AWS EC2 instance.

On a p3.2xlarge instance, training is very slow (roughly 1 to 1.3 seconds per iteration, as the log above shows). Is there any optimization I can apply?

Also, how long did training take for you?
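
To put the numbers in the log in perspective, here is a rough back-of-the-envelope sketch that converts the two progress-bar rates (1.31 s/it and 1.01 it/s, with 395 iterations per epoch) into time per epoch and a projected total. The `target_steps` value and the helper name `hours_for_steps` are hypothetical placeholders for illustration, not figures or functions from the repository; substitute whatever step count you actually plan to train for.

```python
# Figures read directly from the progress bars in the log above.
STEPS_PER_EPOCH = 395
SEC_PER_STEP_SLOW = 1.31        # "1.31s/it" at 6% of the epoch
SEC_PER_STEP_FAST = 1 / 1.01    # "1.01it/s" at 26% of the epoch

# ASSUMPTION: hypothetical training budget, chosen only for illustration.
target_steps = 100_000


def hours_for_steps(steps: int, seconds_per_step: float) -> float:
    """Wall-clock hours needed for `steps` iterations at a fixed rate."""
    return steps * seconds_per_step / 3600.0


# Minutes per epoch at the slower and faster observed rates.
epoch_min_slow = STEPS_PER_EPOCH * SEC_PER_STEP_SLOW / 60.0
epoch_min_fast = STEPS_PER_EPOCH * SEC_PER_STEP_FAST / 60.0

# Projected total time for the hypothetical step budget.
total_h_slow = hours_for_steps(target_steps, SEC_PER_STEP_SLOW)
total_h_fast = hours_for_steps(target_steps, SEC_PER_STEP_FAST)

print(f"Per epoch: {epoch_min_fast:.1f} to {epoch_min_slow:.1f} minutes")
print(f"{target_steps} steps: {total_h_fast:.1f} to {total_h_slow:.1f} hours")
```

The estimate assumes the per-iteration rate stays constant, which the log itself suggests is not quite true (the rate changes within a single epoch), so treat the result as an order-of-magnitude guide only.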