jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
https://jaywalnut310.github.io/vits-demo/index.html
MIT License

RTX4090 training is very slow. Is there something wrong with my parameters? #197

Open Tsangchi-Lam opened 7 months ago

Tsangchi-Lam commented 7 months ago

Hello @jaywalnut310. My environment: Ubuntu 20.04, RTX 4090, torch=1.7.1+cu110, torchvision=0.8.2+cu110. I am using the project https://github.com/CjangCjengh/vits to train Chinese and English models. There are 4 speakers with 1,000 samples each, 4,000 samples in total. The "train" section of config.json is set as follows:

"train": {
  "log_interval": 200,
  "eval_interval": 1000,
  "seed": 1234,
  "epochs": 10000,
  "learning_rate": 2e-4,
  "betas": [0.8, 0.99],
  "eps": 1e-9,
  "batch_size": 72,
  "fp16_run": true,
  "lr_decay": 0.999875,
  "segment_size": 8192,
  "init_lr_ratio": 1,
  "warmup_epochs": 0,
  "c_mel": 45,
  "c_kl": 1.0
}

One epoch takes about 2 minutes, and GPU memory usage is currently 22 GB. Which parameters should I tune? At this rate, wouldn't 10,000 epochs take about 14 days?
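The 14-day estimate can be checked with a little arithmetic. The sketch below is not part of the training code; the helper names `projected_days` and `mean_epoch_seconds` are made up for illustration. It uses the 2 minutes per epoch quoted above, and also the "====> Epoch" timestamps from the log posted in this thread, assuming the epoch time stays roughly constant for the whole run.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # e.g. "2023-12-07 17:31:16,886"


def projected_days(seconds_per_epoch: float, epochs: int) -> float:
    """Wall-clock days needed for `epochs` epochs at a constant epoch time."""
    return seconds_per_epoch * epochs / SECONDS_PER_DAY


def mean_epoch_seconds(first_end: str, last_end: str, epochs_between: int) -> float:
    """Average epoch duration between two '====> Epoch' log timestamps."""
    start = datetime.strptime(first_end, LOG_TIME_FORMAT)
    end = datetime.strptime(last_end, LOG_TIME_FORMAT)
    return (end - start).total_seconds() / epochs_between


if __name__ == "__main__":
    # Figure quoted in the post: 2 minutes per epoch, 10,000 epochs.
    print(f"at 120 s/epoch: {projected_days(120, 10_000):.1f} days")

    # Measured from the log: end of epoch 1 to end of epoch 86 (85 epochs).
    measured = mean_epoch_seconds(
        "2023-12-07 17:31:16,886", "2023-12-07 19:51:45,527", 85
    )
    print(f"measured: {measured:.1f} s/epoch, "
          f"{projected_days(measured, 10_000):.1f} days")
```

At 120 s per epoch the run comes to about 13.9 days. The logged timestamps average closer to 99 s per epoch, which gives roughly 11.5 days for all 10,000 epochs.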

Tsangchi-Lam commented 7 months ago

2023-12-07 17:29:50,116 Model INFO Train Epoch: 1 [0%]
2023-12-07 17:29:50,117 Model INFO [6.107111930847168, 6.105045318603516, 0.269593209028244, 96.44754791259766, 1.584683895111084, 198.7476806640625, 0, 0.0002]
2023-12-07 17:29:58,511 Model INFO Saving model and optimizer state at iteration 1 to ../drive/MyDrive/Model/G_0.pth
2023-12-07 17:29:59,045 Model INFO Saving model and optimizer state at iteration 1 to ../drive/MyDrive/Model/D_0.pth
2023-12-07 17:31:16,886 Model INFO ====> Epoch: 1
2023-12-07 17:32:59,032 Model INFO ====> Epoch: 2
2023-12-07 17:34:39,073 Model INFO ====> Epoch: 3
2023-12-07 17:35:38,354 Model INFO Train Epoch: 4 [45%]
2023-12-07 17:35:38,356 Model INFO [2.7977004051208496, 2.054680585861206, 1.8771229982376099, 33.61065673828125, 1.5511101484298706, 1.837897539138794, 200, 0.00019992500937460937]
2023-12-07 17:36:19,756 Model INFO ====> Epoch: 4
2023-12-07 17:38:00,255 Model INFO ====> Epoch: 5
2023-12-07 17:39:40,198 Model INFO ====> Epoch: 6
2023-12-07 17:41:12,560 Model INFO Train Epoch: 7 [90%]
2023-12-07 17:41:12,561 Model INFO [2.2689309120178223, 2.4255552291870117, 3.8163418769836426, 31.03314781188965, 1.730384349822998, 1.6757980585098267, 400, 0.0001998500468671882]
2023-12-07 17:41:20,501 Model INFO ====> Epoch: 7
2023-12-07 17:43:00,620 Model INFO ====> Epoch: 8
2023-12-07 17:44:40,189 Model INFO ====> Epoch: 9
2023-12-07 17:46:19,541 Model INFO ====> Epoch: 10
2023-12-07 17:47:11,270 Model INFO Train Epoch: 11 [34%]
2023-12-07 17:47:11,272 Model INFO [2.4115347862243652, 2.200401782989502, 2.7906928062438965, 28.269241333007812, 1.5518678426742554, 1.650560736656189, 600, 0.00019975014057813518]
2023-12-07 17:48:00,257 Model INFO ====> Epoch: 11
2023-12-07 17:49:39,658 Model INFO ====> Epoch: 12
2023-12-07 17:51:19,625 Model INFO ====> Epoch: 13
2023-12-07 17:52:44,871 Model INFO Train Epoch: 14 [79%]
2023-12-07 17:52:44,872 Model INFO [2.591888904571533, 1.8898423910140991, 2.6741747856140137, 26.89133644104004, 1.5423401594161987, 1.4998509883880615, 800, 0.00019967524363831608]
2023-12-07 17:53:00,779 Model INFO ====> Epoch: 14
2023-12-07 17:54:40,075 Model INFO ====> Epoch: 15
2023-12-07 17:56:19,358 Model INFO ====> Epoch: 16
2023-12-07 17:57:59,285 Model INFO ====> Epoch: 17
2023-12-07 17:58:43,007 Model INFO Train Epoch: 18 [24%]
2023-12-07 17:58:43,008 Model INFO [2.77396559715271, 1.8405945301055908, 1.951233148574829, 22.942886352539062, 1.5631026029586792, 1.6715075969696045, 1000, 0.00019957542473449108]
2023-12-07 17:58:49,355 Model INFO Saving model and optimizer state at iteration 18 to ../drive/MyDrive/Model/G_1000.pth
2023-12-07 17:58:50,379 Model INFO Saving model and optimizer state at iteration 18 to ../drive/MyDrive/Model/D_1000.pth
2023-12-07 17:59:47,579 Model INFO ====> Epoch: 18
2023-12-07 18:01:27,230 Model INFO ====> Epoch: 19
2023-12-07 18:03:06,809 Model INFO ====> Epoch: 20
2023-12-07 18:04:22,427 Model INFO Train Epoch: 21 [69%]
2023-12-07 18:04:22,428 Model INFO [2.576815366744995, 1.6057857275009155, 2.403277635574341, 22.191165924072266, 1.5969487428665161, 1.3883650302886963, 1200, 0.00019950059330492385]
2023-12-07 18:04:45,199 Model INFO ====> Epoch: 21
2023-12-07 18:06:23,803 Model INFO ====> Epoch: 22
2023-12-07 18:08:01,745 Model INFO ====> Epoch: 23
2023-12-07 18:09:39,967 Model INFO ====> Epoch: 24
2023-12-07 18:10:15,754 Model INFO Train Epoch: 25 [14%]
2023-12-07 18:10:15,756 Model INFO [2.695051670074463, 1.989531397819519, 2.346193552017212, 23.01602554321289, 1.5502030849456787, 1.5479803085327148, 1400, 0.00019940086170989343]
2023-12-07 18:11:18,897 Model INFO ====> Epoch: 25
2023-12-07 18:12:57,246 Model INFO ====> Epoch: 26
2023-12-07 18:14:34,954 Model INFO ====> Epoch: 27
2023-12-07 18:15:42,187 Model INFO Train Epoch: 28 [59%]
2023-12-07 18:15:42,189 Model INFO [2.7265591621398926, 1.9561548233032227, 2.1431448459625244, 22.783016204833984, 1.547501802444458, 1.410009741783142, 1600, 0.00019932609573327815]
2023-12-07 18:16:12,237 Model INFO ====> Epoch: 28
2023-12-07 18:17:49,141 Model INFO ====> Epoch: 29
2023-12-07 18:19:24,823 Model INFO ====> Epoch: 30
2023-12-07 18:21:02,385 Model INFO ====> Epoch: 31
2023-12-07 18:21:29,169 Model INFO Train Epoch: 32 [3%]
2023-12-07 18:21:29,170 Model INFO [2.6338438987731934, 2.0441737174987793, 2.4130072593688965, 21.223384857177734, 1.568681001663208, 1.3795946836471558, 1800, 0.00019922645137067577]
2023-12-07 18:22:39,832 Model INFO ====> Epoch: 32
2023-12-07 18:24:18,001 Model INFO ====> Epoch: 33
2023-12-07 18:25:56,822 Model INFO ====> Epoch: 34
2023-12-07 18:26:57,464 Model INFO Train Epoch: 35 [48%]
2023-12-07 18:26:57,466 Model INFO [2.880666494369507, 1.6881558895111084, 1.9384945631027222, 21.842121124267578, 1.6990931034088135, 1.5140010118484497, 2000, 0.00019915175078976256]
2023-12-07 18:27:03,708 Model INFO Saving model and optimizer state at iteration 35 to ../drive/MyDrive/Model/G_2000.pth
2023-12-07 18:27:04,680 Model INFO Saving model and optimizer state at iteration 35 to ../drive/MyDrive/Model/D_2000.pth
2023-12-07 18:27:43,446 Model INFO ====> Epoch: 35
2023-12-07 18:29:21,761 Model INFO ====> Epoch: 36
2023-12-07 18:31:00,884 Model INFO ====> Epoch: 37
2023-12-07 18:32:32,481 Model INFO Train Epoch: 38 [93%]
2023-12-07 18:32:32,483 Model INFO [2.6397671699523926, 2.1484880447387695, 2.4066004753112793, 20.838167190551758, 1.585195541381836, 1.3150246143341064, 2200, 0.0001990770782180657]
2023-12-07 18:32:37,806 Model INFO ====> Epoch: 38
2023-12-07 18:34:16,107 Model INFO ====> Epoch: 39
2023-12-07 18:35:54,058 Model INFO ====> Epoch: 40
2023-12-07 18:37:31,412 Model INFO ====> Epoch: 41
2023-12-07 18:38:24,803 Model INFO Train Epoch: 42 [38%]
2023-12-07 18:38:24,804 Model INFO [2.743607521057129, 1.878671407699585, 2.666090965270996, 22.306901931762695, 1.6467288732528687, 1.5791008472442627, 2400, 0.0001989775583408775]
2023-12-07 18:39:10,448 Model INFO ====> Epoch: 42
2023-12-07 18:40:48,494 Model INFO ====> Epoch: 43
2023-12-07 18:42:26,751 Model INFO ====> Epoch: 44
2023-12-07 18:43:53,034 Model INFO Train Epoch: 45 [83%]
2023-12-07 18:43:53,035 Model INFO [2.631840705871582, 1.9763622283935547, 2.6052465438842773, 21.16794204711914, 1.568403720855713, 1.5238838195800781, 2600, 0.00019890295108318404]
2023-12-07 18:44:05,703 Model INFO ====> Epoch: 45
2023-12-07 18:45:43,222 Model INFO ====> Epoch: 46
2023-12-07 18:47:21,077 Model INFO ====> Epoch: 47
2023-12-07 18:48:59,074 Model INFO ====> Epoch: 48
2023-12-07 18:49:44,550 Model INFO Train Epoch: 49 [28%]
2023-12-07 18:49:44,552 Model INFO [2.824472427368164, 1.8754892349243164, 2.1163816452026367, 20.517248153686523, 1.6380770206451416, 1.4093304872512817, 2800, 0.00019880351825324018]
2023-12-07 18:50:37,368 Model INFO ====> Epoch: 49
2023-12-07 18:52:14,651 Model INFO ====> Epoch: 50
2023-12-07 18:53:52,570 Model INFO ====> Epoch: 51
2023-12-07 18:55:10,727 Model INFO Train Epoch: 52 [72%]
2023-12-07 18:55:10,729 Model INFO [2.6376609802246094, 1.881882905960083, 2.2403390407562256, 21.22414207458496, 1.6077402830123901, 1.482690691947937, 3000, 0.00019872897625242182]
2023-12-07 18:55:17,149 Model INFO Saving model and optimizer state at iteration 52 to ../drive/MyDrive/Model/G_3000.pth
2023-12-07 18:55:18,094 Model INFO Saving model and optimizer state at iteration 52 to ../drive/MyDrive/Model/D_3000.pth
2023-12-07 18:55:38,389 Model INFO ====> Epoch: 52
2023-12-07 18:57:16,550 Model INFO ====> Epoch: 53
2023-12-07 18:58:54,215 Model INFO ====> Epoch: 54
2023-12-07 19:00:32,366 Model INFO ====> Epoch: 55
2023-12-07 19:01:10,752 Model INFO Train Epoch: 56 [17%]
2023-12-07 19:01:10,754 Model INFO [2.686392068862915, 1.9890084266662598, 2.2291738986968994, 19.04257583618164, 1.3262228965759277, 1.1346114873886108, 3200, 0.00019862963039358455]
2023-12-07 19:02:11,933 Model INFO ====> Epoch: 56
2023-12-07 19:03:50,132 Model INFO ====> Epoch: 57
2023-12-07 19:05:30,888 Model INFO ====> Epoch: 58
2023-12-07 19:06:42,563 Model INFO Train Epoch: 59 [62%]
2023-12-07 19:06:42,564 Model INFO [2.687032461166382, 1.906739354133606, 2.4065237045288086, 20.357139587402344, 1.3602961301803589, 1.2948116064071655, 3400, 0.0001985551535925629]
2023-12-07 19:07:10,816 Model INFO ====> Epoch: 59
2023-12-07 19:08:50,384 Model INFO ====> Epoch: 60
2023-12-07 19:10:28,977 Model INFO ====> Epoch: 61
2023-12-07 19:12:07,249 Model INFO ====> Epoch: 62
2023-12-07 19:12:36,425 Model INFO Train Epoch: 63 [7%]
2023-12-07 19:12:36,426 Model INFO [2.697983741760254, 2.055757999420166, 2.4838271141052246, 19.49764060974121, 1.3826496601104736, 1.3482624292373657, 3600, 0.00019845589462876104]
2023-12-07 19:13:45,168 Model INFO ====> Epoch: 63
2023-12-07 19:15:23,711 Model INFO ====> Epoch: 64
2023-12-07 19:17:02,358 Model INFO ====> Epoch: 65
2023-12-07 19:18:06,132 Model INFO Train Epoch: 66 [52%]
2023-12-07 19:18:06,133 Model INFO [2.775587797164917, 1.9706124067306519, 2.2809641361236572, 18.911941528320312, 1.3059728145599365, 1.311688780784607, 3800, 0.00019838148297050769]
2023-12-07 19:18:41,413 Model INFO ====> Epoch: 66
2023-12-07 19:20:19,845 Model INFO ====> Epoch: 67
2023-12-07 19:21:58,199 Model INFO ====> Epoch: 68
2023-12-07 19:23:35,589 Model INFO Train Epoch: 69 [97%]
2023-12-07 19:23:35,591 Model INFO [2.859001398086548, 2.2468442916870117, 2.420959234237671, 20.528076171875, 1.4777560234069824, 1.2271548509597778, 4000, 0.0001983070992131383]
2023-12-07 19:23:42,057 Model INFO Saving model and optimizer state at iteration 69 to ../drive/MyDrive/Model/G_4000.pth
2023-12-07 19:23:43,223 Model INFO Saving model and optimizer state at iteration 69 to ../drive/MyDrive/Model/D_4000.pth
2023-12-07 19:23:46,231 Model INFO ====> Epoch: 69
2023-12-07 19:25:26,455 Model INFO ====> Epoch: 70
2023-12-07 19:27:06,240 Model INFO ====> Epoch: 71
2023-12-07 19:28:45,809 Model INFO ====> Epoch: 72
2023-12-07 19:29:42,461 Model INFO Train Epoch: 73 [41%]
2023-12-07 19:29:42,463 Model INFO [2.7389588356018066, 1.9250552654266357, 2.162569999694824, 17.6241455078125, 1.3003779649734497, 1.1594269275665283, 4200, 0.00019820796425327303]
2023-12-07 19:30:26,175 Model INFO ====> Epoch: 73
2023-12-07 19:32:04,100 Model INFO ====> Epoch: 74
2023-12-07 19:33:42,472 Model INFO ====> Epoch: 75
2023-12-07 19:35:11,770 Model INFO Train Epoch: 76 [86%]
2023-12-07 19:35:11,772 Model INFO [2.7998061180114746, 1.7896229028701782, 2.256770610809326, 19.311044692993164, 1.2836675643920898, 1.1115658283233643, 4400, 0.00019813364555728923]
2023-12-07 19:35:22,533 Model INFO ====> Epoch: 76
2023-12-07 19:37:01,254 Model INFO ====> Epoch: 77
2023-12-07 19:38:39,568 Model INFO ====> Epoch: 78
2023-12-07 19:40:17,990 Model INFO ====> Epoch: 79
2023-12-07 19:41:06,359 Model INFO Train Epoch: 80 [31%]
2023-12-07 19:41:06,361 Model INFO [2.764493465423584, 1.7720656394958496, 2.2831947803497314, 18.347604751586914, 1.290836215019226, 1.1505413055419922, 4600, 0.00019803459730799195]
2023-12-07 19:41:57,394 Model INFO ====> Epoch: 80
2023-12-07 19:43:36,267 Model INFO ====> Epoch: 81
2023-12-07 19:45:15,289 Model INFO ====> Epoch: 82
2023-12-07 19:46:35,365 Model INFO Train Epoch: 83 [76%]
2023-12-07 19:46:35,366 Model INFO [2.792917251586914, 1.9532325267791748, 2.327437400817871, 17.847885131835938, 1.2713555097579956, 0.9344819188117981, 4800, 0.0001979603436164864]
2023-12-07 19:46:53,460 Model INFO ====> Epoch: 83
2023-12-07 19:48:30,777 Model INFO ====> Epoch: 84
2023-12-07 19:50:07,593 Model INFO ====> Epoch: 85
2023-12-07 19:51:45,527 Model INFO ====> Epoch: 86
2023-12-07 19:52:24,096 Model INFO Train Epoch: 87 [21%]
2023-12-07 19:52:24,098 Model INFO [2.758699655532837, 1.918060302734375, 2.498077154159546, 19.966066360473633, 1.3230218887329102, 1.1922554969787598, 5000, 0.0001978613820019138]
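For reading the log: the last two entries of each bracketed list are evidently the global step (0, 200, 400, ...) and the current learning rate. The first six are presumably the loss terms in the order the training script collects them, though that is an assumption here. The learning-rate column is consistent with a per-epoch exponential decay using `learning_rate` and `lr_decay` from the config. The sketch below checks this against two logged values and estimates the number of optimizer steps per epoch. The function names are hypothetical, and the bracketed percentage in the log is rounded, so the step estimate is approximate.

```python
import math


def lr_at_epoch(epoch: int, base_lr: float = 2e-4, decay: float = 0.999875) -> float:
    """Learning rate during `epoch` (1-based), decayed once per completed epoch."""
    return base_lr * decay ** (epoch - 1)


def steps_per_epoch(epoch: int, percent: float, global_step: int) -> float:
    """Estimate steps per epoch from one 'Train Epoch: E [P%]' log line.

    `epoch - 1` full epochs plus `percent`/100 of the current one have
    produced `global_step` optimizer steps so far.
    """
    return global_step / ((epoch - 1) + percent / 100.0)


if __name__ == "__main__":
    # Logged values: epoch 4 -> 0.00019992500937460937,
    #                epoch 87 -> 0.0001978613820019138
    for epoch, logged in [(4, 0.00019992500937460937), (87, 0.0001978613820019138)]:
        ok = math.isclose(lr_at_epoch(epoch), logged, rel_tol=1e-9)
        print(f"epoch {epoch}: computed {lr_at_epoch(epoch):.12e}, matches log: {ok}")

    # "Train Epoch: 87 [21%]" was logged at global step 5000.
    print(f"steps per epoch ~ {steps_per_epoch(87, 21, 5000):.1f}")
```

The estimate comes out at about 58 steps per epoch. Combined with an epoch time of roughly 100 s, that is on the order of 1.7 s per step at batch size 72.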

hanshounsu commented 4 months ago

As far as I know, the 4090 is not supported by CUDA 11.0. As a 4090 user, I'm facing a similar problem. Have you managed to solve it?
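One way to check this on a given machine is to compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The RTX 4090 reports compute capability 8.9 and, as far as I know, CUDA 11.8 was the first toolkit release able to target it natively. The sketch below is only a rough diagnostic. `classify_arch_support` is a hypothetical helper, and the "same-major fallback" label rests on the assumption that a build containing only older `sm_8x` kernels may still run on the card, without kernels compiled for Ada.

```python
import re
from typing import Iterable, Tuple


def classify_arch_support(capability: Tuple[int, int], arch_list: Iterable[str]) -> str:
    """Classify how well a build's compiled architectures cover a device.

    capability: (major, minor), e.g. (8, 9) for an RTX 4090.
    arch_list:  names such as "sm_80", in the style of torch.cuda.get_arch_list().
    """
    major, minor = capability
    device_sm = major * 10 + minor
    built_sm = []
    for arch in arch_list:
        match = re.match(r"^sm_(\d+)", arch)  # ignore "compute_XX" (PTX) entries
        if match:
            built_sm.append(int(match.group(1)))
    if device_sm in built_sm:
        return "native"
    if any(sm // 10 == major and sm <= device_sm for sm in built_sm):
        return "same-major fallback"
    return "unsupported"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("torch", torch.__version__, "built with CUDA", torch.version.cuda)
        print("device capability:", cap, "compiled for:", archs)
        print("support:", classify_arch_support(cap, archs))
    else:
        print("PyTorch with a visible CUDA device is required for the live check.")
```

If the result is anything other than "native", installing a PyTorch build compiled against CUDA 11.8 or newer is the first thing I would try before tuning the training parameters.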