OlaWod / FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Slow training speed #61

Closed lsw5835 closed 1 year ago

lsw5835 commented 1 year ago

Hi, is there a reason why training speed decreases when the amount of training data increases?

I understand that "train_loader.batch_sampler.set_epoch(epoch)" is affected by the dataset size, but I find it hard to see why fetching batches from the train loader (in "for batch_idx, items in enumerate(train_loader):") would also slow down. Is there any other reason for the slowdown?

Thanks very much.
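For reference, a minimal sketch of the loop structure being discussed, assuming a standard PyTorch DistributedSampler (FreeVC actually uses a bucket batch sampler, but the epoch/step split is the same); the dataset and sizes here are illustrative:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.randn(1000, 80))   # dummy features
sampler = DistributedSampler(dataset, num_replicas=1, rank=0, shuffle=True)
train_loader = DataLoader(dataset, batch_size=32, sampler=sampler, num_workers=2)

for epoch in range(2):
    # set_epoch() only records the epoch so the shuffle order changes;
    # the reshuffle itself happens once per epoch, not once per step.
    sampler.set_epoch(epoch)
    for batch_idx, items in enumerate(train_loader):
        # the cost of each iteration depends on batch size and on how much
        # work __getitem__ does per sample, not on the total dataset size
        pass
```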

OlaWod commented 1 year ago

do you mean the convergence speed? more data means more features to learn, just like training a multi-speaker TTS takes more time than a single-speaker TTS. the data complexity can also affect speed, for example highly emotional data.

lsw5835 commented 1 year ago

I mean training speed. I fully understand what you said, but it currently takes about 30 minutes to train 200 global steps. There are 400,000 samples, and the 24k model is trained from scratch on one 3090 GPU.
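For scale, a quick back-of-the-envelope check of that number (my arithmetic, not from the thread):

```python
# 30 minutes for 200 global steps is roughly 9 seconds per optimization step
seconds_per_step = 30 * 60 / 200
print(seconds_per_step)   # 9.0
```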

OlaWod commented 1 year ago

normally the training speed per batch won't be affected by dataset size. currently i tend to think there may be something wrong in your training code? for example, synthesizing input features (e.g. spectrogram, SSL features) on the fly instead of loading them from disk.
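As a rough sketch of the difference being described (not FreeVC's actual loader; the cache-file naming is hypothetical), a dataset that caches its spectrograms to disk only pays the STFT cost once, while a purely on-the-fly dataset pays it on every step:

```python
import os
import torch
import torchaudio
from torch.utils.data import Dataset

class CachedSpecDataset(Dataset):
    """Toy dataset that computes each spectrogram once and reuses the cached file."""

    def __init__(self, wav_paths, n_fft=1024, hop_length=256):
        self.wav_paths = wav_paths      # list of .wav paths (illustrative)
        self.n_fft = n_fft
        self.hop_length = hop_length

    def __len__(self):
        return len(self.wav_paths)

    def __getitem__(self, idx):
        wav_path = self.wav_paths[idx]
        spec_path = wav_path.replace(".wav", ".spec.pt")   # hypothetical cache name
        if os.path.exists(spec_path):
            # fast path: load the precomputed feature from disk
            spec = torch.load(spec_path)
        else:
            # slow path: decode audio and run the STFT; doing this for every
            # sample on every epoch is the kind of on-the-fly work that can
            # dominate per-step time
            wav, _ = torchaudio.load(wav_path)
            spec = torch.stft(
                wav, n_fft=self.n_fft, hop_length=self.hop_length,
                window=torch.hann_window(self.n_fft), return_complex=True
            ).abs()
            torch.save(spec, spec_path)
        return spec
```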

yijingshihenxiule commented 1 year ago

Hello @lsw5835, I ran into the same problem. When I trained with my dataset (VCTK + my own), it took about 30 minutes, sometimes even 40 minutes, to train 200 global steps. I load all the data from disk. Have you found a solution?