auspicious3000 / autovc

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
https://arxiv.org/abs/1905.05879
MIT License
983 stars 207 forks

Is the speech in each batch cropped to a fixed length during training? #48

Open qq547276542 opened 4 years ago

qq547276542 commented 4 years ago

For example, do you crop each utterance to 2 seconds, or do you keep the original speech length during training? Does this affect the performance of the model? I ask because I find that it strongly affects training speed. Thank you :D

auspicious3000 commented 4 years ago

It affects neither the performance nor the training speed.

qq547276542 commented 4 years ago

> It affects neither the performance nor the training speed.

So in your actual training, did you crop the speech?

auspicious3000 commented 4 years ago

Either way works. My training segments did not exceed 3 seconds, so that they would fit into memory.

qq547276542 commented 4 years ago

> Either way works. My training segments did not exceed 3 seconds, so that they would fit into memory.

Thx!
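
For readers who want to try the fixed-length option discussed in this thread, here is a minimal sketch of cropping (or zero-padding) each utterance to the same number of frames before batching. It is not taken from the AutoVC code: the names `crop_or_pad` and `make_batch`, the 80-bin spectrogram shape, and the 128-frame segment length are all assumptions chosen for illustration, and how many frames correspond to 2 or 3 seconds depends on your own sample rate and hop size.

```python
import numpy as np

def crop_or_pad(mel, num_frames, rng):
    """Return a (num_frames, n_mels) segment of a (T, n_mels) spectrogram.

    Hypothetical helper: long utterances get a random crop,
    short ones are zero-padded at the end.
    """
    total = mel.shape[0]
    if total >= num_frames:
        # Random start so different epochs see different segments.
        start = rng.integers(0, total - num_frames + 1)
        return mel[start:start + num_frames]
    pad = num_frames - total
    return np.pad(mel, ((0, pad), (0, 0)), mode="constant")

def make_batch(utterances, num_frames, seed=0):
    """Stack variable-length utterances into one fixed-size array."""
    rng = np.random.default_rng(seed)
    return np.stack([crop_or_pad(m, num_frames, rng) for m in utterances])

# Three fake utterances of different lengths (assumed 80 mel bins).
utterances = [np.random.rand(t, 80).astype(np.float32) for t in (90, 128, 300)]
batch = make_batch(utterances, num_frames=128)
print(batch.shape)  # (3, 128, 80)
```

With a fixed segment length, the memory used per batch is bounded by `batch_size * num_frames * n_mels`, which is the practical reason given above for keeping segments short.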