maum-ai / voicefilter

Unofficial PyTorch implementation of Google AI's VoiceFilter system
http://swpark.me/voicefilter

Training setting problem #25

Open Morank88 opened 4 years ago

Morank88 commented 4 years ago

Hi,

Thank you for publishing your code! I am encountering a training problem. As an initial phase, I have tried to train on only 1000 samples from the LibriSpeech train-clean-100 dataset. I am using the default configuration as published in your VoiceFilter repo. The only difference is that I used a batch size of 6 due to memory limitations. Is it possible that the problem is related to the small batch size?

Another question is related to the generation of the training and testing sets. I have noticed that there is an option to use a VAD when generating the training set, but by default it is not used. What is the best practice: to use the VAD or not?
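For context, here is a toy sketch of the kind of energy-based trimming I have in mind (my own example, not the repo's code; the frame length and threshold are guesses):

```python
import numpy as np

def trim_silence(wav, sr=16000, frame_ms=20, threshold_db=-40.0):
    """Drop leading/trailing frames whose RMS energy is below threshold_db
    relative to the loudest frame (a crude stand-in for a real VAD)."""
    frame = int(sr * frame_ms / 1000)
    n_frames = len(wav) // frame
    frames = wav[:n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1) + 1e-12)
    db = 20.0 * np.log10(rms / (rms.max() + 1e-12) + 1e-12)
    voiced = np.where(db > threshold_db)[0]
    if len(voiced) == 0:
        return wav
    return wav[voiced[0] * frame:(voiced[-1] + 1) * frame]

# Silence, then a 440 Hz tone, then silence: trimming keeps only the tone.
sr = 16000
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
wav = np.concatenate([np.zeros(sr), tone, np.zeros(sr)])
trimmed = trim_silence(wav, sr)
```

My worry is that without trimming, mixtures can contain long stretches where the target speaker is silent, which may hurt training.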

I appreciate your help!

Morank88 commented 4 years ago

I managed to make some progress. I am now training on data from LibriSpeech train-clean-100 and train-clean-360 and testing on dev-clean. After 40k steps the SDR has reached only ~5. Is it possible that this is related to the batch size I am using (6)?
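For anyone comparing numbers: I believe the repo computes SDR with mir_eval's bss_eval, but as a sanity check, the basic (unscaled) definition I am using to interpret my numbers is just:

```python
import numpy as np

def sdr(reference, estimate, eps=1e-12):
    """Plain SDR in dB: 10*log10(||ref||^2 / ||ref - est||^2).
    (bss_eval also applies a projection step; this simplest
    variant is for intuition only.)"""
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2) + eps
    return 10.0 * np.log10(num / den + eps)

ref = np.sin(np.linspace(0, 100, 16000))
noisy = ref + 0.1 * np.random.RandomState(0).randn(16000)
print(round(sdr(ref, noisy), 1))  # roughly 17 dB at this noise level
```

So an SDR of ~5 means the residual distortion still carries a large fraction of the signal's energy.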

Another question: what is the learning-rate policy? Did you fix it at 1e-3 throughout the whole training, or did you update it?
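As far as I can tell from config.yaml, the Adam learning rate stays fixed (please correct me if I misread it). If a decay were wanted, one common alternative is a step schedule; a pure-Python sketch of what I mean (the decay factor and interval are my own guesses, not from the repo):

```python
def stepped_lr(base_lr=1e-3, step=0, decay=0.5, decay_every=100_000):
    """Step-decay schedule: multiply the LR by `decay`
    every `decay_every` training steps."""
    return base_lr * (decay ** (step // decay_every))

# LR halves at 100k steps and quarters at 200k:
print(stepped_lr(step=0))        # 0.001
print(stepped_lr(step=100_000))  # 0.0005
print(stepped_lr(step=250_000))  # 0.00025
```

In PyTorch this would normally be done with `torch.optim.lr_scheduler.StepLR` rather than by hand.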

Thanks.

Morank88 commented 4 years ago

Here are the TensorBoard results: [screenshot]

natbuk2 commented 4 years ago

Hi Morank88, did you get any improvement on your SDR? The SDR I'm getting is much worse than yours:

[screenshot]

I've tried a number of runs. At first I used a smaller batch size to fit a lesser graphics card, but the run above used the as-downloaded batch size on an Amazon EC2 instance with a V100. The only differences are that I set the test sample count to 1000 (it's 100 in the code, but a comment in the README mentioned 1000; maybe I'll change it back to 100 and run again) and that I have somewhat newer Python libraries (I couldn't find a compatible torch 1.0.1, for example). Any suggestions?
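On the smaller-batch-size problem: one memory-friendly workaround (not something the repo does out of the box, as far as I can tell) is gradient accumulation, which averages gradients over k micro-batches so the update matches a k-times-larger batch. The arithmetic, in a toy numpy linear-regression example:

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(12, 3)   # 12 samples: one "large" batch
w = rng.randn(3)
y = rng.randn(12)

def grad(xb, yb, w):
    """Gradient of the mean squared error 0.5*mean((x@w - y)^2) w.r.t. w."""
    return xb.T @ (xb @ w - yb) / len(xb)

# Full-batch gradient vs. the average of two half-batch gradients:
g_full = grad(x, y, w)
g_accum = 0.5 * (grad(x[:6], y[:6], w) + grad(x[6:], y[6:], w))
assert np.allclose(g_full, g_accum)  # identical update, half the peak memory
```

In PyTorch the same idea is calling `loss.backward()` on k micro-batches (dividing each loss by k) before a single `optimizer.step()`. Note this only matches exactly for losses that average over the batch; batch-dependent layers like BatchNorm still see the smaller micro-batch.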

Thanks in advance,

Nat

jianew commented 3 years ago

> Hi,
>
> Thank you for publishing your code! I am encountering a training problem. As an initial phase, I have tried to train on only 1000 samples from the LibriSpeech train-clean-100 dataset. I am using the default configuration as published in your VoiceFilter repo. The only difference is that I used a batch size of 6 due to memory limitations. Is it possible that the problem is related to the small batch size?
>
> Another question is related to the generation of the training and testing sets. I have noticed that there is an option to use a VAD when generating the training set, but by default it is not used. What is the best practice: to use the VAD or not?
>
> I appreciate your help!

Hi, can you share your settings? I'm running into the same situation. Thanks.

yunzhongfei commented 3 years ago

I also have the same question. My best result is an SDR of 8.

zardkk commented 2 years ago

> Here are the TensorBoard results: [screenshot]

I'm running into the same issue as you. How did you solve it?