wangkenpu / rsrgan

Robust Speech Recognition Using Generative Adversarial Networks (GAN)
MIT License
58 stars 16 forks

Multi-GPU performs worse than single-GPU #4

Open Rpersie opened 5 years ago

Rpersie commented 5 years ago

Dear wang, first, thanks for your code. I used it to run experiments, but I found that multi-GPU training performs worse than single-GPU. Did you run into this problem? Or can you give me some advice? Do I just need to change the batch size and learning rate?

Thank you !

wangkenpu commented 5 years ago

I also found that multi-GPU training performed worse than single-GPU. In my experiments, I always used a single GPU to train the model. I'm not sure how to fix this problem.

Rpersie commented 5 years ago

Thank you for the response! I will try changing the batch size and learning rate in some experiments.
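For anyone trying the same experiments: a common heuristic (not from this thread or repo, so treat it as an assumption) is to scale the learning rate linearly with the effective batch size when moving from one GPU to data-parallel training on several GPUs. A minimal sketch, with `base_lr` and `base_batch` as hypothetical single-GPU values:

```python
def scaled_lr(base_lr, base_batch, num_gpus, per_gpu_batch):
    """Linear learning-rate scaling heuristic for data-parallel training.

    The effective batch size grows with the number of GPUs, so the
    learning rate is scaled by the same factor (a common rule of thumb,
    often combined with a warmup phase; not guaranteed for every model).
    """
    effective_batch = num_gpus * per_gpu_batch
    return base_lr * effective_batch / base_batch


# Example: single-GPU baseline lr=1e-4 with batch 32,
# moving to 4 GPUs with a per-GPU batch of 32.
print(scaled_lr(1e-4, 32, 4, 32))  # 4x the base learning rate
```

Without some adjustment like this, the larger effective batch takes fewer, noisier-averaged steps per epoch at the old learning rate, which is one common reason multi-GPU runs converge to a worse result.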

In your paper the acoustic model is fixed. In Table 2 and Table 3, your feature-mapping results are better than the results without feature mapping under the MCT condition, and I don't understand why. If the feature mapping and the acoustic model were jointly trained, that could plausibly beat MCT training. But without feature mapping, the evaluation features should be more similar to the acoustic model's training data.

Thanks!

wangkenpu commented 5 years ago

In my opinion, front-end enhancement can make noisy or reverberant speech more similar to clean speech, while we cannot be sure MCT does the same thing. If the input speech is severely contaminated, MCT may not bring a significant improvement, but a front-end enhancer can still perform feature enhancement.
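The pipeline being described can be sketched as follows. This is a minimal illustration, not the repo's actual code: the mapper here is a hypothetical one-layer network with random weights standing in for a trained enhancement front-end, and the 40-dimensional features are an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)


def feature_mapper(noisy, W, b):
    """Hypothetical one-layer feature-mapping network.

    Maps noisy/reverberant features toward clean features; in the real
    system this network is trained (e.g. adversarially), while the
    downstream acoustic model stays fixed.
    """
    return np.tanh(noisy @ W + b)


dim = 40                                   # e.g. 40-dim log-mel features (assumption)
W = rng.standard_normal((dim, dim)) * 0.1  # stand-in for trained weights
b = np.zeros(dim)

noisy_feats = rng.standard_normal((100, dim))   # 100 frames of noisy features
enhanced = feature_mapper(noisy_feats, W, b)

# The fixed acoustic model then consumes `enhanced` instead of
# `noisy_feats`; only the front-end mapper is trained (or, with joint
# training, both are fine-tuned together).
print(enhanced.shape)
```

The point of the sketch: because only the front-end changes, the acoustic model sees inputs pulled toward its clean training distribution, which is why feature mapping can help even when the acoustic model itself is never retrained.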

Of course, joint training is very helpful, but I have only tried it using Kaldi, and those experiments are excluded from my paper.