zyzisyz / mfa_conformer

Torch version #6

Open ductuantruong opened 1 year ago

ductuantruong commented 1 year ago

Hi,

Thank you for publishing your code. However, I am unable to reproduce your result (my EER: 1.24%). I suspect that a different torch version is causing the discrepancy. May I ask which torch version you used for this model?

Once again, thank you for sharing your work!
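
For readers comparing the EER figures quoted in this thread, here is a minimal sketch of how an equal error rate can be computed from verification scores. It is not the repository's evaluation code; the function name `equal_error_rate` and its arguments are hypothetical, and the threshold sweep is a simple approximation that only tests the observed score values.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    target_scores: scores of same-speaker trials (higher means more similar).
    nontarget_scores: scores of different-speaker trials.
    Hypothetical helper for illustration; quadratic in the number of trials.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for thr in thresholds:
        # False acceptance: a non-target trial scores at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejection: a target trial scores below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    print(equal_error_rate([0.9, 0.4], [0.6, 0.1]))
```

Multiplying the returned fraction by 100 gives a percentage comparable to the numbers reported here.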

mosszhd commented 1 year ago

Did you merge the vox1 and vox2 datasets? Are you training on multiple GPUs?

ductuantruong commented 1 year ago

Yes, I merged vox1 and vox2 and trained on multiple GPUs. May I ask whether you have managed to reproduce the result?

mosszhd commented 1 year ago

I could not reproduce the result. I am trying to train it on my PC, but I only have one GPU, and the pipeline does not work on a single-GPU machine, so I am making the necessary changes to train it there. Could you please share your requirements.txt file with me? I have been struggling to install compatible packages.

ductuantruong commented 1 year ago

Sure. You can find my Python environment here. However, the torch-related libraries require a CUDA environment, so I am not sure you can install them on your PC.

zyzisyz commented 1 year ago

Hi, this is Yang Zhang. Sorry for the late reply.

First, I would like to know how many GPUs you used to train the model, and what batch size you used.

If you want to reproduce my experimental results, here are some suggestions:

ductuantruong commented 1 year ago

Hi Yang Zhang, thank you for your informative reply. I trained on multiple GPUs (3 A100s) with a batch size of 360. I will try your suggestions. Once again, thank you for sharing your code and supporting us!

mosszhd commented 1 year ago

Thanks for sharing your env.

ductuantruong commented 1 year ago

Hi Yang Zhang,

Thanks to your suggestions, I have obtained 0.78% EER, which is closer to your published score. I have one more question: was your published result of 0.64% EER obtained before or after averaging the checkpoint weights? I ask because the training code does not include checkpoint averaging.

Once again, thank you for your help!
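
For context, checkpoint averaging usually means taking the element-wise mean of the parameters saved at several checkpoints. The sketch below illustrates the idea only and is not code from this repository; `average_state_dicts` is a hypothetical helper, and it assumes every checkpoint is a flat mapping from parameter name to a value that supports `+` and `/` (plain floats here, or floating-point tensors in practice).

```python
def average_state_dicts(state_dicts):
    """Return the element-wise mean of several parameter dictionaries.

    state_dicts: list of dicts mapping parameter names to values that
    support addition and true division (floats, or float tensors).
    Hypothetical helper; integer buffers such as step counters would
    need separate handling because true division changes their type.
    """
    if not state_dicts:
        raise ValueError("need at least one checkpoint")

    keys = set(state_dicts[0])
    for sd in state_dicts[1:]:
        if set(sd) != keys:
            raise ValueError("checkpoints have different parameter names")

    averaged = {}
    for key in state_dicts[0]:
        total = state_dicts[0][key]
        for sd in state_dicts[1:]:
            total = total + sd[key]  # accumulate without mutating the inputs
        averaged[key] = total / len(state_dicts)
    return averaged


if __name__ == "__main__":
    checkpoints = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}]
    print(average_state_dicts(checkpoints))
```

With real checkpoints, the dictionaries would typically be loaded from the saved files first and the averaged result loaded back into the model before evaluation; how the checkpoints are stored in this project is not stated in the thread.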

wcqy-ye commented 9 months ago

May I ask how you replicated the results of this experiment? I tried reproducing them on a V100 machine (without modifying any other parameters), but my EER (Equal Error Rate) has consistently been 1.07%. I did not combine Vox1 and Vox2; I only used Vox2 as the training set. I would appreciate your help, and thank you in advance.

ductuantruong commented 9 months ago

Hi @wcqy-ye

I tried to reproduce the result by training this model on the combined Vox1 and Vox2 train set. If you only trained it on the Vox2 train set, the EER of 1.07% is reasonable. In this paper https://arxiv.org/pdf/2305.14778.pdf, they also reproduced MFA-Conformer's result using just the Vox2 train set and achieved an EER of 0.99% which is quite close to yours.

Hopefully, this information is helpful!

wcqy-ye commented 9 months ago

I truly appreciate your response. I have also attempted to replicate the results on a 4080 machine by combining the Vox1 and Vox2 training sets. The replicated Equal Error Rate (EER) was around 1.0%. This may be because the 4080's limited 16GB of VRAM forced me to modify the batch size. I plan to try again on a machine with 24GB of VRAM, such as a 4090 or 3090, to see whether the results differ. Once again, thank you so much for your assistance.

wcqy-ye commented 9 months ago

Hello, may I ask whether you used AS-norm during the reproduction? The readme.md notes that AS-norm can improve performance, but I could not find where it is used in the code (perhaps I missed it; please correct me if so). If you did use it, where was it applied? Thanks again for your previous responses.

ductuantruong commented 9 months ago

Hi @wcqy-ye,

If I am not mistaken, the normalization is already included in the code. You can find it at line 138 of main.py.
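
As background for the AS-norm question, here is a minimal sketch of adaptive symmetric score normalization as it is commonly described: each raw trial score is normalized against the top-scoring cohort comparisons of both the enrollment and the test utterance, and the two normalized scores are averaged. This is not taken from main.py; the names `top_k_stats` and `as_norm`, the default `k=300`, and the `eps` guard are assumptions made for illustration.

```python
from statistics import mean, pstdev


def top_k_stats(cohort_scores, k):
    """Mean and standard deviation of the k highest cohort scores."""
    top = sorted(cohort_scores, reverse=True)[:k]
    return mean(top), pstdev(top)


def as_norm(raw_score, enroll_cohort_scores, test_cohort_scores, k=300, eps=1e-8):
    """Adaptive symmetric normalization of a single trial score.

    enroll_cohort_scores: scores of the enrollment utterance against a cohort.
    test_cohort_scores: scores of the test utterance against the same cohort.
    k and eps are illustrative defaults, not values from the repository.
    """
    mu_e, sd_e = top_k_stats(enroll_cohort_scores, k)
    mu_t, sd_t = top_k_stats(test_cohort_scores, k)
    # Average the two z-normalized scores; eps avoids division by zero.
    return 0.5 * ((raw_score - mu_e) / (sd_e + eps)
                  + (raw_score - mu_t) / (sd_t + eps))


if __name__ == "__main__":
    cohort = [0.0, 0.2, 0.4]
    print(as_norm(0.5, cohort, cohort, k=2))
```

The cohort is normally a set of speaker embeddings drawn from the training data; which cohort and cohort size this project uses is not stated in the thread.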