YChenL / DS-TDNN

Official implementation of "Dual-stream Time-Delay Neural Network with Dynamic Global Filter for Speaker Verification" in PyTorch
https://arxiv.org/pdf/2303.11020v2.pdf

The effect is not ideal #2

Open FiLafs opened 1 year ago

FiLafs commented 1 year ago

I set dim=512 and trained for 80 epochs, but the final EER was 1.31%. With AS-norm applied it was still around 1.15%, which falls short of the 0.9% EER reported in the paper. The model was also larger than the size given in the paper. What could be the reason for this?
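For anyone comparing numbers, here is a minimal sketch of how EER and AS-norm are commonly computed, so it is clear what the 1.31% and 1.15% figures measure. This is not the evaluation code from this repo. The function names `compute_eer` and `as_norm`, the `top_k` default, and the cohort score lists are all hypothetical and chosen for illustration, and scores are assumed to be similarities where higher means "same speaker".

```python
from statistics import mean, pstdev


def compute_eer(target_scores, nontarget_scores):
    """Equal error rate as a fraction (multiply by 100 for percent)."""
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target score")
    trials = sorted([(s, True) for s in target_scores] +
                    [(s, False) for s in nontarget_scores])
    n_tar, n_non = len(target_scores), len(nontarget_scores)
    rejected_tar = 0        # targets at or below the threshold (false rejects)
    accepted_non = n_non    # non-targets above the threshold (false accepts)
    best_gap, eer = 1.0, 0.5  # threshold below all scores: FAR = 1, FRR = 0
    i = 0
    while i < len(trials):
        score = trials[i][0]
        # Move the threshold just above this score; tied trials move together.
        while i < len(trials) and trials[i][0] == score:
            if trials[i][1]:
                rejected_tar += 1
            else:
                accepted_non -= 1
            i += 1
        frr = rejected_tar / n_tar
        far = accepted_non / n_non
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def as_norm(raw_score, enroll_cohort_scores, test_cohort_scores, top_k=300):
    """Adaptive symmetric score normalization for one trial.

    Each cohort list holds the scores of the enrollment (or test) embedding
    against an impostor cohort; only the top_k highest scores are used.
    """
    def top_stats(cohort_scores):
        top = sorted(cohort_scores, reverse=True)[:top_k]
        return mean(top), pstdev(top) + 1e-8  # epsilon avoids division by zero

    mu_e, sd_e = top_stats(enroll_cohort_scores)
    mu_t, sd_t = top_stats(test_cohort_scores)
    return 0.5 * ((raw_score - mu_e) / sd_e + (raw_score - mu_t) / sd_t)
```

The EER is taken at the threshold where the false-accept and false-reject rates are closest. Differences in the cohort set or in `top_k` can shift the AS-norm result, so those settings are worth checking when comparing against the paper.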

wcqy-ye commented 9 months ago

May I ask if you were able to successfully replicate the results from the paper? I have attempted to replicate them without performing data augmentation, but the EER consistently hovers around 1.4%. I would be very grateful for your assistance if possible. Thank you.

YChenL commented 9 months ago

The paper is still under review. Once it is accepted, more technical details will be reported in its appendix. The pre-trained model parameters, which achieve the results reported in the paper, will also be provided in this repo at that time.