Xflick / EEND_PyTorch

A PyTorch implementation of End-to-End Neural Diarization
MIT License

Question about experiment results #14

Open SoundingSilence opened 1 year ago

SoundingSilence commented 1 year ago

Thank you for your PyTorch implementation of SA-EEND. When I run inference on the callhome2-2spk dataset, I cannot match my results to the experiment results table in README.md. I use `model_callhome.th` as initialization. The results (DER, %) are below.

| th \ med | 1     | 5     | 9     | 11    |
|----------|-------|-------|-------|-------|
| 0.3      | 14.19 | 13.19 | 12.47 | 12.32 |
| 0.4      | 13.04 | 11.96 | 11.09 | 10.97 |
| 0.5      | 13.18 | 12.24 | 11.22 | 11.21 |
| 0.6      | 14.14 | 13.28 | 12.37 | 12.26 |
| 0.7      | 15.85 | 14.93 | 13.92 | 13.87 |
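Here, th is the decision threshold on the per-frame speaker posteriors, and med is the kernel size of the median filter applied to the binarized decisions. A minimal sketch of that post-processing, assuming per-frame sigmoid posteriors in a NumPy array (function and variable names are illustrative, not this repo's code):

```python
import numpy as np
from scipy.signal import medfilt

def posteriors_to_activity(posteriors, th=0.5, med=11):
    """Binarize per-frame speaker posteriors, then median-filter.

    posteriors: (num_frames, num_speakers) array of sigmoid outputs.
    th: decision threshold on the posteriors.
    med: median filter kernel size in frames (must be odd).
    """
    decisions = (posteriors > th).astype(np.float64)
    # Smooth each speaker's activity track independently.
    return np.stack(
        [medfilt(decisions[:, s], kernel_size=med)
         for s in range(decisions.shape[1])],
        axis=1,
    )

# Sweeping the same grid as the table above:
# for th in (0.3, 0.4, 0.5, 0.6, 0.7):
#     for med in (1, 5, 9, 11):
#         activity = posteriors_to_activity(posteriors, th, med)
```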

In that case, which row of the table in the README does `model_callhome.th` correspond to, "PyTorch" or "PyTorch*"? If it corresponds to "PyTorch", the best DER result is 10.97, rather than 11.21.

I am looking forward to your reply. Thanks!

Xflick commented 1 year ago

It should match the "PyTorch" result. My numbers follow the setup from their TASLP paper, where th=0.5 and med=11:

[image: evaluation setup excerpt from the TASLP paper]
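As a side note, an easy way to sanity-check a DER number outside of md-eval.pl is pyannote.metrics; a minimal sketch, assuming the reference and hypothesis are available as pyannote `Annotation` objects (the segments and labels below are made up):

```python
from pyannote.core import Annotation, Segment
from pyannote.metrics.diarization import DiarizationErrorRate

reference = Annotation()
reference[Segment(0.0, 10.0)] = "spk1"
reference[Segment(8.0, 16.0)] = "spk2"

hypothesis = Annotation()
hypothesis[Segment(0.0, 9.5)] = "A"
hypothesis[Segment(8.5, 16.0)] = "B"

# pyannote splits the collar evenly around each reference boundary,
# so collar=0.5 is intended to approximate md-eval's -c 0.25;
# verify the convention against your own scoring setup.
metric = DiarizationErrorRate(collar=0.5, skip_overlap=False)
print(f"DER = {100 * metric(reference, hypothesis):.2f}%")
```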

SoundingSilence commented 1 year ago

Thanks for your reply. Using your default config, I have reproduced the results of the 2-layer and 4-layer SA-EEND, and they are consistent with the paper. However, when I incorporate the encoder-decoder attractor (EDA) module into SA-EEND, I fail to reproduce the results of the EDA-EEND paper. In the fixed-2spk setting, I only get 8.95% and 4.4% DER on simu-dev-beta2-2spk-500 and callhome2_spk2, noticeably worse than the results listed in the paper (8.07% and 2.69% respectively), even though I use an identical model structure and training config. Do you have any suggestions based on your experience? Thanks!

[image: eda-eend-2spk]
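For anyone comparing implementations: the EDA module itself is small. A minimal PyTorch sketch following the EDA-EEND paper's description (unidirectional LSTM encoder and decoder, zero-vector decoder inputs, a linear existence head; names and dimensions are illustrative, not this repo's code):

```python
import torch
import torch.nn as nn

class EncoderDecoderAttractor(nn.Module):
    """Sketch of the encoder-decoder attractor from the EDA-EEND paper."""

    def __init__(self, dim=256):
        super().__init__()
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.exist_linear = nn.Linear(dim, 1)

    def forward(self, emb, n_attractors):
        # emb: (batch, frames, dim) embeddings from the SA-EEND encoder;
        # the paper shuffles the frame order before encoding during training.
        _, hidden = self.encoder(emb)
        # Decode attractors one step at a time from zero inputs,
        # starting from the encoder's final state.
        zeros = emb.new_zeros(emb.size(0), n_attractors, emb.size(2))
        attractors, _ = self.decoder(zeros, hidden)
        # One existence logit per attractor (sigmoid at inference time).
        exist_logits = self.exist_linear(attractors).squeeze(-1)
        return attractors, exist_logits

# Frame-wise speaker activity logits are then dot products between
# embeddings and attractors, shape (batch, frames, n_attractors):
# activity = torch.matmul(emb, attractors.transpose(1, 2))
```

When a reimplementation lags the paper, the gap often traces to details such as the time-shuffling of embeddings fed to the EDA encoder, the weighting of the attractor existence loss, or the amount and beta of the simulated training data, so those are worth double-checking first.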