Voice-Privacy-Challenge / Voice-Privacy-Challenge-2024

Baseline Recipe for VoicePrivacy Challenge 2024: anonymization systems and evaluation software

Difficulty calculating EER #37

Closed by jacobjwebber 6 months ago

jacobjwebber commented 7 months ago

Hi! Thanks so much for these amazing resources.

I am having some trouble calculating an EER score. My results are exactly the same as those for the McAdams baseline (which are also the same as those for the other baseline).

Am I doing something wrong?

---- ASV_eval results ----
  dataset split gender enrollment     trial     EER
0   libri   dev      f   original  original  10.511
3   libri   dev      m   original  original   0.931
6   libri  test      f   original  original   8.761
9   libri  test      m   original  original   0.418

Thanks!

TonyWangX commented 7 months ago

I think they should be the same.

For the "enrollment original - trial original" pairs, the EERs are computed on the original data, without anonymization. So the results should be independent of your method.

Computing the original-original EERs may seem unnecessary, but it is useful as a sanity check. In some cases, even the original-original EERs may vary slightly due to unrelated factors (e.g., software dependencies, ...). Being "exactly the same" indicates that the software environment has been set up correctly : )
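
For anyone who wants to see why the original-original numbers cannot change, here is a minimal sketch of how an EER can be computed from a list of verification scores. It is not the challenge's evaluation code: the function name `compute_eer` and the toy scores are hypothetical, and the threshold sweep is a simple approximation. The point is only that the EER is a function of the scores alone, so if both enrollment and trial utterances are the original data, the anonymization system never enters the computation.

```python
import numpy as np


def compute_eer(target_scores, nontarget_scores):
    """Approximate equal error rate (in percent) from two score lists.

    Hypothetical helper for illustration only; higher score = more likely
    the same speaker.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Candidate thresholds: every distinct score that was observed.
    thresholds = np.unique(np.concatenate([target, nontarget]))

    # False rejection rate: target trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: non-target trials scored at or above it.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and
    # report their average as the EER.
    idx = np.argmin(np.abs(far - frr))
    return 100.0 * (far[idx] + frr[idx]) / 2.0


if __name__ == "__main__":
    # Toy scores (made up): same inputs always give the same EER.
    same_speaker = [0.9, 0.8, 0.4]
    different_speaker = [0.6, 0.2, 0.1]
    print(compute_eer(same_speaker, different_speaker))
```

Running this twice on the same scores gives the same value, which is the situation for the original-original rows: the scores come from unmodified audio, so every submission should report identical EERs there.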