running evaluation - Githubissues

shahryarghayoor commented 2 years ago

Hi Dr. Xu Li

Fortunately, I was able to run the training command and "start.sh" successfully. Now that I have the output as a txt file, I am running your "err" command to get the results, and here is the output:

scoring/evaluate_tDCF_asvspoof19.py:36: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  asv_scores = asv_data[:, 2].astype(np.float)
scoring/evaluate_tDCF_asvspoof19.py:43: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  cm_scores = cm_data[:, 3].astype(np.float)
0.024604674888228765
0.024581005586592177
0.23934754703985472
t-DCF evaluation from [Nbona=7355, Nspoof=63882] trials

t-DCF MODEL
   Ptar         = 0.94050 (Prior probability of target user)
   Pnon         = 0.00950 (Prior probability of nontarget user)
   Pspoof       = 0.05000 (Prior probability of spoofing attack)
   Cfa_asv      = 10.00000 (Cost of ASV falsely accepting a nontarget)
   Cmiss_asv    = 1.00000 (Cost of ASV falsely rejecting target speaker)
   Cfa_cm       = 10.00000 (Cost of CM falsely passing a spoof to ASV system)
   Cmiss_cm     = 1.00000 (Cost of CM falsely blocking target utterance which never reaches ASV)

   Implied normalized t-DCF function (depends on t-DCF parameters and ASV errors), s=CM threshold)
   tDCF_norm(s) = 2.40595 x Pmiss_cm(s) + Pfa_cm(s)

ASV SYSTEM
   EER            = 2.45778 % (Equal error rate (target vs. nontarget discrimination)
   Pfa            = 2.46047 % (False acceptance rate of nontargets)
   Pmiss          = 2.45810 % (False rejection rate of targets)
   1-Pmiss,spoof  = 76.06525 % (Spoof false acceptance rate)

CM SYSTEM
   EER            = 13.10529 % (Equal error rate for countermeasure)

TANDEM
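The two DeprecationWarning messages in this output come from `np.float`, which NumPy deprecated in 1.20 and removed in 1.24. A minimal sketch of the fix the warning itself suggests is shown below; the toy rows and their four-column layout are assumptions made for illustration (inferred only from `cm_data[:, 3]` in the warning), not the actual contents of the score files.

```python
import numpy as np

# Toy stand-in for a CM score file. The four-column layout
# ("utt_id attack_id key score") is an assumption for illustration.
rows = [
    "LA_E_0000001 - bonafide 4.21",
    "LA_E_0000002 A07 spoof -3.87",
]
cm_data = np.array([r.split() for r in rows])

# Before (DeprecationWarning on NumPy 1.20+, AttributeError on NumPy 1.24+):
#   cm_scores = cm_data[:, 3].astype(np.float)
# After: the builtin float (or np.float64) yields the same float64 array.
cm_scores = cm_data[:, 3].astype(float)

print(cm_scores.dtype, cm_scores)
```

The same one-word change would apply to the `asv_scores` line reported at line 36 of the script. The warnings do not affect the numbers printed above.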

I am a bit confused at this stage. I would appreciate it if you could tell me what I should do now. This is the command to compute both the system EER and t-DCF:

python scoring/evaluate_tDCF_asvspoof19.py scoring/la_asv_scores/ASVspoof2019.LA.asv.eval.gi.trl.scores.txt cm_scores/seres2net50_26w_8s-epoch11-eval_scores.txt

Am I getting the right results? What can I do to get visual output that I can use in a paper? Thanks
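One possible way to get a visual output is to plot the normalized t-DCF curve from the formula printed above, tDCF_norm(s) = 2.40595 x Pmiss_cm(s) + Pfa_cm(s). The sketch below is not the repository's code: it uses synthetic Gaussian scores in place of a real CM score file, and the helper names `cm_error_rates` and `save_tdcf_plot` are hypothetical. It assumes a trial is accepted as bona fide when its score is at or above the threshold.

```python
import numpy as np

def cm_error_rates(bona_scores, spoof_scores):
    """Return thresholds with Pmiss_cm(s) and Pfa_cm(s) at each threshold s.

    Assumes a trial is accepted as bona fide when score >= threshold.
    """
    thresholds = np.sort(np.concatenate([bona_scores, spoof_scores]))
    # Pmiss: fraction of bona fide trials rejected (score below threshold).
    p_miss = np.array([(bona_scores < t).mean() for t in thresholds])
    # Pfa: fraction of spoof trials accepted (score at or above threshold).
    p_fa = np.array([(spoof_scores >= t).mean() for t in thresholds])
    return thresholds, p_miss, p_fa

def save_tdcf_plot(thresholds, tdcf, path):
    """Save the t-DCF curve as an image; needs matplotlib (optional)."""
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(figsize=(5, 3.5))
    ax.plot(thresholds, tdcf)
    ax.set_xlabel("CM threshold s")
    ax.set_ylabel("Normalized t-DCF")
    fig.tight_layout()
    fig.savefig(path, dpi=300)
    plt.close(fig)

# Synthetic scores for illustration only; real scores would come from
# the CM score file passed to the evaluation script.
rng = np.random.default_rng(0)
bona = rng.normal(2.0, 1.0, 500)
spoof = rng.normal(-2.0, 1.0, 2000)

thr, p_miss, p_fa = cm_error_rates(bona, spoof)
c_miss = 2.40595  # coefficient taken from the output above
tdcf_norm = c_miss * p_miss + p_fa

# EER: operating point where miss and false-alarm rates are closest.
eer_idx = np.argmin(np.abs(p_miss - p_fa))
eer = (p_miss[eer_idx] + p_fa[eer_idx]) / 2

print(f"min t-DCF = {tdcf_norm.min():.4f}, EER = {100 * eer:.2f} %")
# save_tdcf_plot(thr, tdcf_norm, "tdcf_curve.png")  # uncomment to write the figure
```

The coefficient 2.40595 depends on the ASV error rates and cost parameters listed in the output, so it should be read from each run rather than hard-coded across partitions.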

lixucuhk commented 2 years ago

May I know your training configurations, e.g. what acoustic features you are using? It seems that the network setting is 26w with 8s, which is not identical to the one in the paper. Have you ever tried other settings?

shahryarghayoor commented 2 years ago

Well, for features, I have 3 separate folders: two containing cqt and spec, and one containing lfcc, which I extracted using MATLAB; the output is ".txt" files.

When I run your "start.sh" file, one error occurs:

FileNotFoundError: [Errno 2] No such file or directory: 'data/debug_samples/feats_slicing.scp'

But when I try different configs, I get output without any problems. For example, this is the new config after I change the values:

randomseed=0 # 0, 1, 2, ...
config=conf/training_mdl/seres2net50_26w_8s.json # configuration files in conf/training_mdl
feats=pa_cqt # pa_lfcc, la_spec, la_cqt or la_lfcc
runid=seres2net50_26w_8s

Would you please tell me what each of these 4 lines means? Thanks

randomseed=0 # 0, 1, 2, ...
config=conf/training_mdl/seresnet34.json # configuration files in conf/training_mdl
feats=debug_feats #, pa_spec, pa_cqt, pa_lfcc, la_spec, la_cqt or la_lfcc
runid=SEResNet34Debugfeats0

lixucuhk commented 2 years ago

Hi shahryarghayoor,

1) The randomseed option sets the random seed used for initialization.

2) The config option indicates the training model configuration file, just as the one you used (conf/training_mdl/seresnet34.json).

3) The feats option selects the feature to be used during training. The debug_feats value is only for debugging, so you cannot use it when you train your own models. Instead, you can use the other alternatives, e.g. "pa_spec". Here, "pa" means the PA partition, while "spec" means using the spectrogram as the feature.

4) The runid serves as a tag. It creates different folders for different training settings.

Sorry for the late reply. Thank you!
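To illustrate what a random seed does in general: fixing the seed makes the pseudo-random numbers (and therefore things like initial model weights and data shuffling) repeatable, while changing it gives a different but equally valid run. The sketch below is a generic example and an assumption about typical usage; how the training script consumes randomseed internally is not shown in this thread, and `set_seed` is a hypothetical helper name.

```python
import random

import numpy as np

def set_seed(seed):
    """Seed the common random number generators (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    try:
        import torch  # optional: only seeded if PyTorch is installed
        torch.manual_seed(seed)
    except ImportError:
        pass

set_seed(0)
a = np.random.rand(3)

set_seed(0)          # same seed: the same numbers are drawn again
b = np.random.rand(3)

set_seed(1)          # different seed: a different sequence
c = np.random.rand(3)

print(np.allclose(a, b), np.allclose(a, c))
```

Running the same configuration with randomseed=0, 1, 2, ... therefore gives several independent runs, which is useful for reporting how much the results vary between initializations.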

shahryarghayoor commented 2 years ago

Thank you so much for your answer. What you described was clear and I understood it well, except for the random seed.

Actually, I'm a bit confused because most of the time this error occurs:

RuntimeError: CUDA out of memory. Tried to allocate 380.00 MiB (GPU 0; 7.80 GiB total capacity; 5.35 GiB already allocated; 342.81 MiB free; 5.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.

Once I have the output, I need to organize the results exactly as you did. So what is the procedure? Should I do the following steps?

1- Run "start.sh" with seres2net50_26w_8s and la_spec
2- Run "start.sh" with seres2net50_26w_8s and la_cqt
3- Run "start.sh" with seres2net50_26w_8s and la_lfcc
4- (Do exactly the same for the PA features)

then

5- For la_spec, compare the 20th epochs of the DEV and EVAL sets to calculate EER and t-DCF
6- For la_cqt, compare the 20th epochs of the DEV and EVAL sets to calculate EER and t-DCF
7- For la_lfcc, compare the 20th epochs of the DEV and EVAL sets to calculate EER and t-DCF
8- (Do exactly the same for the PA features)

then

9- At this step, which is the last one, I have to compare the six LA and PA features together.

Would you tell me if I am right and whether this is the procedure?
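The steps above amount to a grid of two partitions by three features. A small sketch of how that grid could be enumerated is given below. Several things here are assumptions: the per-run `runid` tags are hypothetical, and the score file pattern is extrapolated from the single eval file name quoted earlier in this thread (cm_scores/seres2net50_26w_8s-epoch11-eval_scores.txt); the dev file name in particular is a guess.

```python
from itertools import product

partitions = ["la", "pa"]
features = ["spec", "cqt", "lfcc"]
model = "seres2net50_26w_8s"
epoch = 20

runs = []
for part, feat in product(partitions, features):
    feats = f"{part}_{feat}"
    # Hypothetical tag: a distinct runid per feature keeps the output
    # folders of different training settings apart.
    runid = f"{model}_{feats}"
    runs.append({
        "feats": feats,
        "runid": runid,
        # Assumed file naming, extrapolated from the one eval file above.
        "dev_scores": f"cm_scores/{runid}-epoch{epoch}-dev_scores.txt",
        "eval_scores": f"cm_scores/{runid}-epoch{epoch}-eval_scores.txt",
    })

for run in runs:
    print(run["feats"], run["runid"], run["eval_scores"])
```

Each entry would then be trained once with "start.sh" and scored once per set with the evaluation script, giving six rows of EER and t-DCF to compare in the final step.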

lixucuhk commented 2 years ago

Yes, you are right. This is the exact procedure. For the "CUDA out of memory" error, you need to use a GPU with larger memory. Thank you!
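If a larger GPU is not available, two workarounds are sometimes tried; neither is confirmed for this repository. The first is the allocator setting named in the error message itself, set through the PYTORCH_CUDA_ALLOC_CONF environment variable. In the error above, reserved memory (5.44 GiB) is barely more than allocated memory (5.35 GiB), so fragmentation is probably not the main cause and this setting may not help. The second is a smaller batch size. The sketch below assumes the training configuration has a "batch_size" entry, which is a hypothetical key name, and the value 128 is only an example.

```python
import os

# Must take effect before the first CUDA allocation; setting it before
# importing torch is the simplest way. 128 is an example value.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

def with_smaller_batch(config, key="batch_size", factor=2):
    """Return a copy of a config dict with the batch size divided by `factor`.

    The key name "batch_size" is an assumption, not taken from the repo.
    """
    smaller = dict(config)
    smaller[key] = max(1, config[key] // factor)
    return smaller

example = {"batch_size": 32}  # hypothetical config contents
print(with_smaller_batch(example))
```

A smaller batch size changes the training dynamics, so results may differ somewhat from those obtained with the original setting.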

lixucuhk / ASV-anti-spoofing-with-Res2Net

running evaluation #5