@talka1 Sorry for this late reply. Are you using some custom datasets?
Yes, I created my own dataset of two speakers, each with 90 wav files. What could cause this problem? I guess more values are being unpacked than there are variables to receive them, right? In this case from the eval_wav filenames?
@talka1 If you run the existing example recipes (e.g. vcc2018, vcc2020), you will find that the converted filenames have the format <number>_org-<orgspk>_cv-<tarspk>.wav. Please check whether your filenames follow the same format.
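A quick way to check converted filenames against the <number>_org-<orgspk>_cv-<tarspk>.wav format is a small regex test. This is only a sketch and not code from crank: the function name parse_converted_name is hypothetical, and it assumes that none of the three fields contains an underscore.

```python
import re

# Matches <number>_org-<orgspk>_cv-<tarspk>.wav.
# Assumption: no field contains an underscore, so an extra "spkr1_" prefix fails to match.
CONVERTED_NAME = re.compile(
    r"^(?P<number>[^_]+)_org-(?P<orgspk>[^_]+)_cv-(?P<tarspk>[^_]+)\.wav$"
)


def parse_converted_name(filename):
    """Return (number, orgspk, tarspk), or None if the name does not fit the format."""
    match = CONVERTED_NAME.match(filename)
    if match is None:
        return None
    return match.group("number"), match.group("orgspk"), match.group("tarspk")


if __name__ == "__main__":
    for name in ["085_org-spkr2_cv-spkr1.wav", "spkr1_085_org-spkr2_cv-spkr1.wav"]:
        print(name, "->", parse_converted_name(name))
```

Any name for which this returns None has an extra or missing underscore-separated field and is worth renaming before evaluation.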
Yes this worked.
All my wav files were named like:
"spkr1_000.wav"... and "spkr2_000.wav"...
So it generated:
"spkr1_085_org-spkr2_cv-spkr1.wav"
instead of:
"085_org-spkr2_cv-spkr1.wav"
I had to fix this naming in all the source files and also in all the files generated in every stage.
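For anyone hitting the same naming problem, the rename can be scripted instead of done by hand. The following is a minimal sketch, not part of crank: strip_prefix and rename_wavs are hypothetical helpers, and it assumes each speaker's files sit in their own directory so that the stripped names do not collide.

```python
from pathlib import Path


def strip_prefix(name, prefix):
    """Return name without a leading '<prefix>_'; unchanged if the prefix is absent."""
    lead = prefix + "_"
    return name[len(lead):] if name.startswith(lead) else name


def rename_wavs(directory, prefix, dry_run=True):
    """Strip '<prefix>_' from every .wav file in directory.

    Returns a list of (old_name, new_name) pairs. With dry_run=True nothing
    on disk is changed, so the result can be inspected first.
    """
    changes = []
    for path in sorted(Path(directory).glob("*.wav")):
        new_name = strip_prefix(path.name, prefix)
        if new_name == path.name:
            continue  # nothing to strip
        target = path.with_name(new_name)
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        changes.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return changes
```

Calling rename_wavs(some_dir, "spkr1") first with the default dry_run=True lists the planned changes; passing dry_run=False applies them. Renaming the source wav files before running the first stage is likely simpler than fixing the generated files afterwards.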
Now it gave me an mcd calculation result:
# python -m crank.bin.evaluate_mcd --conf conf/mlfb_vqvae.yml --n_jobs 10 --spkr_conf conf/spkr.yml --outwavdir exp/mlfb_vqvae/eval_PWG_wav/200000/wav --featdir data/feature
# Started at Sat Mar 20 12:46:53 UTC 2021
#
2021-03-20 12:46:55,748 (evaluate_mcd:117) INFO: number of utterances = 22
spkr1 spkr1 7.458
spkr1 spkr2 11.707
spkr2 spkr1 11.223
spkr2 spkr2 7.358
# Accounting: time=122 threads=1
# Ended (code 0) at Sat Mar 20 12:48:55 UTC 2021, elapsed time 122 seconds
But the MOSnet score prediction failed.
# python -m crank.bin.evaluate_mosnet --outwavdir exp/mlfb_vqvae/eval_PWG_wav/200000/wav
# Started at Sat Mar 20 12:48:55 UTC 2021
#
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/content/drive/MyDrive/crank/crank/bin/evaluate_mosnet.py", line 15, in <module>
import speechmetrics
ModuleNotFoundError: No module named 'speechmetrics'
# Accounting: time=1 threads=1
# Ended (code 1) at Sat Mar 20 12:48:56 UTC 2021, elapsed time 1 seconds
So do I need to install a package called "speechmetrics"?
EDIT: Fixed it by installing the repository speechmetrics
For CPU:
pip install git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[cpu]
For GPU:
pip install git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[gpu]
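To catch a missing package like this before launching a long run, the required top-level modules can be checked up front. This is a small sketch using only the standard library; missing_modules is a hypothetical helper, and the list of module names to check is up to you.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["speechmetrics"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```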
These are the MOSNet score predictions:
# python -m crank.bin.evaluate_mosnet --outwavdir exp/mlfb_vqvae/eval_PWG_wav/200000/wav
# Started at Sat Mar 20 13:23:29 UTC 2021
#
2021-03-20 13:23:29,411 (evaluate_mosnet:40) INFO: number of utterances = 22
Loaded speechmetrics.absolute.mosnet
2021-03-20 13:23:32,553 (driver:121) INFO: Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2021-03-20 13:23:32,577 (driver:121) INFO: Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
spkr1 spkr1 2.900
spkr1 spkr2 2.577
spkr2 spkr1 2.814
spkr2 spkr2 2.502
# Accounting: time=14 threads=1
# Ended (code 0) at Sat Mar 20 13:23:43 UTC 2021, elapsed time 14 seconds
Stages 1 to 6 ran successfully, but stage 7 failed.
The MCD log says the following:
What is happening here?