gusrud1103 / LibriPhrase

Recipe for LibriPhrase
MIT License

KeyError: 'anchor_spk' #3

Open coding-dallas opened 1 year ago

coding-dallas commented 1 year ago

I am trying to prepare the LibriPhrase dataset from LibriSpeech test-clean. For phrases of up to 2 words it works fine, but when extracting the 3-word class it throws the error above.

Below are the parameters I passed to prepare the dataset.

python3 libriphrase.py \
  --libripath '/asr3/kesav/keywrd_file_prep/LibriPhrase/LibriPhrase_test_clean/data/LibriSpeech_clean_wav/' \
  --newpath '/asr3/kesav/keywrd_file_prep/LibriPhrase/LibriPhrase_test_clean/data/LibriPhrase_diffspk_all/' \
  --wordalign '/asr3/kesav/keywrd_file_prep/LibriPhrase/LibriPhrase_test_clean/metadata/librispeech_clean_test_all_utt_with_flac.csv' \
  --output '/asr3/kesav/keywrd_file_prep/LibriPhrase/LibriPhrase_test_clean/metadata/librispeech_clean_test_short_phrase.csv' \
  --numpair 3 --maxspk 1611 --maxword 4 --mode 'diffspk_all'

gusrud1103 commented 1 year ago

Hi, it seems that in principle you should use the LibriSpeech train-other-500h data to build the LibriPhrase evaluation set. If you want to use LibriSpeech test-clean instead, you need to reduce --maxspk, since test-clean has fewer speakers than train-other-500h.
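
One way to pick a safe value is to count the speakers actually present in the split before running the recipe. The following is only a rough sketch, not part of libriphrase.py: it assumes the standard LibriSpeech layout (split directory / numeric speaker ID / chapter ID / audio files), and the helper names count_speakers and clamp_maxspk are made up for illustration. If your converted wav directory is organized differently, the counting logic would need to change.

```python
from pathlib import Path


def count_speakers(split_dir):
    """Count speaker directories directly under a LibriSpeech split directory.

    Assumes the standard layout <split>/<speaker_id>/<chapter_id>/<files>,
    where speaker IDs are purely numeric directory names.
    """
    root = Path(split_dir)
    speakers = {p.name for p in root.iterdir() if p.is_dir() and p.name.isdigit()}
    return len(speakers)


def clamp_maxspk(requested, split_dir):
    """Return a --maxspk value no larger than the number of speakers found."""
    return min(requested, count_speakers(split_dir))
```

For example, calling clamp_maxspk(1611, "/path/to/LibriSpeech/test-clean") (a placeholder path) would return the number of speakers found whenever it is below 1611, and that value can then be passed as --maxspk.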

coding-dallas commented 1 year ago

I am not clear on what the parameters (--numpair, --maxspk) mean.