hitachi-speech / EEND

End-to-End Neural Diarization
MIT License

about data prepare #12

Open bbrookie opened 3 years ago

bbrookie commented 3 years ago

Thank you for open-sourcing this code. When I run run_prepare_shared.sh, I run into some problems. If I set the nj parameter on line 90 to 100, the following errors occur:

~/projects/EEND-master/egs/mini_librispeech/v1/utils/validate_data_dir.sh: no such directory data/simu/data/train_clean_5_ns2_beta2_500
run.pl: job failed, log is in data/simu/.work/random_mixture_train_clean_5_ns2_beta2_500.log
utils/split_scp.pl: You are splitting into too many pieces! [reduce $nj (100) to be smaller than the number of lines (5) in data/simu/.work/mixture_train_clean_5_ns2_beta2_500.scp]

But if I set nj to 3, I only get a very small number of mixed audio files.
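The split_scp.pl message above indicates that nj is larger than the number of lines in the list being split (5 here). As a minimal sketch of that check, the following Python caps a requested job count at the number of entries in a list file. The helper names `count_scp_entries` and `clamp_nj` are hypothetical and not part of EEND or Kaldi, and the sketch assumes that a job count equal to the number of lines is acceptable.

```python
def count_scp_entries(scp_path):
    """Count the non-empty lines in a Kaldi-style .scp list file."""
    with open(scp_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def clamp_nj(requested_nj, scp_path):
    """Return a job count that does not exceed the number of list entries.

    Hypothetical helper: it only mirrors the constraint reported by
    utils/split_scp.pl and does not call any EEND or Kaldi code.
    """
    n_entries = count_scp_entries(scp_path)
    if n_entries == 0:
        raise ValueError(f"{scp_path} is empty; there is nothing to split")
    return min(requested_nj, n_entries)
```

With the 5-line list from the message, `clamp_nj(100, path)` would give 5 and `clamp_nj(3, path)` would stay at 3. Capping nj only avoids the split error; it does not change how many mixtures the list describes.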

Another problem is that whether I set nj to 3 or 100, an error is always recorded in this file (data/simu/.work/random_mixture_train_clean_5_ns2_beta2_500.log):

Traceback (most recent call last):
  File "../../../eend/bin/random_mixture.py", line 123, in <module>
    rir = rirs[random.choice(all_rirs)]
  File "/home/tp/anaconda3/envs/EEND/lib/python3.7/random.py", line 261, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence
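The traceback shows that `all_rirs` was empty when `random.choice` was called, which suggests that no room impulse response entries had been loaded at that point. A minimal sketch of a guard that fails with a more readable message is given below. The function `choose_rir` is hypothetical and not part of random_mixture.py, and it assumes that `rirs` is a mapping keyed by the items of `all_rirs`, which the traceback line implies but does not confirm.

```python
import random


def choose_rir(rirs, all_rirs, rng=random):
    """Pick one entry from rirs, keyed by a random element of all_rirs.

    Hypothetical helper: raises a descriptive error instead of the bare
    IndexError that random.choice gives for an empty sequence.
    """
    if not all_rirs:
        raise RuntimeError(
            "No RIR entries were loaded; check that the RIR list file "
            "exists and is not empty."
        )
    return rirs[rng.choice(all_rirs)]
```

A check like this does not fix the underlying problem; it only makes it easier to see that the input list of RIRs needs to be inspected first.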

Could you tell me how I should set the nj value and how I can avoid the error in the log? Looking forward to your answer. Thanks!

axuan731 commented 2 years ago

I have the same problem. Do you know how to solve it? Thanks!