mim opened this issue 7 years ago
Can I have write permission in your directory? ...
Why don't you copy /scratch/mim/jsalt/code/asr/kaldi-jsalt to your own /scratch directory, since you might need to recompile it, etc.? It's 14GB, but we have plenty of storage space.
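A sketch of that copy, preserving permissions and timestamps. The destination area /scratch/near is an assumption based on paths that appear later in this thread:

```shell
# Copy the recognizer tree into your own scratch area.
# -a preserves permissions, timestamps, and symlinks.
mkdir -p /scratch/near
cp -a /scratch/mim/jsalt/code/asr/kaldi-jsalt /scratch/near/
```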
Got it.
I still need your permission to copy everything, though. Is there an option to change the permissions so I can copy it?
It looks like you have read permissions on all files and directories and execute permissions on all directories, which is what you should need. Which directory/file are you getting an error on?
I got an error like this just now:
cp: cannot access '/scratch/mim/jsalt/code/asr/kaldi-jsalt/tools/openfst-1.3.4': Permission denied
Ok, I just modified the permissions on everything in /scratch/mim/jsalt/code/asr/kaldi-jsalt, so try it again now.
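The permission change probably amounts to something like this (a guess at what was run; the capital X adds execute only on directories, so regular files stay non-executable):

```shell
# Grant everyone read access, plus traverse (execute) on directories only.
chmod -R a+rX /scratch/mim/jsalt/code/asr/kaldi-jsalt
```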
Nice, everything copied.
I got an error like this:
loadtxt_ram()
1-grams: reading 4989 entries
done level 1
2-grams: reading 1639687 entries
done level 2
3-grams: reading 2684151 entries
done level 3
done
starting to use OOV words [<unk>]
OOV code is 4989
OOV code is 4989
OOV code is 4989
pruning LM with thresholds:
1e-07 1e-07
ng: <s> 0 nextlevel_ts=1.99968 nextlevel_tbs=0.931817 k=1 ns=4206
savetxt: /scratch/near/kaldi-jsalt/egs/chime3/s5/data/local/nist_lm/lm_tgpr_5k.arpa
save: 4989 1-grams
save: 473820 2-grams
save: 656493 3-grams
done
Data preparation succeeded
Checked out revision 13265.
Dictionary preparation succeeded
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> data/local/dict/silence_phones.txt is OK
Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> data/local/dict/optional_silence.txt is OK
Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> data/local/dict/nonsilence_phones.txt is OK
Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.
Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> data/local/dict/lexicon.txt is OK
Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]
**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstcompile: error while loading shared libraries: libfstscript.so.1: cannot open shared object file: No such file or directory
fstarcsort: error while loading shared libraries: libfstscript.so.1: cannot open shared object file: No such file or directory
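A guess at the cause: after copying the tree, the OpenFst binaries still look for shared libraries at paths baked in at build time on the old machine. A hedged sketch of a workaround follows; the exact lib directory (tools/openfst-1.3.4/lib) is an assumption based on the version in the error above, and recompiling tools/ in the new location is the more robust fix:

```shell
# Point the dynamic loader at the copied OpenFst libraries.
# KALDI_ROOT and the OpenFst version are taken from this thread's logs.
export KALDI_ROOT=/scratch/near/kaldi-jsalt
export LD_LIBRARY_PATH="$KALDI_ROOT/tools/openfst-1.3.4/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# More robust alternative: rebuild the tools in the new location.
# cd "$KALDI_ROOT/tools" && make
```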
+ nj=30
+ enhan=enhanced
+ enhan_data=/scratch/near/CHiME3/v2/replayMessl/average/
+ '[' '!' -d data/lang ']'
+ local/real_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
local/real_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
cat: tr05_real.dot: No such file or directory
cat: tr05_real.dot: No such file or directory
cat: dt05_real.dot: No such file or directory
cat: dt05_real.dot: No such file or directory
cat: et05_real.dot: No such file or directory
cat: et05_real.dot: No such file or directory
Data preparation succeeded
+ local/simu_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
local/simu_enhan_chime3_data_prep.sh enhanced /scratch/near/CHiME3/v2/replayMessl/average/
cat: dt05_simu.dot: No such file or directory
cat: dt05_simu.dot: No such file or directory
cat: et05_simu.dot: No such file or directory
cat: et05_simu.dot: No such file or directory
Data preparation succeeded
+ mfccdir=mfcc/enhanced
+ for x in 'dt05_real_$enhan' 'et05_real_$enhan' 'tr05_real_$enhan' 'dt05_simu_$enhan' 'et05_simu_$enhan' 'tr05_simu_$enhan'
+ steps/make_mfcc.sh --nj 10 --cmd 'queue.pl -l arch=*64* -q all.q' data/dt05_real_enhanced exp/make_mfcc/dt05_real_enhanced mfcc/enhanced
steps/make_mfcc.sh --nj 10 --cmd queue.pl -l arch=*64* -q all.q data/dt05_real_enhanced exp/make_mfcc/dt05_real_enhanced mfcc/enhanced
utils/validate_data_dir.sh: Error: in data/dt05_real_enhanced, utterance lists extracted from utt2spk and text
utils/validate_data_dir.sh: differ, partial diff is:
1,1640d0
< F01_050C0101_PED_REAL
< F01_050C0102_CAF_REAL
< F01_050C0102_STR_REAL
< F01_050C0103_BUS_REAL
< F01_050C0103_STR_REAL
...
< M04_423C0211_PED_REAL
< M04_423C0212_BUS_REAL
< M04_423C0213_CAF_REAL
< M04_423C0214_STR_REAL
< M04_423C0215_STR_REAL
< M04_423C0216_BUS_REAL
[Lengths are kaldi.oYmm/utts=1640 versus kaldi.oYmm/utts.txt=0]
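For what it's worth, the zero-length text side (utts.txt=0) lines up with the missing .dot transcript files earlier in the log, so the transcripts probably never got generated. To inspect which utterance IDs are in utt2spk but absent from text, a sketch (data dir name from the log above; requires bash for process substitution):

```shell
# List utterance IDs present in utt2spk but missing from text.
# Both files are "utt-id ..." one per line in a Kaldi data dir.
datadir=data/dt05_real_enhanced
comm -23 <(cut -d' ' -f1 "$datadir/utt2spk" | sort) \
         <(cut -d' ' -f1 "$datadir/text"    | sort) | head
```

Kaldi also ships utils/fix_data_dir.sh, which filters a data dir down to the utterances present in all of its files; with an empty text, though, that would remove everything, so the transcripts need to exist first.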
+ nj=30
+ enhan=enhanced
+ enhan_data=/scratch/near/CHiME3/v2/replayMessl/average/
+ '[' '!' -d data/lang ']'
+ '[' '!' -d exp/tri3b_tr05_multi_enhanced ']'
+ echo 'error, execute local/run_gmm.sh, first'
error, execute local/run_gmm.sh, first
+ exit 1
I would execute local/run_gmm.sh first, like the error message says. Unless you called it incorrectly. What was the command you used to call it?
./run.sh --do-ami false --do-reverb false --stage 3 --enhan-chime3 $SHORTNAME
$SHORTNAME=lstmc2Avg
And you did the linking and everything to create the appropriate directory in data/chime3?
Yes. I did change one line in run.sh: chime3_data=/export/ws15-ffs-data/corpora/chime3/CHiME3 became chime3_data=/home/data/CHiME3. Is this correct?
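For reference, that edit can be scripted; a hedged one-liner (paths copied from this comment, GNU sed assumed):

```shell
# Replace the chime3_data assignment in run.sh in place.
sed -i 's|^chime3_data=.*|chime3_data=/home/data/CHiME3|' run.sh
```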
That should be correct, unless there's an extra subdirectory in our chime3 directory that this should point to.
Now I'm close to the correct way...
gmm dir not found: /export/ws15-ffs-data/swatanabe/tools/kaldi-trunk/egs/ami/s5/exp/mdm8/tri4a
Where can I find this model?
Good question. I've copied it to /scratch/mim/kaldi/ami/exp/mdm8/tri4a/
Can we submit jobs to the server now? If not, I need to modify the code.
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
queue.pl: error submitting jobs to queue (return status was 256)
queue log file is data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* -o data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.log -l arch=*64* -t 1:4 /scratch/near/kaldi-jsalt/egs/jsalt15-ffs/s5/data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.sh >>data/chime3/lstmc2Avg/et05_real_lstmc2Avg/q/make_mfcc_et05_real_lstmc2Avg.log 2>&1
queue.pl: error submitting jobs to queue (return status was 256)
queue log file is data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* -o data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.log -l arch=*64* -t 1:4 /scratch/near/kaldi-jsalt/egs/jsalt15-ffs/s5/data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.sh >>data/chime3/lstmc2Avg/dt05_real_lstmc2Avg/q/make_mfcc_dt05_real_lstmc2Avg.log 2>&1
Unable to run job: warning: near your job is not allowed to run in any queue
Your job-array 6.1-4:1 ("make_mfcc_dt05_real_lstmc2Avg.sh") has been submitted. Exiting.
Unable to run job: warning: near your job is not allowed to run in any queue
Your job-array 5.1-4:1 ("make_mfcc_et05_real_lstmc2Avg.sh") has been submitted. Exiting.
I found what's wrong with it. When I tried to submit a sample script to SGE, I got this error:
Unable to run job: warning: near your job is not allowed to run in any queue Your job 16 ("Sleeper1") has been submitted. Exiting.
I think I need to be added to some group before I can submit jobs?
@arsyed is there a group to add Zhaoheng to for sge?
Thanks for spotting that. There's an "arusers" group attached to the "mainqueue". I've added user "near" to this group (using the qmon tool). That's the only difference I noticed with my username, so hopefully this works. Can you try submitting a job again?
If this works, we can document this on the wiki.
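For the wiki: the same access-list change can likely be made from the command line instead of qmon. A hedged sketch (list name "arusers", user "near", and queue "mainqueue" are from this thread; these commands need SGE admin rights):

```shell
qconf -su arusers                       # show current members of the "arusers" access list
qconf -au near arusers                  # add user "near" to the list
qconf -sq mainqueue | grep user_lists   # confirm the queue references that list
```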
I succeeded in submitting the job; it works. Thanks a lot.
I also need the dnn directory:
steps/nnet/decode.sh: missing file /export/ws15-ffs-data/swatanabe/tools/kaldi-trunk/egs/ami/s5/exp/mdm8/dnn4_pretrain-dbn_dnn/final.nnet
I think I found it. Is it in /scratch/mim/kaldi/jsaltRecognizer/exp/mdm8/dnn4_pretrain-dbn_dnn?
I've copied it over.
Got it to work.
The spreadsheet is here: Spreadsheet
There is no word error rate for the simulated data set, right?
I think I only ran MESSL on the real test and dev files originally, not the simulated ones. But you ran it on the simulated files, right? If so, we can measure WER on them too. Just keep them separate.
The resulting WER is higher than the result where you used MVDR. But I think that setup uses channel 0, while we don't in our experiments. So our model is still comparable to the previous ones, right?
The results in my paper didn't use channel 0, just like here. So it's doing similarly to before, i.e., hasn't improved on it.
Why did I get such a high WER on the MESSL files? It's 72.59% on the dev set. I just ran replayMessl on them. What did your previous experiment get?
My files from the best-performing system in those experiments (19.7% WER dev and 32.6% WER test, on the real data only for both) are in /scratch/mim/kaldi/ami/exp/ihm/tri3_ali/replayXcMaskSoudenMaxSup09db. You can try running the recognizer on them directly. You can also listen to yours and to those and see if they are the same. And see if any files are missing from yours (or mine).
There is no directory called replayXcMaskSoudenMaxSup09db. I assume that name came from a shell variable, right?
Oops, it's here: /scratch/mim/jsalt/data/chime3/out_beamformit/replayXcMaskSoudenMaxSup09db/
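To check for missing files between your output directory and that one, a hedged sketch (paths taken from this thread; requires bash for process substitution):

```shell
# Show filenames present in one directory but not the other.
# Lines starting with "<" are only in the first dir, ">" only in the second.
diff <(ls /scratch/near/CHiME3/v2/replayMessl/average/ | sort) \
     <(ls /scratch/mim/jsalt/data/chime3/out_beamformit/replayXcMaskSoudenMaxSup09db/ | sort)
```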
Done by Zhaoheng
Edit run.sh and local/publish_results.sh to calculate WER score for the CHiME-3 simulated dataset.
New link here: https://docs.google.com/spreadsheets/d/1bnr5zlsEeTLsMY1l3QTGfzCtFMlQezD5QviJ3yNXJc4/edit#gid=132513526
Need to figure out what else can be added to the form, so I can find a way to upload everything at once.
Perhaps open a new issue to do this.
Thanks. So why did you reopen this? What are the new files you're evaluating? Can you create a new issue specifying what information you want to add to the form/spreadsheet?
I just added WER for the simulated set. Last time we added the other columns manually; if we can add them when running the code, it'll be easier.
Ah, ok, thanks. I think the code from JSALT should automatically upload both the real and synthetic test results, but maybe not. If not, you can make a new google form / spreadsheet that can accept it.
Use the Kaldi recognizer that is in
/scratch/mim/jsalt/code/asr/kaldi-jsalt/egs/jsalt15-ffs/s5
to recognize the enhanced utterances. It was copied from another system, so there might be errors caused by paths being set incorrectly. You should also modify
local/publish_results.sh
to use the Google spreadsheet we set up. Here is an example of how to run it, but using the paths from the other system:
This might work on crescent: