Open agarwalchaitanya opened 3 years ago
@agarwalchaitanya Could you try commenting out local/ami_prepare_dict.sh (line 120) in run.sh?
That removes the error, but the recipe now fails somewhere within stage 0:
============================================================================
AMI
============================================================================
============================================================================
Data Preparation (stage:0)
============================================================================
+ dir=/home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
+ mkdir -p /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
+ echo 'Downloading annotations...'
Downloading annotations...
+ amiurl=http://groups.inf.ed.ac.uk/ami
+ annotver=ami_public_manual_1.6.1
+ annot=/home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/ami_public_manual_1.6.1
+ logdir=/home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
+ mkdir -p /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/log
+ '[' '!' -f /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/ami_public_manual_1.6.1.zip ']'
+ '[' '!' -d /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/annotations ']'
+ '[' '!' -f /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads/annotations/AMI-metadata.xml ']'
+ local/ami_xml2text.sh /home/asr/neural_sp_assets/preprocessed_data/ami/local/downloads
local/ami_xml2text.sh: line 19: [: openjdk version "11.0.10" 2021-01-19: integer expression expected
local/ami_xml2text.sh. Java not found. Will download exported version of transcripts.
--2021-02-11 17:16:05-- http://groups.inf.ed.ac.uk/ami/AMICorpusAnnotations/ami_manual_annotations_v1.6.1_export.gzip
Resolving groups.inf.ed.ac.uk (groups.inf.ed.ac.uk)... 129.215.202.26
Connecting to groups.inf.ed.ac.uk (groups.inf.ed.ac.uk)|129.215.202.26|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3725858 (3.6M) [application/x-troff-man]
Saving to: '/home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/ami_manual_annotations_v1.6.1_export.gzip'
/home/asr/neural_sp_assets/pr 100%[=================================================>] 3.55M 2.49MB/s in 1.4s
2021-02-11 17:16:07 (2.49 MB/s) - '/home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/ami_manual_annotations_v1.6.1_export.gzip' saved [3725858/3725858]
+ wdir=/home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations
+ '[' '!' -f /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts1 ']'
+ echo 'Preprocessing transcripts...'
Preprocessing transcripts...
+ local/ami_split_segments.pl /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts1 /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
+ for dset in train eval dev
+ grep -f local/split_train.orig /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
+ for dset in train eval dev
+ grep -f local/split_eval.orig /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
+ for dset in train eval dev
+ grep -f local/split_dev.orig /home/asr/neural_sp_assets/preprocessed_data/ami/local/annotations/transcripts2
sdm
In total, 0 files were found.
Warning: expected 169 data data files, found 0
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
AMI sdm1 data preparation succeeded.
In total, 0 files were found.
local/ami_sdm_scoring_data_prep.sh. Applying following fixes to segments
s/^AMI_IB4004_SDM_MIO039_0036179_0036400 AMI_IB4004_SDM 361.79 364$/AMI_IB4004_SDM_MIO039_0036179_0036400 AMI_IB4004_SDM 362.28 364/;
convert2stm: Recording-id AMI_ES2011a_SDM not defined in reco2file_and_channel file /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/dev_orig/reco2file_and_channel at local/convert2stm.pl line 70.
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
AMI sdm1 scenario and dev set data preparation succeeded.
In total, 0 files were found.
convert2stm: Recording-id AMI_EN2002a_SDM not defined in reco2file_and_channel file /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/eval_orig/reco2file_and_channel at local/convert2stm.pl line 70.
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
AMI sdm1 scenario and eval set data preparation succeeded.
utils/data/get_utt2dur.sh: segments file does not exist so getting durations from wave files
utils/data/get_utt2dur.sh: successfully obtained utterance lengths from sphere-file headers
utils/data/get_utt2dur.sh: computed /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/train_orig/utt2dur
utils/data/modify_speaker_info.sh: copied data from /home/asr/neural_sp_assets/preprocessed_data/ami/sdm1/train_orig to /home/asr/neural_sp_assets/preprocessed_data/ami/train_sdm1, number of speakers changed from 0 to 0
Usage: utils/validate_data_dir.sh [--no-feats] [--no-text] [--non-print] [--no-wav] [--no-spk-sort] <data-dir>
The --no-xxx options mean that the script does not require
xxx.scp to be present, but it will check it if it is present.
--no-spk-sort means that the script does not require the utt2spk to be
sorted by the speaker-id in addition to being sorted by utterance-id.
--non-print ignore the presence of non-printable characters.
By default, utt2spk is expected to be sorted by both, which can be
achieved by making the speaker-id prefixes of the utterance-ids
e.g.: utils/validate_data_dir.sh data/train
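One detail in the trace above: local/ami_xml2text.sh reports "integer expression expected" for the string `openjdk version "11.0.10" 2021-01-19` and then falls back to "Java not found". That suggests the full version line reached a numeric test unparsed. Below is a minimal sketch of a more tolerant check, not the recipe's actual code. The function name `java_major_from_line` is hypothetical, and it assumes the first line of `java -version` contains `version "<something>"`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the recipe): print the Java major version
# given the first line of `java -version` output.
java_major_from_line() {
  local line="$1" ver major
  # Extract the quoted version string, e.g. 11.0.10 or 1.8.0_281.
  ver=$(printf '%s\n' "$line" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  [ -n "$ver" ] || return 1
  major=${ver%%.*}
  # Legacy numbering: 1.8.x means Java 8.
  if [ "$major" = "1" ]; then
    major=${ver#1.}
    major=${major%%.*}
  fi
  # Drop any non-numeric suffix such as "-ea".
  major=${major%%[!0-9]*}
  [ -n "$major" ] || return 1
  printf '%s\n' "$major"
}

# Usage: only runs the check if a java binary is on PATH.
if command -v java >/dev/null 2>&1; then
  first_line=$(java -version 2>&1 | head -n 1)
  if major=$(java_major_from_line "$first_line"); then
    echo "Detected Java major version: $major"
  else
    echo "Could not parse Java version from: $first_line" >&2
  fi
fi
```

The result is a plain integer, so it can be used safely in a test such as `[ "$major" -ge 8 ]`. Which minimum version the recipe needs is not stated in this thread.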
Hi, I'm trying to run the ami recipe, but it's failing with the following trace. Are there any leads on this?
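One possible lead: the trace reports "In total, 0 files were found" and "expected 169 data data files, found 0", which suggests the search for audio under the corpus directory came up empty. A quick way to check this before rerunning stage 0 is sketched below. The helper name `count_audio_files` and the `AMI_AUDIO_DIR` variable are hypothetical, and the `*.Array1-01.wav` pattern is an assumption about how the sdm1 audio files are named in your download.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the recipe): count files under a directory
# whose names match a glob, following symlinks.
count_audio_files() {
  local dir="$1" pattern="${2:-*.wav}"
  if [ ! -d "$dir" ]; then
    echo 0
    return 1
  fi
  # -L follows symlinked corpus directories; tr strips padding some wc builds add.
  find -L "$dir" -type f -iname "$pattern" | wc -l | tr -d ' '
}

# Usage: AMI_AUDIO_DIR is a placeholder; point it at your local copy of the audio.
if [ -n "${AMI_AUDIO_DIR:-}" ]; then
  n=$(count_audio_files "$AMI_AUDIO_DIR" '*.Array1-01.wav')
  echo "Found $n sdm1 audio files under $AMI_AUDIO_DIR"
fi
```

If the count is 0, the audio path passed to the recipe is likely wrong or the audio has not been downloaded, which would also explain the empty reco2file_and_channel files reported by convert2stm.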