Describe the bug
Unable to train a model using wenet/examples/swbd/s0/run.sh; data preparation fails before training starts.
To Reproduce
Steps to reproduce the behavior:
1. Go to wenet/examples/swbd/s0/run.sh
2. Modify swbd1_dir and eval2000_dir
3. Set the following in run.sh:
export CUDA_VISIBLE_DEVICES="0"
stage=-1 # start from 0 if you need to start from data preparation
4. Run run.sh
Expected behavior
The script downloads the data correctly and starts training.
Log and Errors
*** Downloading trascriptions and dictionary ***
--2024-06-17 13:42:46-- http://www.openslr.org/resources/5/switchboard_word_alignments.tar.gz
Resolving www.openslr.org (www.openslr.org)... 46.101.158.64
Connecting to www.openslr.org (www.openslr.org)|46.101.158.64|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://openslr.elda.org/resources/5/switchboard_word_alignments.tar.gz [following]
--2024-06-17 13:42:46-- http://openslr.elda.org/resources/5/switchboard_word_alignments.tar.gz
Resolving openslr.elda.org (openslr.elda.org)... 141.94.109.138, 2001:41d0:203:ad8a::
Connecting to openslr.elda.org (openslr.elda.org)|141.94.109.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 49651293 (47M) [application/x-gzip]
Saving to: ‘switchboard_word_alignments.tar.gz’
switchboard_word_alignments.tar.gz 100%[======================================================================================================>] 47.35M 12.6MB/s in 5.1s
2024-06-17 13:42:52 (9.28 MB/s) - ‘switchboard_word_alignments.tar.gz’ saved [49651293/49651293]
File data/local/dict_nosp/lexicon0.txt is read-only; trying to patch anyway
patching file data/local/dict_nosp/lexicon0.txt
Prepared input dictionary and phone-sets for Switchboard phase 1.
Warning: expected 2435 or 2438 data data files, found 0
Switchboard-1 data preparation succeeded.
local/swbd1_data_prep.sh: line 144: utils/fix_data_dir.sh: No such file or directory
Expecting directory <my-path>/swbd/LDC2002S09/hub5e_00/english to be present
tools/subset_data_dir.sh: reducing #utt from 264333 to 4000
tools/subset_data_dir.sh: reducing #utt from 264333 to 260333
Reduced number of utterances from 260333 to 192827
cp: cannot stat 'data/eval2000/text': No such file or directory
run.sh: line 82: data/eval2000/text.org2: No such file or directory
cut: data/eval2000/text.org: No such file or directory
awk: fatal: cannot open file `data/eval2000/text.org' for reading (No such file or directory)
run.sh: line 83: data/eval2000/text: No such file or directory
tools/fix_data_dir.sh: no such file data/eval2000/utt2spk
Additional context and questions
I changed utils/fix_data_dir.sh to tools/fix_data_dir.sh in local/swbd1_data_prep.sh, which got rid of the "local/swbd1_data_prep.sh: line 144: utils/fix_data_dir.sh: No such file or directory" error.
However, find -L $SWBD_DIR -iname '*.sph' returns nothing, which matches the "expected 2435 or 2438 data data files, found 0" warning in the log. Is there a prerequisite step missing in the script?
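For anyone trying to reproduce this, here is a minimal sketch of a check that could be run before run.sh to confirm that the corpus directories actually contain audio. It is not part of the recipe: the function names `count_sph` and `check_corpus_dir` and the messages are hypothetical, and it only assumes that the data is stored as .sph files, as the find command above implies.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the wenet recipe.

# Count .sph files under a directory, following symlinks and ignoring case,
# in the same way as the find command quoted above.
count_sph() {
  local dir="$1"
  find -L "$dir" -iname '*.sph' 2>/dev/null | wc -l | tr -d ' '
}

# Report whether a corpus directory exists and contains any .sph files.
# Returns non-zero if the directory is missing or holds no audio.
check_corpus_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir"
    return 1
  fi
  local n
  n=$(count_sph "$dir")
  if [ "$n" -eq 0 ]; then
    echo "no .sph files under: $dir"
    return 1
  fi
  echo "found $n .sph files under: $dir"
}
```

Calling `check_corpus_dir` on the paths assigned to swbd1_dir and eval2000_dir should show whether the "found 0" warning comes from a wrong path or from data that is simply not present on disk.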