Open zx1292982431 opened 11 months ago
Hello, could you provide the command that you executed and the full terminal output?
Have you changed /path/to/wsj to the path of WSJ on your system?
Sorry for the late reply!
After setting the Kaldi path, I ran make WSJ_DIR=/data/lzx/wsj0 SMS_WSJ_DIR=/data/lzx/Datasets/SMS_WSJ to generate the sms_wsj dataset, but I got an error:
creating /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean.json
python -m sms_wsj.database.wsj.create_json \
with json_path=/data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean.json database_dir=/data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean as_wav=True
WARNING - Create wsj json - No observers have been added to this run
INFO - Create wsj json - Running command 'create_database'
INFO - Create wsj json - Started
ERROR - Create wsj json - Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
File "/data/lzx/SpatialNet/sms_wsj/sms_wsj/database/wsj/create_json.py", line 293, in create_database
transcriptions = get_transcriptions(database_dir, database_dir)
File "/data/lzx/SpatialNet/sms_wsj/sms_wsj/database/wsj/create_json.py", line 170, in get_transcriptions
data_dict["clean word"] = normalize_transcription(word, wsj_root)
File "/data/lzx/SpatialNet/sms_wsj/sms_wsj/database/wsj/create_json.py", line 186, in normalize_transcription
assert len(transcriptions) > 0, 'No transcriptions to clean up.'
AssertionError: No transcriptions to clean up.
make: *** [Makefile:32: /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean.json] Error 1
May I ask whether you have encountered a similar problem and how to fix it?
This error means that the code was not able to find the transcriptions.
I guess the code was not able to find the *.dot and *.ptx files in /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean.
Could you execute the following commands and report the output:
find /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean -iname "*.dot" | wc -l
find /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean -iname "*.ptx" | wc -l
find /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean -iname "*.wav" | wc -l
For comparison, I get the following output on my system:
/net/db/sms_wsj/wsj_8k_zeromean$ find . -iname "*.dot" | wc -l
3585
/net/db/sms_wsj/wsj_8k_zeromean$ find . -iname "*.ptx" | wc -l
3547
/net/db/sms_wsj/wsj_8k_zeromean$ find . -iname "*.wav" | wc -l
129106
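If it is easier, the same three counts can be collected in one pass with a short Python script. This is only a minimal sketch that roughly mirrors the find commands above; the function name count_by_suffix is made up for illustration and is not part of sms_wsj.

```python
from collections import Counter
from pathlib import Path


def count_by_suffix(root, suffixes=(".dot", ".ptx", ".wav")):
    """Count files below `root` per suffix, ignoring case (similar to find -iname)."""
    wanted = {s.lower() for s in suffixes}
    # Start every suffix at zero so that missing file types are reported explicitly.
    counts = Counter({s: 0 for s in wanted})
    for path in Path(root).rglob("*"):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in wanted:
            counts[suffix] += 1
    return dict(counts)


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python count_files.py /data/lzx/Datasets/SMS_WSJ/wsj_8k_zeromean
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for suffix, n in sorted(count_by_suffix(root).items()):
        print(f"{suffix}: {n}")
```

A count of zero for .dot and .ptx would match the "No transcriptions to clean up" assertion. Running the same script on the original WSJ folders may also show whether the transcription files are present in the source data at all.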
Maybe something went wrong when creating the wsj_8k_zeromean folder.
I guess the /data/lzx/wsj0 folder contains only the WSJ0 files. If that is correct, you have to delete all generated files and change the call to specify both the WSJ0 and the WSJ1 folder, e.g. make WSJ0_DIR=/data/lzx/wsj0 WSJ1_DIR=/data/lzx/wsj1 SMS_WSJ_DIR=/data/lzx/Datasets/SMS_WSJ (I assumed the WSJ1 files are in /data/lzx/wsj1).
Hello! When I execute
$ make WSJ_DIR=/path/to/wsj SMS_WSJ_DIR=/path/to/write/db/to
it returns the error "AssertionError: No transcriptions to clean up." from sms_wsj/database/wsj/create_json.py. How can I fix it?