Open weiwenying opened 3 years ago
I've solved the problem. It happens when the dataset is empty. When you run:
./scripts/common_voice_convert.sh <data_dir> <# of threads>
you expect each MP3 file to be converted to a WAV file named like this:
# convert before
290a72db5e6654c2fcfcf3ff37c455264d4d598dadb1a5bfeb7c268f075894fff7cf31dafec97af3720ff178f0.mp3
# convert after
290a72db5e6654c2fcfcf3ff37c455264d4d598dadb1a5bfeb7c268f075894fff7cf31dafec97af3720ff1.wav
but ./scripts/common_voice_convert.sh actually produces:
# convert before
290a72db5e6654c2fcfcf3ff37c455264d4d598dadb1a5bfeb7c268f075894fff7cf31dafec97af3720ff178f0.mp3
# convert after
290a72db5e6654c2fcfcf3ff37c455264d4d598dadb1a5bfeb7c268f075894fff7cf31dafec97af3720ff178f0.wav
This is not what you want. So, after converting, rename the WAV files with a Python script:
import pathlib

# Path to your dataset's clips directory
src_dir = "/home/weiwenying/projects/Celex/rnnt-speech-recognition/user_opt/zh-TW/clips"

# Take a snapshot of the listing first so renamed files are not visited again
for path in list(pathlib.Path(src_dir).glob("*.wav")):
    new_stem = path.stem[:-4]  # drop the last 4 characters of the stem
    new_name = new_stem + ".wav"
    new_path = path.with_name(new_name)
    path.rename(new_path)
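If you would rather preview the renames before touching the files, here is a minimal sketch of the same idea with a dry-run mode. The function name strip_stem_suffix and its parameters are hypothetical, not part of the repository, and it assumes the only change needed is dropping the last n_chars characters of each stem.

```python
import pathlib


def strip_stem_suffix(src_dir, n_chars=4, dry_run=True):
    """Plan (and optionally apply) renames that drop the last n_chars of each .wav stem.

    Hypothetical helper for illustration; returns a list of (old_path, new_path) pairs.
    """
    # Snapshot and sort the listing so the result is deterministic and
    # renamed files are never picked up a second time.
    wav_paths = sorted(pathlib.Path(src_dir).glob("*.wav"))
    planned = []
    taken = set()
    for path in wav_paths:
        if len(path.stem) <= n_chars:
            continue  # stem too short: nothing sensible would be left
        target = path.with_name(path.stem[:-n_chars] + ".wav")
        if target.exists() or target in taken:
            continue  # never overwrite an existing or already planned clip
        taken.add(target)
        planned.append((path, target))
    if not dry_run:
        for path, target in planned:
            path.rename(target)
    return planned
```

Calling strip_stem_suffix(src_dir) first and reading the returned pairs lets you check the new names before anything on disk changes; calling it again with dry_run=False applies them.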
Now, train.tfrecord is not empty:
total 328M
drwxrwxr-x 2 weiwenying weiwenying 4.0K 2月 4 10:33 .
drwxrwxr-x 5 weiwenying weiwenying 4.0K 2月 4 10:32 ..
-rw-rw-r-- 1 weiwenying weiwenying 105M 2月 4 11:47 dev.tfrecord
-rw-rw-r-- 1 weiwenying weiwenying 33K 2月 4 10:33 encoder.subwords
-rw-rw-r-- 1 weiwenying weiwenying 118M 2月 4 11:47 test.tfrecord
-rw-rw-r-- 1 weiwenying weiwenying 106M 2月 4 11:47 train.tfrecord
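The same check can be done from Python instead of reading the listing by eye. This is only a rough sketch: find_empty_files is a hypothetical helper, the file names are taken from the listing above, and a non-zero size does not prove the records inside are valid.

```python
import os


def find_empty_files(data_dir, names=("train.tfrecord", "dev.tfrecord", "test.tfrecord")):
    """Return the expected file names that are missing or zero bytes long."""
    problems = []
    for name in names:
        full_path = os.path.join(data_dir, name)
        # A missing file and an empty file are both treated as a problem.
        if not os.path.isfile(full_path) or os.path.getsize(full_path) == 0:
            problems.append(name)
    return problems
```

An empty list means all three files exist and contain some data, so it is reasonable to go on to training.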
Then run:
python run_rnnt.py \
    --mode train \
    --data_dir <path to data directory>
Normal training output looks like this:
Epoch: 4, Batch: 34, Global Step: 190, Step Time: 1.0850, Loss: 8.8683
Epoch: 4, Batch: 35, Global Step: 191, Step Time: 1.0003, Loss: 8.8519
Epoch: 4, Batch: 36, Global Step: 192, Step Time: 0.9517, Loss: 8.8606
Epoch: 4, Batch: 37, Global Step: 193, Step Time: 0.7079, Loss: 8.8660
Epoch: 4, Batch: 38, Global Step: 194, Step Time: 0.8511, Loss: 8.8473
EPOCH RESULTS: Loss: 8.8473
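To spot the failing case automatically in a saved training log, something like the following could work. It is a sketch that assumes the log lines have exactly the EPOCH RESULTS: Loss: <number> shape shown in this thread; zero_loss_epochs is a hypothetical name.

```python
import re

# Matches the summary line printed at the end of each epoch.
EPOCH_RESULT = re.compile(r"EPOCH RESULTS: Loss: ([0-9]+(?:\.[0-9]+)?)")


def zero_loss_epochs(log_lines):
    """Return the EPOCH RESULTS lines whose reported loss is exactly zero."""
    flagged = []
    for line in log_lines:
        match = EPOCH_RESULT.search(line)
        if match and float(match.group(1)) == 0.0:
            flagged.append(line.strip())
    return flagged
```

Any line it returns is a hint that the training set may be empty, which is the situation described at the top of this comment.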
That's all!
Environment: Ubuntu 18.04 LTS (conda). Training prints:
EPOCH RESULTS: Loss: 0.0000
It doesn't seem to be working?