domerin0 / rnn-speech

Character level speech recognizer using ctc loss with deep rnns in TensorFlow.
MIT License

fail to create audio_dataset #43

Closed: jesuistay closed this issue 6 years ago

jesuistay commented 6 years ago

I have the whole LibriSpeech corpus on my computer, and I want to train this model on only that corpus. I've tried placing the corpus in the data/Librispeech/train folder in various configurations, and I get:

  File "stt.py", line 316, in <module>
    main()
  File "stt.py", line 30, in main
    train_rnn(train_set, test_set, hyper_params, prog_params)
  File "stt.py", line 128, in train_rnn
    model, t_iterator, v_iterator = build_training_rnn(sess, hyper_params, prog_params, train_set, test_set)
  File "stt.py", line 48, in build_training_rnn
    hyper_params["max_target_seq_length"], hyper_params["signal_processing"])
  File "/home/tay/Documents/rnn-speech/models/AcousticModel.py", line 916, in build_dataset
    audio_dataset = tf.contrib.data.Dataset.from_tensor_slices(audio_streams)
  File "/home/tay/.local/lib/python3.5/site-packages/tensorflow/contrib/data/python/ops/dataset_ops.py", line 473, in from_tensor_slices
    return TensorSliceDataset(tensors)
  File "/home/tay/.local/lib/python3.5/site-packages/tensorflow/contrib/data/python/ops/dataset_ops.py", line 896, in __init__
    batch_dim = flat_tensors[0].get_shape()[0]
  File "/home/tay/.local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 500, in __getitem__
    return self._dims[key]
IndexError: list index out of range

I've checked, and audio_streams seems to be as long as it should be; perhaps it is just not the right shape?

Any ideas?
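
A note on reading the traceback: the failing expression is flat_tensors[0].get_shape()[0], and the IndexError is raised while indexing the shape's list of dimensions. That is consistent with the first tensor having no dimensions at all (rank 0), although the thread does not confirm this. The sketch below illustrates the same failure in pure Python; nested_shape is a hypothetical helper written for this note, not TensorFlow's shape inference, and it only inspects the first element at each nesting level.

```python
def nested_shape(value):
    """Return the shape of a nested list/tuple, following first elements only.

    Hypothetical helper for illustration; it does not validate that all
    sub-lists have the same length.
    """
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break  # an empty list has a dimension of size 0 and nothing inside
        value = value[0]
    return tuple(shape)


def first_dimension(value):
    """Mimic `shape[0]`: fails with IndexError when the value is a scalar."""
    return nested_shape(value)[0]
```

With this helper, a list of file names has a first dimension equal to its length, while a single string (a scalar) has an empty shape, so asking for its first dimension raises IndexError.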

AMairesse commented 6 years ago

Hello, which version of TensorFlow are you using? You should be on the latest release (1.3). I'm not sure about the error, but I would guess it comes from an audio file with a size of 0. I encountered that case when setting the training in ordered mode. Since then I've added an exclusion of files that are too small, so it shouldn't happen anymore...

Could you try with 'dataset_size_ordering : False' in config.ini, if it isn't already the case?

One point is bothering me: on the master branch, the line 'audio_dataset = tf.contrib.data.Dataset.from_tensor_slices(audio_streams)' should be at line 914, but your error log says 916. Have you made any significant modifications, or only added some debug traces? Just checking that you have a correct revision :-)

Also, could you check in debug mode that 'input_set' actually contains relevant data (not empty)?
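
The zero-size file hypothesis can be checked before any TensorFlow code runs. Here is a minimal sketch, assuming the corpus is available as a list of file paths; split_by_min_size and its min_bytes threshold are hypothetical names for this example and are not the exclusion logic used in the repository.

```python
import os


def split_by_min_size(paths, min_bytes=1):
    """Partition file paths into (kept, rejected) by size in bytes.

    Missing or unreadable files are rejected as well, so the caller can
    print the rejected list and inspect it.
    """
    kept, rejected = [], []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # treat missing/unreadable files as too small
        if size >= min_bytes:
            kept.append(path)
        else:
            rejected.append(path)
    return kept, rejected
```

If the rejected list is non-empty, or the kept list is empty, the input set is worth inspecting before building the dataset.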

jesuistay commented 6 years ago

The modifications before line 914 were only prints. The problem seems to have been the TensorFlow version, although it displayed version 1.3 when I printed tf.__version__ inside the conda env. When I reinstalled conda and TensorFlow from scratch, it worked without hiccups.

Thanks a bunch.
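
For anyone hitting the same symptom: the traceback in the first comment shows TensorFlow being loaded from /home/tay/.local/lib/python3.5/site-packages, which is a per-user install directory. One possible explanation, not confirmed in this thread, is that a copy installed there took precedence over the one in the conda env. Here is a minimal sketch for checking where a package would be imported from, using only the standard library; describe_module_origin is a hypothetical helper name.

```python
import importlib.util
import site
import sys


def describe_module_origin(module_name):
    """Report where `module_name` would be imported from, without importing it."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return {"found": False, "origin": None, "in_user_site": False}
    origin = spec.origin or ""  # namespace packages have no single origin
    user_site = site.getusersitepackages()
    return {
        "found": True,
        "origin": origin,
        "in_user_site": origin.startswith(user_site),
    }


if __name__ == "__main__":
    # Run this with the interpreter of the environment you intend to use.
    print("interpreter:", sys.executable)
    print(describe_module_origin("tensorflow"))
```

If in_user_site is true while a conda env is active, the package is coming from the per-user directory and not from the env.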