SeanNaren / deepspeech.torch

Speech Recognition using DeepSpeech2 network and the CTC activation function.
MIT License
260 stars 73 forks

Possible error in docs #59

Closed ShibbyContinuum closed 7 years ago

ShibbyContinuum commented 7 years ago
audio input: 128 x 500 Tensor -- the audio data (frequency x time)
truth text: 'deep speech is cool' -- The truth text used in evaluations/validation
truth label: {4,5,5,16,27,19,16,5,5,3,8,27,9,19,3,15,12} -- The label used in training

I believe the truth label should be:

{ 4,5,5,16,27,19,16,5,5,3,8,27,9,19,27,3,15,15,12 }

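Notice the second 15 in the second-to-last position, unless 'oo' is only supposed to be represented by a single 15. The corrected label also includes a 27 between 'is' and 'cool', matching the 27 that separates the other words.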
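
As a quick check, here is a minimal Python sketch that rebuilds the label from the truth text. It assumes the mapping a=1 ... z=26 with 27 for space, which is inferred from the labels quoted above and is not confirmed against the repository's actual dictionary. `text_to_label` is a hypothetical helper, not part of deepspeech.torch.

```python
import string

# Assumed mapping, inferred from the labels quoted in this issue:
# a=1 ... z=26, space=27. Not taken from the repository's dictionary file.
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(string.ascii_lowercase, start=1)}
CHAR_TO_INDEX[" "] = 27


def text_to_label(text):
    """Convert a lowercase transcript into a list of integer labels."""
    return [CHAR_TO_INDEX[ch] for ch in text]


# Label as it currently appears in the docs.
documented = [4, 5, 5, 16, 27, 19, 16, 5, 5, 3, 8, 27, 9, 19, 3, 15, 12]

# Label rebuilt from the truth text under the assumed mapping.
rebuilt = text_to_label("deep speech is cool")

print(rebuilt)
print(len(documented), len(rebuilt))
```

Under that assumed mapping the rebuilt label has 19 entries against the 17 in the docs, and it matches the corrected label proposed above. CTC targets normally keep repeated characters such as 'oo' as two labels; repeats are only collapsed when decoding the network's output.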

SeanNaren commented 7 years ago

Thanks :)