stephenvxx closed this issue 6 years ago
@AdolfVonKleist
What dataset are you training on?
I use my own dataset (formatted like AN4). First, I build the LMDB with a sample rate of 8000 Hz. Then I modify DeepSpeechModel.lua, setting rnnInputSize = 1 * 81. I still need to change rnnInputSize!
```lua
conv:add(nn.SpatialConvolution(1, 32, 11, 41, 2, 2))
conv:add(nn.SpatialBatchNormalization(32))
conv:add(nn.Clamp(0, 20))
conv:add(nn.SpatialConvolution(32, 32, 11, 21, 2, 1))
conv:add(nn.SpatialBatchNormalization(32))
conv:add(nn.Clamp(0, 20))
```
I don't understand this code: why is nInputPlane = 1, etc.? I don't know how to modify the SpatialConvolution layers and rnnInputSize.
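A quick way to sanity-check rnnInputSize is to push the spectrogram's frequency dimension through the two convolutions by hand. This is only a sketch: it assumes the 8 kHz spectrogram has 81 frequency bins (consistent with the 20x1x81x13 shape reported later in this thread), that nn.SpatialConvolution uses no padding (the default), and that the kernel/stride arguments follow Torch's (kW, kH, dW, dH) order, so the frequency axis sees kernel 41 stride 2, then kernel 21 stride 1:

```python
def conv_out(size, kernel, stride):
    # Torch nn.SpatialConvolution output size with zero padding:
    # floor((size - kernel) / stride) + 1
    return (size - kernel) // stride + 1

# Frequency axis of the 8 kHz spectrogram: 81 bins (assumption, taken
# from the 20x1x81x13 input shape mentioned in this thread).
freq = conv_out(81, 41, 2)    # first conv: kH=41, dH=2 -> 21
freq = conv_out(freq, 21, 1)  # second conv: kH=21, dH=1 -> 1
rnn_input_size = 32 * freq    # 32 output planes from the last conv
print(rnn_input_size)         # -> 32
```

Under these assumptions rnnInputSize would be 32 * 1 = 32 for 8 kHz input, not 1 * 81 (nInputPlane = 1 is just the single spectrogram channel going into the first conv; the RNN input size depends on the conv *output* planes times the surviving frequency bins).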
What is the size of the shortest clip? Could you make sure it's above 0.5 seconds long? Otherwise it might not be large enough to go through the convolutional layers.
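The minimum clip length can be estimated the same way, by pushing the time axis through both convolutions. Again a sketch under assumptions: no padding, and the time axis sees kernel width 11 with stride 2 in both conv layers (per the snippet above):

```python
def conv_out(size, kernel, stride):
    # Torch nn.SpatialConvolution output size with zero padding
    return (size - kernel) // stride + 1

# Find the smallest number of spectrogram time frames that still
# produces at least one frame after both convs (kW=11, dW=2 twice).
frames = 1
while conv_out(conv_out(frames, 11, 2), 11, 2) < 1:
    frames += 1
print(frames)  # -> 31
```

So at least 31 spectrogram frames are needed; with a typical 10 ms hop (an assumption, not confirmed in this thread) that is roughly 0.3 s of audio, which is consistent with the "keep clips above 0.5 seconds" advice.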
@SeanNaren I cut all my wav files to be above 1.0 second, but nothing changed. Is this a bug in CUDA R5?
Any chance this could help figure it out? https://github.com/SeanNaren/deepspeech.torch/issues/62#issuecomment-255733363
I cut my wav files to be above 1.0 second (5,000 files). The error message is the one in #62, and I fixed that. But when I use all of my audio files (1 million files), it still fails.
I got an error:
In DeepSpeechModel.lua, I changed rnnInputSize = 1 * 81 (because the output size is 20x1x81x13). Please help me! @SeanNaren