Open yh1008 opened 7 years ago
Use feat-to-dim from Kaldi. This will usually be 13 dimensions (MFCCs). We also add delta and acceleration coefficients (add-deltas) in the script, which makes it 13*3=39. Then we concatenate 11 frames into one (splice-feats), which makes it 39*11=429.
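The arithmetic above can be checked with a small NumPy sketch. It only mimics the shapes: `add_deltas` and `splice` are hypothetical helpers, `np.gradient` is a stand-in and not Kaldi's delta computation, and the symmetric 5+1+5 window with repeated edge frames is an assumption about how the 11 frames are chosen.

```python
import numpy as np

def add_deltas(feats):
    # Stand-in for add-deltas: np.gradient is NOT Kaldi's delta formula,
    # it is used here only so the output has 3x the input dimension.
    delta = np.gradient(feats, axis=0)
    accel = np.gradient(delta, axis=0)
    return np.hstack([feats, delta, accel])

def splice(feats, left=5, right=5):
    # Repeat the first/last frame at the edges (assumed), then place each
    # frame side by side with its left and right neighbours.
    num_frames = feats.shape[0]
    padded = np.pad(feats, ((left, right), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + num_frames] for i in range(left + right + 1)])

mfcc = np.random.randn(200, 13)      # 200 frames of 13-dimensional MFCCs
with_deltas = add_deltas(mfcc)       # 13 * 3 = 39 per frame
spliced = splice(with_deltas)        # 39 * 11 = 429 per frame
print(with_deltas.shape, spliced.shape)
```

The number of frames (200 here) is arbitrary and does not affect the per-frame dimension.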
So is it expected that all audio files have the same number of frames, or is it possible to extract only a certain number of frames?
No, the dimension of the frame is independent of the number of frames in an utterance. Typically no two audio files will have the same number of frames. However, the dimension of each frame must be the same in order to train models.
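As a rough illustration of that point, two utterances with different frame counts can still be pooled into one training matrix as long as the per-frame dimension agrees. The frame counts and the names `utt_a` and `utt_b` below are made up for the example.

```python
import numpy as np

FEAT_DIM = 429  # per-frame dimension from the example earlier in this thread

# Two utterances of different length (frame counts are arbitrary)
utt_a = np.zeros((312, FEAT_DIM))
utt_b = np.zeros((87, FEAT_DIM))

# Stacking along the frame axis works because the second axis matches
frames = np.concatenate([utt_a, utt_b], axis=0)
print(frames.shape)  # (399, 429)
```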
hm... would that not only occur if the sample rate is changed/different for two audio files?
Typically the sampling rate is the same across the dataset. Even if the sampling rate is different, we can extract the same number of cepstra (or any other features) per frame from each audio file. So the frame dimension is always the same.
Thanks for the explanations!
Just to verify: in my case, after I trained delta+delta-delta using steps/train_deltas.sh, I also applied the LDA+MLLT transformation using steps/train_lda_mllt.sh --splice-opts "--left-context=3 --right-context=3", and the default output dimension of LDA is set to 40. With the above setup, the system produces 40*11 = 440 as inputFeatDim then?
Yes.
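For reference, the same calculation as a tiny helper. `input_feat_dim` is a hypothetical function, and the 5 left / 5 right split of the 11-frame window is an assumption; only the total of 11 frames comes from this thread.

```python
def input_feat_dim(base_dim, left_context, right_context):
    # Frames in the window: left neighbours + current frame + right neighbours
    return base_dim * (left_context + 1 + right_context)

print(input_feat_dim(39, 5, 5))  # 429 (MFCC + deltas)
print(input_feat_dim(40, 5, 5))  # 440 (LDA+MLLT output)
```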
It would be better if it threw a warning or exception when the user is not aware of the hard-coded in_feat_dim and the specified features have a mismatched dimension. I ran TIMIT using mismatched features (41-dimensional fbank) and the default script ran completely fine and yielded satisfactory results (23% PER).
I've always gotten an error when the dimensions mismatched, precisely when the empty array of size self.inputFeatDim is appended with the received data:
self.x = numpy.concatenate ((self.x[self.batchPointer:], x))
I don't think a dimension mismatch will allow it to progress at this point. Can you recheck your experiment?
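A minimal sketch of what an explicit check could look like, assuming the buffer starts as an empty array of shape (0, inputFeatDim). `append_frames` and `buffer` are hypothetical names for illustration, not part of the actual data generator.

```python
import numpy as np

def append_frames(buffer, x, input_feat_dim):
    # Fail early with a readable message instead of NumPy's generic error
    x = np.asarray(x)
    if x.ndim != 2 or x.shape[1] != input_feat_dim:
        raise ValueError(
            f"expected frames of dimension {input_feat_dim}, "
            f"got array of shape {x.shape}"
        )
    return np.concatenate((buffer, x))

input_feat_dim = 440
buffer = np.empty((0, input_feat_dim))   # assumed initial state of the buffer

# Matching dimension: frames are appended
buffer = append_frames(buffer, np.zeros((16, 440)), input_feat_dim)

# Mismatched dimension (e.g. 41-dimensional fbank): rejected with a clear message
try:
    append_frames(buffer, np.zeros((16, 41)), input_feat_dim)
except ValueError as err:
    message = str(err)
    print(message)
```

Without the check, the plain numpy.concatenate call on these two shapes also raises a ValueError, which is consistent with the behaviour described above.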
Hi M Kumar,
I got the same error even though I set self.inputFeatDim to the correct value, 440.
self.x = numpy.concatenate ((self.x[self.batchPointer:], x))
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Do you have any idea how I can fix the error?
Hello Mr. Kumar,
I noticed that you set
I am wondering how I can check the inputFeatDim of my dataset.
Thank you very much!