vrenkens / nabu

Code for end-to-end ASR with neural networks, built with TensorFlow
MIT License

Kaldi uses MFCC features, but fbank appears in config/recipes/DNN/WSJ/feature_processor.cfg #19

Closed fanlu closed 6 years ago

fanlu commented 6 years ago

Kaldi uses MFCC features for input, but in this project I find fbank in config/recipes/DNN/WSJ/feature_processor.cfg. Could these two different features cause the DNN training not to converge?

vrenkens commented 6 years ago

This is normal behavior. Kaldi uses MFCC features to train the GMM. The GMM is then used to align the transcriptions to the speech. The DNN is then trained to map the fbank features to those alignments. If the DNN training does not converge, the cause is probably something else.
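As an aside, the two feature types are closely related: MFCCs are essentially a DCT applied to log mel filterbank (fbank) energies, so it is consistent for one pipeline to use MFCCs for the GMM stage and fbank for the DNN. A minimal numpy sketch of that relationship (illustrative only, not Nabu or Kaldi code):

```python
import numpy as np
from scipy.fftpack import dct

def fbank_to_mfcc(log_fbank, num_ceps=13):
    """Map log mel filterbank energies (frames x filters) to MFCCs."""
    # The DCT-II decorrelates the filterbank channels; keeping only the
    # first coefficients gives the usual cepstral representation.
    return dct(log_fbank, type=2, axis=1, norm='ortho')[:, :num_ceps]

# Stand-in log-fbank features: 100 frames, 40 mel filters.
log_fbank = np.log(np.random.rand(100, 40) + 1e-6)
print(fbank_to_mfcc(log_fbank).shape)  # (100, 13)
```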

fanlu commented 6 years ago

I am using the aishell task and the default DNN structure. Should I change the structure?

vrenkens commented 6 years ago

To be honest, I don't really know. If you are using a standard DNN structure, I suggest you try using Kaldi. If you then want to move to more exotic structures, you could first try to reproduce the result using Nabu with a similar architecture before making changes to the model.

fanlu commented 6 years ago

Do you still have the original experimental results? Eventually I want to build a CLDNN structure, but first I want to make sure my experimental process does not have problems.

vrenkens commented 6 years ago

What do you mean by retaining the original experimental results?

fanlu commented 6 years ago

Kaldi's experimental results.

vrenkens commented 6 years ago

OK, and what do you mean by retaining that result? If you use Kaldi, it is stored in the egs folder you are running it from. For Nabu, all intermediate steps are stored in the expdir.

fanlu commented 6 years ago

Yes. I copied run.sh from egs into the Kaldi script train_gmm.sh, then ran align_data.sh on the dev set data, then compute_prior.py, then ./run data to create the expdir, and finally ./run train to train the NN. I want to see your experiment steps and results. Are they not the same as my steps?

vrenkens commented 6 years ago

Have you looked at the README? The steps are explained there. I don't really understand why you copied run.sh into train_gmm.sh; as far as I can tell, the alignment process is the same for the aishell task.

The rest of the steps seem correct.

fanlu commented 6 years ago

I read the README carefully. run.sh and train_gmm.sh are not the same, but train_gmm.sh contains most of the code of run.sh before train_tdnn: train_mono and train_tri through train_5a, plus the alignment to pdfs. I did this because I want to use your script rather than run.sh; it looks like it gets rid of Kaldi, and I do not want to switch between projects. So could you show me your convergence result?

vrenkens commented 6 years ago

They are indeed not the same, because run.sh contains the Kaldi data prep as well. I have no results on this task, so I'm afraid I cannot help you with it.

fanlu commented 6 years ago

Do you have the results for config/recipes/DNN/WSJ?

vrenkens commented 6 years ago

It's been a while, let me check :)

fanlu commented 6 years ago

@vrenkens Do you have any good results? I used 3 layers with 512 hidden units and set dropout to 1, but it still does not converge.

vrenkens commented 6 years ago

I do not have any results anymore; I'm running the experiments now to see what happens. I will let you know when I get the results :)

vrenkens commented 6 years ago

Hey fanlu, there was a bug in the decoding script, so you should probably pull the new code. I trained a DNN on WSJ. I just used the recipe as is, which is probably not very good; in particular, it uses layer normalization and dropout, which is probably not a good idea.
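For illustration, a minimal TF1-era sketch of a hidden layer combining layer normalization and dropout as described; the layer ordering and names here are assumptions, not Nabu's actual implementation:

```python
import tensorflow as tf  # TensorFlow 1.x

def hidden_layer(inputs, num_units, keep_prob, scope):
    with tf.variable_scope(scope):
        h = tf.layers.dense(inputs, num_units)     # affine transform
        h = tf.contrib.layers.layer_norm(h)        # layer normalization
        h = tf.nn.relu(h)
        return tf.nn.dropout(h, keep_prob)         # keep_prob=1 disables dropout

features = tf.placeholder(tf.float32, [None, 40])  # e.g. 40-dim fbank frames
hidden = hidden_layer(features, 512, keep_prob=0.8, scope='layer0')
```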

Here you can see the plots for training

[Plots: validation loss and training loss]

I got a WER of 11% in the end, which is far from state of the art. I am going to run some more experiments with different recipes.

fanlu commented 6 years ago

Awesome, I will try aishell later, thanks. Could you share your experiment results when you finish?

vrenkens commented 6 years ago

I will :)

fanlu commented 6 years ago

Hi @vrenkens, can you share some TF debugging tricks with me? How do you find the reason for non-convergence?

vrenkens commented 6 years ago

I actually did not have problems with convergence. What I normally do for debugging is first look at the computational graph using TensorBoard, just to check whether everything looks the way I expect it to (correct shapes, connections, ...).
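A minimal TF1-era sketch of that first step (the setup and log directory are assumptions, not Nabu's actual code):

```python
import tensorflow as tf  # TensorFlow 1.x

# ... build or load the model graph here ...
with tf.Session() as sess:
    # Dump the graph definition; TensorBoard renders it under the Graphs tab.
    writer = tf.summary.FileWriter('/tmp/nabu_debug', sess.graph)
    writer.close()
# Then inspect with: tensorboard --logdir /tmp/nabu_debug
```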

Then I will typically look at the histograms evolving over time.
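A sketch of how such histograms might be logged (which variables to track is an assumption):

```python
import tensorflow as tf  # TensorFlow 1.x

# Log a histogram for every trainable variable so their distributions
# can be watched over training steps in TensorBoard.
for var in tf.trainable_variables():
    tf.summary.histogram(var.op.name, var)
merged = tf.summary.merge_all()
# In the training loop:
#   summary = sess.run(merged)
#   writer.add_summary(summary, global_step)
```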

As a last resort I use tf.Print statements to look at the individual values and see whether everything makes sense.
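For example (a toy stand-in, not model code), tf.Print in TF1 passes a tensor through unchanged and logs the listed values to stderr whenever it is evaluated:

```python
import tensorflow as tf  # TensorFlow 1.x

logits = tf.random_normal([4, 10])  # toy stand-in for a real model output
logits = tf.Print(logits, [tf.shape(logits), tf.reduce_mean(logits)],
                  message='logits shape / mean: ')
with tf.Session() as sess:
    sess.run(logits)  # prints the shape and mean to stderr
```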

You could also look at your input features. Do they look normal? Are they normalized correctly (zero mean and unit variance)?
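A quick numpy check along those lines (the feature dump and its path are hypothetical):

```python
import numpy as np

# Hypothetical dump of features as a (frames x dims) array.
features = np.load('features.npy')
# After correct mean/variance normalization these should be close to 0 and 1.
print('mean per dim:', features.mean(axis=0))
print('std per dim: ', features.std(axis=0))
```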