kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.
http://kaldi-asr.org

Multi threading Online nnet3 decoder issue #2714

Closed zhiweizhong closed 6 years ago

zhiweizhong commented 6 years ago

I have written a multi-threaded application based on the code of online2-wav-nnet3-latgen-faster.cc, but the results for the same audio file differ slightly between runs.

How can I fix this?

Note: I used AISHELL-2 to train the ASR model.

danpovey commented 6 years ago

It's likely due to dithering in the feature extraction. Anyway, it's the intended behavior. You should change the feature extraction options to fix it if it bothers you (e.g. --dither=0 --energy-floor=1).
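
To make the dithering point concrete, here is a minimal Python sketch (not Kaldi's code). It assumes that dithering means adding Gaussian noise, scaled by the dither value, to each waveform sample; `apply_dither` and the sample values are hypothetical.

```python
import random

def apply_dither(samples, dither, rng):
    """Return samples with Gaussian noise of scale `dither` added (illustrative only)."""
    if dither == 0.0:
        return list(samples)  # no noise: the output depends only on the input
    return [s + rng.gauss(0.0, 1.0) * dither for s in samples]

wave = [100.0, -250.0, 30.0, 0.0]  # made-up sample values

# Two runs whose random generators are in different states
run_a = apply_dither(wave, 1.0, random.Random(1))
run_b = apply_dither(wave, 1.0, random.Random(2))
print(run_a == run_b)      # False: the added noise differs between runs

# With dither disabled, the generator state no longer matters
clean_a = apply_dither(wave, 0.0, random.Random(1))
clean_b = apply_dither(wave, 0.0, random.Random(2))
print(clean_a == clean_b)  # True
```

Features computed from `run_a` and `run_b` would differ slightly, which is the kind of small run-to-run difference described above.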

zhiweizhong commented 6 years ago

Thank you for your answer, @danpovey. But the results still differ slightly when I use multiple threads. My mfcc.conf is:

--use-energy=false       # use average of log energy, not energy.
--sample-frequency=16000 # AISHELL-2 is sampled at 16kHz
--num-mel-bins=40        # similar to Google's setup.
--num-ceps=40            # there is no dimensionality reduction.
--low-freq=20            # low cutoff frequency for mel bins
--high-freq=-400         # high cutoff frequency, relative to Nyquist of 8000 (=7600)
--dither=0
--energy-floor=1

When I use a single thread, the results are exactly the same. I think the cause may not be the MFCC features...

danpovey commented 6 years ago

When you use a single thread, things are computed in a fixed order. You should test whether it makes a difference if you put the utterances in a different order or use a different random seed (you'd have to change the code). Perhaps you are not passing those options in properly. The decoder won't get them from conf/; it will get them from the copy that's in the _online directory.
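
One way to check that the options really reach the decoder is to parse the config file that the online setup actually reads. The sketch below assumes Kaldi-style config lines of the form `--name=value` with `#` comments. `parse_kaldi_conf` is a hypothetical helper, and the text is an excerpt of the mfcc.conf quoted above; in practice you would read the copy in the _online directory.

```python
def parse_kaldi_conf(text):
    """Parse lines like '--name=value  # comment' into a dict (illustrative helper)."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop any trailing comment
        if not line:
            continue
        if not line.startswith("--"):
            raise ValueError(f"unexpected config line: {line!r}")
        name, _, value = line[2:].partition("=")
        options[name] = value
    return options

conf_text = """\
--use-energy=false       # use average of log energy, not energy.
--sample-frequency=16000 # AISHELL-2 is sampled at 16kHz
--dither=0
--energy-floor=1
"""

opts = parse_kaldi_conf(conf_text)
# Assumption: if the option is absent, dithering is on (nonzero default).
# float() so that "0", "0.0" and "0.00" all count as disabled.
dither_disabled = float(opts.get("dither", "1.0")) == 0.0
print(opts["dither"], dither_disabled)
```

If `dither_disabled` is false for the file the decoder actually loads, the options are not being passed in as intended.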

zhiweizhong commented 6 years ago
  1. I'm sure the options are being passed in properly.

  2. I saw your reply in the Kaldi-help Google group thread "mfcc features significantly different if run more than once", and I use OpenBLAS, so maybe that is the reason.

  3. The thread "MFCC feature extraction" says that the differences were caused by multi-threading in the i-vector extraction.

I'm confused. Which statement is correct?

danpovey commented 6 years ago

> I saw your reply in the Kaldi-help Google group thread "mfcc features significantly different if run more than once" https://groups.google.com/d/msg/kaldi-help/LOD4A7Z9hYY/WALJ0vgFAgAJ

This would be due to the dithering.

> and I use OpenBLAS, so maybe that is the reason.

If you use multi-threaded OpenBLAS (i.e. multiple threads at the BLAS library level; I think you have to use a different version of the library and/or set something like OMP_NUM_THREADS or OPENBLAS_NUM_THREADS), then nothing will be fully reproducible.
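
A common reason a multi-threaded BLAS is not bit-for-bit reproducible is that floating-point addition is not associative, so combining partial sums in a different order can change the last bits of the result. The Python sketch below only illustrates that effect and does not call OpenBLAS. The final lines show one way to pin the thread count through the environment variables named above; how the decoder is launched is left as an assumption.

```python
import os

values = [0.1, 0.2, 0.3]

# Same numbers, two groupings, as if partial sums were combined in a different order
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)              # False
print(abs(left_to_right - right_to_left) < 1e-15)  # True: the difference is tiny

# Environment for a child process with BLAS threading pinned to one thread.
# Pass env=single_thread_env to subprocess.run(...) when launching the decoder.
single_thread_env = dict(os.environ, OPENBLAS_NUM_THREADS="1", OMP_NUM_THREADS="1")
```

Differences of this size in the acoustic scores are usually harmless, but they can occasionally tip a close decision in the decoder.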

> In the thread "MFCC feature extraction" https://groups.google.com/d/msg/kaldi-help/fQS3VmQ0eCA/wiGpx3hgMgAJ, it says that the differences were caused by multi-threading in the i-vector extraction.
>
> I'm confused. Which statement is correct?

They are probably both correct, but both conditions have to be present. If the BLAS is deterministic, then to get nondeterminism you have to have multiple threads in which MFCCs (with dithering) are computed in a nondeterministic order.
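
The interaction of the two conditions can be simulated without real threads. In this Python sketch (an illustration, not Kaldi's implementation), all utterances draw their dither noise from one shared generator, and the processing order stands in for whatever order a thread scheduler happens to produce. The utterance names, sample values, and helper functions are hypothetical.

```python
import random

utts = {"utt_a": [1.0, 2.0, 3.0], "utt_b": [4.0, 5.0, 6.0]}  # made-up data

def dither_utterance(samples, rng, dither):
    """Add Gaussian noise of scale `dither` to each sample."""
    return [s + rng.gauss(0.0, 1.0) * dither for s in samples]

def run_shared_rng(order, dither=1.0, seed=0):
    """All utterances share one generator, so results depend on processing order."""
    rng = random.Random(seed)
    return {name: dither_utterance(utts[name], rng, dither) for name in order}

first = run_shared_rng(["utt_a", "utt_b"])
second = run_shared_rng(["utt_b", "utt_a"])  # an order another run might produce
print(first["utt_a"] == second["utt_a"])     # False: utt_a got different noise

# With dither=0 the processing order no longer affects the result
no_dither_1 = run_shared_rng(["utt_a", "utt_b"], dither=0.0)
no_dither_2 = run_shared_rng(["utt_b", "utt_a"], dither=0.0)
print(no_dither_1 == no_dither_2)            # True
```

A single-threaded run always uses the same order, which matches the observation that single-threaded results are identical from run to run.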
