alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

Is BatchModel GPU inference deterministic? #1196

Open mehadi92 opened 1 year ago

mehadi92 commented 1 year ago

Hi, I'm running the script https://github.com/alphacep/vosk-api/blob/master/python/example/test_gpu_batch.py on an audio file, which you can download with:

wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav

I'm using the Dockerfile https://github.com/alphacep/vosk-server/blob/master/docker/Dockerfile.kaldi-en-gpu

But when I run test_gpu_batch.py repeatedly on this same audio file, it gives different outputs, e.g. "like" vs. "liked". Here are 5 outputs for the same audio file:

2086-149220-0033 well i don't wish to see it any more observe phoebe turning away her eyes it is certainly very liked the old portrait
2086-149220-0033 well i don't wish to see it any more observe phoebe turning away her eyes it is certainly very like the old portrait
2086-149220-0033 well i don't wish to see it any more observe phoebe turning away her eyes it is certainly very liked the old portrait
2086-149220-0033 well i don't wish to see it any more observe phoebe turning away her eyes it is certainly very like the old portrait
2086-149220-0033 well i don't wish to see it any more observe phoebe turning away her eyes it is certainly very like the old portrait

Is there any way that the model always gives the same output for the same audio sample?

Please let me know if there is anything I'm missing. Thanks.
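To make the variation across runs easier to see, here is a minimal sketch that tallies the five hypotheses listed above and reports where they differ. The helper name `differing_positions` is hypothetical and is not part of vosk-api; the transcripts are copied from the outputs above with the utterance ID removed.

```python
from collections import Counter

# Shared prefix/suffix of the five reported hypotheses (utterance ID stripped).
PREFIX = ("well i don't wish to see it any more observe phoebe "
          "turning away her eyes it is certainly very")
SUFFIX = "the old portrait"

# The word that varied in each of the five runs, in the order reported.
variants = ["liked", "like", "liked", "like", "like"]
runs = [f"{PREFIX} {word} {SUFFIX}" for word in variants]


def differing_positions(a, b):
    """Return (index, word_a, word_b) for each position where the
    two transcripts differ. Assumes both have the same word count."""
    return [(i, x, y)
            for i, (x, y) in enumerate(zip(a.split(), b.split()))
            if x != y]


counts = Counter(runs)
print("distinct outputs:", len(counts))
print("deterministic:", len(counts) == 1)
print("differences between run 1 and run 2:",
      differing_positions(runs[0], runs[1]))
```

The same comparison can be applied to any number of repeated runs; a deterministic setup should produce exactly one distinct output.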

nshmyrev commented 1 year ago

There is a dither parameter for feature extraction too, I suppose, same as in https://github.com/alphacep/vosk-api/issues/868

mehadi92 commented 1 year ago

@nshmyrev Following https://github.com/alphacep/vosk-api/issues/868#issuecomment-1146699933, after setting --dither=0 I get the same output on every run for the audio above.
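For context, dithering adds a small amount of random noise to the waveform before features are computed, so two passes over the same audio see slightly different samples unless the noise is disabled or the random generator is seeded. The following is a minimal illustrative sketch in plain Python, not Kaldi's or Vosk's actual code; the function name `dither` and the sample values are assumptions made for the example.

```python
import random


def dither(samples, dither_value, rng):
    """Add zero-mean Gaussian noise scaled by dither_value to each sample.
    With dither_value == 0 the samples are returned unchanged."""
    if dither_value == 0.0:
        return list(samples)
    return [s + rng.gauss(0.0, 1.0) * dither_value for s in samples]


samples = [0.0, 120.0, -340.0, 15.0]  # made-up waveform values

# Two passes with independent, unseeded generators: the noise differs.
noisy_a = dither(samples, 1.0, random.Random())
noisy_b = dither(samples, 1.0, random.Random())
print("dither=1, unseeded, identical:", noisy_a == noisy_b)

# Two passes with dither disabled: the input is passed through untouched.
clean_a = dither(samples, 0.0, random.Random())
clean_b = dither(samples, 0.0, random.Random())
print("dither=0, identical:", clean_a == clean_b)

# Two passes with the same seed: the noise is reproducible.
seeded_a = dither(samples, 1.0, random.Random(0))
seeded_b = dither(samples, 1.0, random.Random(0))
print("dither=1, same seed, identical:", seeded_a == seeded_b)
```

Small differences in the input features can be enough to flip a close decision between two hypotheses, such as "like" and "liked".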

I could not understand your second point: does the code you mentioned at https://github.com/kaldi-asr/kaldi/blob/master/src/base/kaldi-math.cc#L45 have any effect on deterministic inference or not?

One more thing: if --dither=0 does not guard against numerical overflow, then setting --dither=0 could cause a system failure when such an overflow occurs.
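The usual numerical concern with zero dither is a frame of pure digital silence, where the energy is exactly zero and its logarithm is undefined. A common safeguard is to floor the energy before taking the log. The sketch below only illustrates that idea; the function `log_energy` and the floor value are assumptions for the example, and whether the feature pipeline used here applies such a floor is not established in this thread.

```python
import math
import sys


def log_energy(frame, floor=sys.float_info.epsilon):
    """Log of the frame energy, with the energy floored so that an
    all-zero frame yields a finite value. The floor is an assumed
    constant chosen for this illustration."""
    energy = sum(s * s for s in frame)
    return math.log(max(energy, floor))


silent_frame = [0.0] * 400  # all-zero samples, as in digital silence

# Without a floor, the log of zero energy is undefined.
try:
    math.log(sum(s * s for s in silent_frame))
    unfloored_ok = True
except ValueError:
    unfloored_ok = False

print("unfloored log succeeds:", unfloored_ok)
print("floored log is finite:", math.isfinite(log_energy(silent_frame)))
```

With dither enabled, the added noise keeps the energy of a silent frame above zero, which is why disabling it raises this question.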