nakane11 closed 1 year ago
Good! Can we also get the other candidates too?
Maybe Sphinx also returns confidence, and Wit returns it at the word level. I will try other engines when I have time.
Oh, I meant to ask whether we can get multiple candidates from the Google speech recognition engine, like ["hello", "hallo", "hollow"] with confidences ["0.8", "0.7", "0.6"]. If so, we can use the other candidates in the future.
Sorry, I misunderstood.
I'm not sure if it is available in speech_recognition, but if maxAlternatives in the request is greater than 1, the result can contain one or more additional candidates (result['alternative'][1], result['alternative'][2], ...).
"max_alternatives"
https://cloud.google.com/python/docs/reference/speech/latest/google.cloud.speech_v1p1beta1.types.RecognitionConfig#:~:text=See%20%60Language%20Support-,max_alternatives,-int%0AMaximum%20number
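As a rough sketch of how the extra candidates could be read out, the snippet below walks the 'alternative' list of a raw result dictionary and collects (transcript, confidence) pairs. The helper name extract_candidates and the sample dictionary are hypothetical and only for illustration. The dictionary shape is assumed to match what recognize_google returns with show_all=True, and confidence is treated as optional because it may be missing on the lower-ranked alternatives.

```python
from typing import Any, List, Optional, Tuple


def extract_candidates(raw: Any) -> List[Tuple[str, Optional[float]]]:
    """Collect (transcript, confidence) pairs from a raw recognition result.

    Hypothetical helper: assumes `raw` looks like
    {"alternative": [{"transcript": ..., "confidence": ...}, ...]}.
    """
    # Anything that is not a dict (e.g. an empty list for "no result")
    # is treated as having no candidates.
    if not isinstance(raw, dict):
        return []

    candidates = []
    for alt in raw.get("alternative", []):
        transcript = alt.get("transcript")
        if transcript is None:
            continue
        # Confidence may be absent; keep None rather than inventing a value.
        candidates.append((transcript, alt.get("confidence")))
    return candidates


# Made-up sample data in the assumed shape, not a real API response.
# With the real library this would come from something like:
#   raw = recognizer.recognize_google(audio, show_all=True)
sample = {
    "alternative": [
        {"transcript": "hello", "confidence": 0.8},
        {"transcript": "hallo"},
        {"transcript": "hollow"},
    ],
    "final": True,
}

candidates = extract_candidates(sample)
for text, confidence in candidates:
    print(text, confidence)
```

Keeping a missing confidence as None leaves it to the caller to decide how to rank candidates that come without a score.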
Currently, confidence in SpeechRecognitionCandidates is set to 1.0. If show_all is true (the default is false), recognize_google returns the raw API response as a JSON dictionary, and we can get the confidence value to compare results. From https://cloud.google.com/speech-to-text/docs/speech-to-text-requests#confidence-values:
Example