Uberi / speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.
https://pypi.python.org/pypi/SpeechRecognition/
BSD 3-Clause "New" or "Revised" License

recognize_google's response is weird; unit tests failed #717

Closed: ftnext closed this issue 11 months ago

ftnext commented 11 months ago

https://github.com/Uberi/speech_recognition/actions/runs/6996242087/job/19031904081

======================================================================
FAIL: test_google_chinese (tests.test_recognition.TestRecognition)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/speech_recognition/speech_recognition/tests/test_recognition.py", line 35, in test_google_chinese
    self.assertEqual(r.recognize_google(audio, language="zh-CN"), u"砸自己的脚")
AssertionError: '砸自己的' != '砸自己的脚'
- 砸自己的
+ 砸自己的脚
?     +

======================================================================
FAIL: test_google_english (tests.test_recognition.TestRecognition)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/speech_recognition/speech_recognition/tests/test_recognition.py", line 25, in test_google_english
    self.assertIn(r.recognize_google(audio), ["123", "1 2 3", "one two three"])
AssertionError: '1 2' not found in ['123', '1 2 3', 'one two three']

----------------------------------------------------------------------
Ran 31 tests in 25.823s

FAILED (failures=2, skipped=8)

ftnext commented 11 months ago

The API response has changed.

Inspecting response_text in a debugger at https://github.com/Uberi/speech_recognition/blob/3.10.0/speech_recognition/__init__.py#L713 shows the following.

test_google_chinese

(Pdb) response_text
'{"result":[]}\n{"result":[{"alternative":[{"transcript":"砸自己的","confidence":0.90974784}],"final":true}],"result_index":0}\n'

test_google_english

response_text contains:

{"result":[{"alternative":[{"transcript":"1 2","confidence":0.49585345},{"transcript":"one two three","confidence":0.42899391}

"1 2" has the highest confidence, so recognize_google returns it.
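
To illustrate the selection described above, here is a minimal sketch, not the library's actual code. `pick_best_transcript` is a hypothetical helper, and the sample payload is an assumption: the English response quoted above is truncated, so its leading `{"result":[]}` line and closing fields are modeled on the shape of the Chinese response.

```python
import json


def pick_best_transcript(response_text):
    """Return the highest-confidence transcript from a newline-delimited JSON response.

    Hypothetical helper for illustration; returns None when no line has a hypothesis.
    """
    for line in response_text.splitlines():
        if not line.strip():
            continue
        results = json.loads(line).get("result", [])
        if not results:
            # Skip lines such as {"result":[]} that carry no hypotheses.
            continue
        alternatives = results[0].get("alternative", [])
        if not alternatives:
            continue
        # Treat a missing confidence as 0.0 (an assumption of this sketch).
        best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
        return best["transcript"]
    return None


# Assumed payload: the two alternatives come from the thread, the rest is modeled
# on the Chinese response because the English one is truncated.
sample = (
    '{"result":[]}\n'
    '{"result":[{"alternative":['
    '{"transcript":"1 2","confidence":0.49585345},'
    '{"transcript":"one two three","confidence":0.42899391}'
    '],"final":true}],"result_index":0}\n'
)
print(pick_best_transcript(sample))  # prints: 1 2
```

With these confidences, 0.49585345 beats 0.42899391, which is why the truncated "1 2" wins over the correct "one two three".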


IMO, these tests currently exercise Google's API rather than this library. I want to change them to test the logic that parses the return value, by mocking the communication with Google's API.
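
A rough sketch of what such a mocked test could look like, using `unittest.mock` from the standard library. `HypotheticalRecognizer`, `fetch_response_text`, and `recognize` are made-up names for illustration and are not the speech_recognition API; the real patch target would depend on how the library performs its HTTP call. The canned response is the Chinese response_text quoted above.

```python
import json
import unittest
from unittest import mock


class HypotheticalRecognizer:
    """Stand-in for a recognizer; NOT the speech_recognition API."""

    def fetch_response_text(self, audio):
        # In real code this would talk to the remote API.
        raise RuntimeError("network access is not expected in unit tests")

    def recognize(self, audio):
        response_text = self.fetch_response_text(audio)
        for line in response_text.splitlines():
            if not line.strip():
                continue
            results = json.loads(line).get("result", [])
            if results and results[0].get("alternative"):
                best = max(
                    results[0]["alternative"],
                    key=lambda alt: alt.get("confidence", 0.0),
                )
                return best["transcript"]
        raise ValueError("no transcript in response")


# Response text copied from the debugger output in this thread.
CANNED_RESPONSE = (
    '{"result":[]}\n'
    '{"result":[{"alternative":[{"transcript":"砸自己的","confidence":0.90974784}],'
    '"final":true}],"result_index":0}\n'
)


class TestParsing(unittest.TestCase):
    def test_parses_canned_response(self):
        recognizer = HypotheticalRecognizer()
        # Replace the network call with a fixed response, so only parsing is tested.
        with mock.patch.object(
            HypotheticalRecognizer, "fetch_response_text", return_value=CANNED_RESPONSE
        ) as fake_fetch:
            self.assertEqual(recognizer.recognize(b"fake audio"), "砸自己的")
        fake_fetch.assert_called_once()
```

Because the response is pinned, a test like this keeps passing when the remote API changes its output, and it fails only if the parsing logic itself regresses. It can be run with `python -m unittest`.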