Open rabiaedayilmaz opened 3 months ago
I have the same issue with speech-to-text-v2. I'll try to provide a bit more context:
I have multiple IoT devices in different places. Some work and some don't, and I have no idea why or what the difference is. Software and hardware are the same on all devices.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py", line 173, in error_remapped_callable
    return _StreamingResponseIterator(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py", line 95, in __init__
    self._stored_first_result = next(self._wrapped)
  File "/usr/local/lib/python3.10/dist-packages/grpc/_channel.py", line 540, in __next__
    return self._next()
  File "/usr/local/lib/python3.10/dist-packages/grpc/_channel.py", line 966, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Audio chunk can be of a a maximum of 25600 bytes. Received audio of 98964 bytes instead."
    debug_error_string = "UNKNOWN:Error received from peer ipv6:<REDACTED> {created_time:"2024-04-03T13:04:32.515940442+02:00", grpc_status:3, grpc_message:"Audio chunk can be of a a maximum of 25600 bytes. Received audio of 98964 bytes instead."}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "speech_2_text.py", line 153, in run
    self.responses = self.client.streaming_recognize(
  File "/usr/local/lib/python3.10/dist-packages/google/cloud/speech_v2/services/speech/client.py", line 1884, in streaming_recognize
    response = rpc(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/retry.py", line 372, in retry_wrapped_func
    return retry_target(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/retry.py", line 207, in retry_target
    result = target()
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py", line 177, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Audio chunk can be of a a maximum of 25600 bytes. Received audio of 98964 bytes instead.
Note: I removed the IPv6 address from the error message.
pip3 freeze | grep google:
google-api-core==2.15.0
google-auth==2.25.2
google-cloud-speech==2.25.1
google-cloud-texttospeech==2.15.0
googleapis-common-protos==1.62.0
I had the same problem with google-cloud-speech==2.23.0 as well.
Following the examples, I feed the audio data via this generator:
def generator(self):
    """acts as a blocking generator for buffered audio_data
    when no data is there, the generator blocks till there is new data
    this generator uses queue.Queue, thus it is thread-safe
    Yields:
        bytes: the buffered audio
    """
    while not self.closed:
        # use blocking get
        chunk = self._buff.get()
        # return when stop signal detected (None)
        if chunk is None:
            return
        data = [chunk]
        # consume the rest of the queue
        while True:
            try:
                chunk = self._buff.get(block=False)
                if chunk is None:
                    return
                data.append(chunk)
            except queue.Empty:
                break
        # yield result
        yield b"".join(data)
The documentation states that 25 KB is the maximum chunk size.
I attempted a fix:
        # yield result
        bytes_chunk = b"".join(data)
        for chunk in [bytes_chunk[x:x+25600] for x in range(0, len(bytes_chunk), 25600)]:
            yield chunk
This gets rid of that exact error, but then I just get another one:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py", line 173, in error_remapped_callable
    return _StreamingResponseIterator(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py", line 95, in __init__
    self._stored_first_result = next(self._wrapped)
  File "/usr/local/lib/python3.10/dist-packages/grpc/_channel.py", line 540, in __next__
    return self._next()
  File "/usr/local/lib/python3.10/dist-packages/grpc/_channel.py", line 966, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
    status = StatusCode.CANCELLED
    details = "The operation was cancelled."
    debug_error_string = "UNKNOWN:Error received from peer ipv6:<REDACTED> {created_time:"2024-04-04T10:01:14.580325845+02:00", grpc_status:1, grpc_message:"The operation was cancelled."}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "speech_2_text.py", line 155, in run
    self.responses = self.client.streaming_recognize(
  File "/usr/local/lib/python3.10/dist-packages/google/cloud/speech_v2/services/speech/client.py", line 1884, in streaming_recognize
    response = rpc(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/retry.py", line 372, in retry_wrapped_func
    return retry_target(
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/retry.py", line 207, in retry_target
    result = target()
  File "/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py", line 177, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.Cancelled: 499 The operation was cancelled.
Note: I removed the IPv6 address from the error message.
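For reference, here is a minimal standalone sketch of a queue-draining generator that never yields more than 25600 bytes at once. It only addresses the size limit from the first error; it is not a diagnosis of the `Cancelled` error. The function name `capped_chunks` and the `max_bytes` parameter are made up for illustration, and unlike the generator above it flushes already-buffered audio before stopping.

```python
import queue

MAX_CHUNK_BYTES = 25600  # limit quoted in the INVALID_ARGUMENT error message


def capped_chunks(buff, max_bytes=MAX_CHUNK_BYTES):
    """Yield audio from the queue `buff` in pieces of at most `max_bytes` bytes.

    A `None` item in the queue is treated as the stop signal.
    """
    while True:
        chunk = buff.get()  # block until audio (or the stop signal) arrives
        if chunk is None:
            return
        data = [chunk]
        stop = False
        # Drain whatever else is already buffered, without blocking.
        while True:
            try:
                chunk = buff.get(block=False)
            except queue.Empty:
                break
            if chunk is None:
                stop = True  # remember the stop signal, flush the audio first
                break
            data.append(chunk)
        joined = b"".join(data)
        # Slice the joined buffer so no yielded piece exceeds the limit.
        for start in range(0, len(joined), max_bytes):
            yield joined[start:start + max_bytes]
        if stop:
            return
```

Each yielded piece can then be wrapped in its own streaming request, so a large backlog in the queue is sent as several small requests instead of one oversized one.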
I searched all over the internet, but all I could find were people with the same problem. Speech v2 was released recently, and there are code samples for various tasks. The most relevant sample is streaming speech recognition on a local file.
Whenever I try to implement this for a microphone, as we did in speech_v1p1beta1, an error occurs. The last error I got stuck on is:
Google Speech Error: 400 Audio chunk can be of a a maximum of 25600 bytes. Received audio of 253952 bytes instead.
I assume it occurs because I cannot define a chunk size and split the incoming microphone audio accordingly.
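As a rough sanity check on chunk sizes, the arithmetic can be sketched as below. It assumes uncompressed PCM audio (the post does not state the encoding), and the helper names are hypothetical. For 16-bit mono audio at 16 kHz, the 25600-byte limit corresponds to 12800 frames, or 0.8 seconds per chunk.

```python
MAX_CHUNK_BYTES = 25600  # limit quoted in the error message


def max_frames_per_chunk(sample_width_bytes, channels, max_bytes=MAX_CHUNK_BYTES):
    """Largest number of PCM frames that fits into one chunk of `max_bytes`."""
    return max_bytes // (sample_width_bytes * channels)


def chunk_duration_ms(frames, sample_rate_hz):
    """Duration in milliseconds of `frames` PCM frames at `sample_rate_hz`."""
    return 1000 * frames / sample_rate_hz


# Assumed example values: 16-bit samples (2 bytes), mono, 16 kHz.
frames = max_frames_per_chunk(sample_width_bytes=2, channels=1)  # 12800
duration = chunk_duration_ms(frames, sample_rate_hz=16000)       # 800.0
```

Reading the microphone in buffers no larger than this frame count (or slicing larger buffers afterwards) keeps every chunk under the limit.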
The docs need sample code for streaming audio from a microphone and performing speech recognition with the Speech v2 API.
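Until such a sample exists, the request ordering used by the v2 local-file sample (one configuration request first, then audio-only requests) can be sketched without any microphone code. This is a library-agnostic sketch under that assumption: `request_stream` and `make_audio_request` are hypothetical names, and the comments only indicate where the `speech_v2` types would plug in.

```python
def request_stream(config_request, audio_chunks, make_audio_request):
    """Yield the configuration request first, then one request per audio chunk.

    `audio_chunks` is any iterable of bytes objects, each assumed to be
    already within the 25600-byte limit.
    """
    yield config_request
    for chunk in audio_chunks:
        yield make_audio_request(chunk)


# With google-cloud-speech installed, the wiring would look roughly like this
# (not executed here; recognizer and streaming_config are left to the caller):
#   from google.cloud.speech_v2.types import cloud_speech
#   config_request = cloud_speech.StreamingRecognizeRequest(
#       recognizer=recognizer_name, streaming_config=streaming_config)
#   make_audio_request = lambda c: cloud_speech.StreamingRecognizeRequest(audio=c)
#   responses = client.streaming_recognize(
#       requests=request_stream(config_request, mic_chunks, make_audio_request))
```

Here `mic_chunks` would be whatever generator produces the microphone audio, for example a size-capped queue generator.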