Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Python AudioDataStream.read_data should not modify immutable bytes object #2337

Open msehnout opened 5 months ago

msehnout commented 5 months ago

Hello

This sample can lead to subtle bugs: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/1b2d61bf540622eea1e194b8af7f38fc914bb931/samples/python/console/speech_synthesis_sample.py#L386

            audio_buffer = bytes(16000)
            total_size = 0
            filled_size = audio_data_stream.read_data(audio_buffer)
            while filled_size > 0:
                print("{} bytes received.".format(filled_size))
                total_size += filled_size
                filled_size = audio_data_stream.read_data(audio_buffer)

The problem is that the bytes type is immutable in Python, but the Speech SDK uses a native C library that writes into the bytes object in place: https://docs.python.org/3/library/stdtypes.html#bytes-objects

I stumbled upon the bug when I tried to accumulate the buffer in a separate function like this:

buffer = b""
for chunk in stream_from_azure:
    buffer += chunk

where the iterator was implemented like this:

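            audio_buffer = bytes(16000)
            # Read once before the loop so filled_size is defined on the first check.
            filled_size = audio_data_stream.read_data(audio_buffer)
            while filled_size > 0:
                yield audio_buffer[:filled_size]
                filled_size = audio_data_stream.read_data(audio_buffer)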

And the result was corrupted.
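
One plausible mechanism for the corruption, sketched here under the assumption that the interpreter is CPython: a slice that covers a whole bytes object, and a concatenation onto an empty bytes object, can both return the original object instead of a copy. If native code later overwrites that object, every "chunk" that aliases it changes too. The `native_write` helper below is hypothetical; it only imitates an in-place native write with `ctypes.memmove` and is not the SDK's implementation.

```python
import ctypes


def native_write(buffer: bytes, payload: bytes) -> int:
    """Hypothetical stand-in for a native read: overwrite `buffer` in place."""
    n = min(len(buffer), len(payload))
    # Writes straight into the storage of the bytes object, bypassing
    # immutability. CPython-specific and for illustration only.
    ctypes.memmove(buffer, payload, n)
    return n


audio_buffer = bytes(4)

filled = native_write(audio_buffer, b"AAAA")
first_chunk = audio_buffer[:filled]    # slice covering the whole buffer
aliased = first_chunk is audio_buffer  # CPython can hand back the same object

accumulated = b"" + first_chunk        # b"" + x can also reuse x on CPython

native_write(audio_buffer, b"BBBB")    # the next "read" overwrites shared storage

print(aliased, first_chunk, accumulated)
```

When `aliased` is true, `first_chunk` and `accumulated` no longer hold the first payload after the second write, which matches the kind of corruption described above. Chunks shorter than the buffer are real copies, so the problem only shows up for full-size reads.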

The SDK should probably take a bytearray instead, because it is the mutable counterpart to bytes: https://docs.python.org/3/library/stdtypes.html#bytearray-objects
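
Until the API changes, a caller-side workaround could look like the sketch below. It assumes only the `read_data(audio_buffer)` call shown in the sample, returning the number of bytes written. The name `iter_audio_chunks` and the `chunk_size` parameter are hypothetical. The idea is to copy every chunk through a `memoryview`, because `audio_buffer[:filled_size]` may return the buffer itself when the read fills it completely.

```python
from typing import Iterator


def iter_audio_chunks(stream, chunk_size: int = 16000) -> Iterator[bytes]:
    """Yield independent copies of each chunk read from `stream`.

    `stream` is assumed to expose read_data(buffer) -> int, as in the sample.
    """
    audio_buffer = bytes(chunk_size)
    while True:
        filled_size = stream.read_data(audio_buffer)
        if filled_size <= 0:
            break
        # bytes(memoryview(...)) builds a new object through the buffer
        # protocol, so the yielded chunk never aliases audio_buffer.
        yield bytes(memoryview(audio_buffer)[:filled_size])
```

A consumer can then accumulate safely, for example with `b"".join(iter_audio_chunks(audio_data_stream))`.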

I could not find a better place to report this issue. Please let me know if I can submit it somewhere else.

yulin-li commented 5 months ago

Thanks for this issue. I agree with you that we should not modify this immutable object. We will discuss internally how to update this, as we don't want to introduce breaking changes at this time.

yulin-li commented 5 months ago

And this is the right place to report this issue; this is the official Speech SDK repo.

github-actions[bot] commented 4 months ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.