deepgram / deepgram-python-sdk

Official Python SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License
179 stars 48 forks source link

Bugfix: manage resources properly #305

Closed tomprimozic closed 4 months ago

tomprimozic commented 4 months ago

Proposed changes

Manage multi-threaded resources with Python's built-in syntax, to avoid deadlocks.

Types of changes

What types of changes does your code introduce to the community Python SDK? Put an x in the boxes that apply

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

Further comments

minimal reproducible example:

import time

from deepgram import DeepgramClient, LiveOptions

deepgram_client = DeepgramClient(api_key=API_KEY)
deepgram_connection = deepgram_client.listen.live.v("1")

deepgram_connection.start(LiveOptions())

time.sleep(30)    # Deepgram will close the connection after 10-15s of silence, followed with another 5 seconds for a ping

print('deadlock!')
try:
  deepgram_connection.finish()
finally:
  print('no deadlock...')

output:

Exception in _listening: received 1011 (internal error) Deepgram did not receive audio data or a text message within the timeout window. See https://dpgr.am/net0001; then sent 1011 (internal error) Deepgram did not receive audio data or a text message within the timeout window. See https://dpgr.am/net0001
Exception in thread Thread-5 (_listening):
Traceback (most recent call last):
  File "/Users/tom/my/lib/mambaforge/envs/dev3/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/Users/tom/my/lib/mambaforge/envs/dev3/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/tom/my/lib/mambaforge/envs/dev3/lib/python3.10/site-packages/deepgram/clients/live/v1/client.py", line 146, in _listening
    message = self._socket.recv()
  File "/Users/tom/my/lib/mambaforge/envs/dev3/lib/python3.10/site-packages/websockets/sync/connection.py", line 201, in recv
    raise self.protocol.close_exc from self.recv_events_exc
websockets.exceptions.ConnectionClosedError: received 1011 (internal error) Deepgram did not receive audio data or a text message within the timeout window. See https://dpgr.am/net0001; then sent 1011 (internal error) Deepgram did not receive audio data or a text message within the timeout window. See https://dpgr.am/net0001
Exception in _processing: received 1011 (internal error) Deepgram did not receive audio data or a text message within the timeout window. See https://dpgr.am/net0001; then sent 1011 (internal error) Deepgram did not receive audio data or a text message within the timeout window. See https://dpgr.am/net0001
deadlock!
dvonthenen commented 4 months ago

There shouldn't be a deadlock with the mutex there. The mutex is to prevent multiple threads from sending messages at the same time.

Do you believe you are seeing this for async or threaded client? If it's for the async, this https://github.com/deepgram/deepgram-python-sdk/issues/301 is potentially a known issue that might appear to cause this. Do you have logs that we can look at?

tomprimozic commented 4 months ago

I added a minimal reproducible example.

The deadlock happens when there's an exception raised by the self._socket.send(...) method, because the socket has already been closed.

The with ...: statement will release the lock even in case of exceptions (similar to a finally: block).

dvonthenen commented 4 months ago

Thanks for reporting the issue with the example! That was super helpful! I appreciate it a ton!

This should address the issue without removing the send mutex: https://github.com/deepgram/deepgram-python-sdk/pull/306

tomprimozic commented 4 months ago

Sure but you should still make this change. This code

lock.acquire()
do_something_that_might_cause_an_exception()
lock.release()

is just wrong (see Python official docs here).

Also, your PR further complicates the algorithm, and doesn't even use locks properly!

(Also, my code doesn't remove the send mutex, it just uses it correctly.)

dvonthenen commented 4 months ago

Ahhhh shoot.... flipping back and forth between languages is really messing with me. You are right.

EDIT: Let me re-think the exception cases though something still isn't right with it.

dvonthenen commented 4 months ago

Thanks for your PR!