Azure / azure-uamqp-python

AMQP 1.0 client library for Python
MIT License
57 stars 48 forks source link

Keepalive thread continues running after main thread has stopped #197

Closed terricain closed 2 years ago

terricain commented 3 years ago

Describe the bug If the ServiceBusSender is not closed before the main thread exits, the program will hang as the keepalive thread continues to run. Whilst closing resources does solve this, marking the keepalive thread as a daemon thread allows programs to exit regardless of cleanup

To Reproduce Steps to reproduce the behavior:

from azure.servicebus import ServiceBusClient, ServiceBusMessage

c: ServiceBusClient = ServiceBusClient.from_connection_string('Endpoint=...')
pub = c.get_queue_sender('celery')
msg = ServiceBusMessage("test1234")
pub.send_messages(msg)
print('Sent messages, should exit here')
# instead it hangs

Expected behavior Program to exit

Additional context I know if you add self._keep_alive_thread.daemon = True before https://github.com/Azure/azure-uamqp-python/blob/master/uamqp/client.py#L262 the thread start, this problem is solved, but I'm not sure what side effects this could have.

If this is an acceptable solution then I'm happy to make a PR for this.

yunhaoling commented 3 years ago

hey @terrycain , thanks for the findings. I'll look into this and see if daemon = True could help resolve the issue without side effects

terricain commented 3 years ago

@yunhaoling any progress with this?

yunhaoling commented 3 years ago

hey @terrycain, I confirmed that setting self._keep_alive_thread to be a daemon thread would prevent the program from hanging if the service bus sender or client is not closed properly.

According to the description in python doc:

Note Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.

I believe here as part of the amqp protocol implementation, the network connection should be gracefully shutdown -- the underlying library uamqp depends on is C based, if resources are not deallocated properly, it might trigger segmentation fault.

The approach of using context manager or manually calling close method are the best practices.

@annatisch, do you have any thoughts on whether or not to set the keep_alive_thread to be a daemon thread?

JonasKs commented 2 years ago

Hi, any status on this @yunhaoling ?
In general, containers should be allowed to crash and be restarted by k8s, but with this thread hanging, a container will never crash and not restart properly, which is very unfortunate.

At the moment the workaround is to monitor processes with health checks, but if there is a better way to do this please let me know. (A few suggestions on SO here)

EDIT: Don't seem like tihs is the issue in the async version. However, we seem to have the same end result. I'll have to investigate more.

yunhaoling commented 2 years ago

hey @JonasKs , I have a PR https://github.com/Azure/azure-uamqp-python/pull/293 out for the update, I'm going to test it and if everything works fine it will be shipped in our next release.

JonasKs commented 2 years ago

Hi @yunhaoling , we patched our containers to use azure-servicebus==7.4.0, which again forces version 1.5.0 of this package.

We experienced an exception tonight, and our container finally properly crashed and restarted as expected. It was no longer in a zombie state.

I believe this ticket can be closed. Thank you!

yunhaoling commented 2 years ago

closing the issue as it has already been fixed