Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License
4.63k stars 2.84k forks source link

A Question: What happens when we never close the AMPQ connection? #36212

Closed cabal-daniel closed 4 months ago

cabal-daniel commented 4 months ago

Consider the following code

from azure.identity import DefaultAzureCredential
from azure.servicebus import ServiceBusClient, ServiceBusMessage
from datetime import datetime, timedelta

client = ServiceBusClient(
    fully_qualified_namespace='foobar',
    credential=DefaultAzureCredential(),
)

start = datetime.now()
sender_context = client.get_queue_sender(queue_name='hello')
sender = sender_context.__enter__()
checkpoint_1 = datetime.now()
sender.send_messages(ServiceBusMessage('hi'))
checkpoint_2 = datetime.now()
sender = sender_context.__exit__()
checkpoint_3 = datetime.now()

print(f'initiating took {(checkpoint_1 - start) / timedelta(milliseconds=1)}ms')
print(f'sending took {(checkpoint_2 - checkpoint_1) / timedelta(milliseconds=1)}ms')
print(f'closing took {(checkpoint_3 - checkpoint_2) / timedelta(milliseconds=1)}ms')

The results are

initiating took 1255.869ms
sending took 77.747ms
closing took 804.139ms

If a process never calls sender_context.__exit__() because the process gets SIGTERM-ed, would that cause problems?

github-actions[bot] commented 4 months ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

kashifkhan commented 4 months ago

Hi @cabal-daniel,

our typical usage pattern is to use the context manager ( with ) or simply managing the opening / closing of the client by yourself, so the code ends up like this.

It shouldn't impact anything but you will have sockets left open, other pieces of code that might be running. In general it is a good idea to call close just in case.

from azure.identity import DefaultAzureCredential
from azure.servicebus import ServiceBusClient, ServiceBusMessage
from datetime import datetime, timedelta

client = ServiceBusClient(
    fully_qualified_namespace='foobar',
    credential=DefaultAzureCredential(),
)

start = datetime.now()
sender_context = client.get_queue_sender(queue_name='hello')
checkpoint_1 = datetime.now()
sender.send_messages(ServiceBusMessage('hi')) # this will open up the connection if not open
checkpoint_2 = datetime.now()
sender = sender_context.close()
checkpoint_3 = datetime.now()
github-actions[bot] commented 4 months ago

Hi @cabal-daniel. Thank you for opening this issue and giving us the opportunity to assist. We believe that this has been addressed. If you feel that further discussion is needed, please add a comment with the text "/unresolve" to remove the "issue-addressed" label and continue the conversation.

cabal-daniel commented 4 months ago

@kashifkhan the problem is that opening the connection and closing the connection takes a considerable amount of time

initiating took 1255.869ms
sending took 77.747ms
closing took 804.139ms

For our pods, it seems easier to just keep the connection open and detect if the connection is closed on the other end then re-open so sending messages takes 77ms rather than 3000ms if we're using SB to build a DAG.

kashifkhan commented 4 months ago

@cabal-daniel thank you for that context. There are some things to note here

cabal-daniel commented 4 months ago

@kashifkhan thank you for this helpful information.

Could you comment on the following type of architecture for distributed compute?

  1. Start worker
  2. Open all the queue senders / AMPQ connections
  3. Worker does it's job for a long time
  4. SIGTERM
  5. Use the atexit library to close all the connections

Between steps 2 to 5, the workers could be alive for days or weeks.

kashifkhan commented 4 months ago

@cabal-daniel

In general the strategy looks fine, just wanted to add a few notes:

cabal-daniel commented 4 months ago

Got it -- thanks @kashifkhan ! That makes sense.