Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License
4.53k stars 2.76k forks source link

aiohttp.ClientSession connection leaking in SharedTokenCacheCredential when passing session in DefaultAzureCredential() #36959

Closed arvind-sarvam closed 2 days ago

arvind-sarvam commented 3 weeks ago

Describe the bug I'm using container client to perform operations on blob storage. The container client is initialized as:

class AzureStorage:

    def __init__(self, account_url, container_name):
        self.credential = DefaultAzureCredential(session=aiohttp.ClientSession())

        self.container_client = ContainerClient(
            account_url=account_url,
            container_name=container_name,
            credential=self.credential,
        )

    async def shutdown(self):
        await self.credential.close()
        await self.container_client.close()

)

On application shutdown, I invoke shutdown() that closes the client & credential:

However, even after shutdown() when I print objects (using gc.get_objects()), I can see that there is a ClientSession object still held by SharedTokenCacheCredential. I confirmed it with objgraph. client_session

On debugging, I found that the session stored inside self._client_kwargs["session"] is not getting closed here

To Reproduce Run this script:

import asyncio
import gc
import tracemalloc
from collections import defaultdict

import aiohttp
import objgraph
from azure.identity.aio import DefaultAzureCredential
from azure.storage.blob.aio import ContainerClient

class AzureStorage:

    def __init__(self, account_url, container_name):
        self.credential = DefaultAzureCredential(session=aiohttp.ClientSession())

        self.container_client = ContainerClient(
            account_url=account_url,
            container_name=container_name,
            credential=self.credential,
        )

    async def shutdown(self):
        await self.credential.close()
        await self.container_client.close()

def get_objects_in_memory(trace_type=None, draw=False):
    # Start tracing memory allocations
    tracemalloc.start()

    # Force a garbage collection to get accurate results
    gc.collect()

    # Get all objects tracked by the garbage collector
    objects = gc.get_objects()

    # Create a defaultdict to store counts of each type
    type_counts = defaultdict(int)

    # Count objects by type and store ClientSession objects
    for obj in objects:
        obj_type = type(obj).__name__
        type_counts[obj_type] += 1
        if obj_type == trace_type:
            if draw:
                objgraph.show_backrefs([obj], filename="client_session.png")

    return type_counts

async def main():

    storage = AzureStorage("<account-url>", "<container-name>")

    print("\nCurrent ClientSession objects count: ", get_objects_in_memory(trace_type="ClientSession")["ClientSession"])

    print("\n\n\n\n Closing connections")

    await storage.shutdown()

    print("\nCurrent ClientSession objects count: ", get_objects_in_memory(trace_type="ClientSession", draw=True)["ClientSession"]) # I expect this to be 0

if __name__ == '__main__':
    asyncio.run(main())

Expected behavior I expect all relevant client sessions to be closed on invoking shutdown()

Screenshots Added in-line above

Additional context When I add these lines to this

    async def close(self) -> None:
        """Close the credential's transport session."""
        if "session" in self._client_kwargs and self._client_kwargs["session"]:
            await self._client_kwargs["session"].close()
            self._client_kwargs["session"] = None
        if self._client:
            await self._client.__aexit__()  # type: ignore

I can see that this closes the connection properly, and GC count shows 0 which is expected. I'm not a 100% sure if this is the right fix, please check and let me know.

github-actions[bot] commented 3 weeks ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jalauzon-msft @vincenttran-msft.

xiangyan99 commented 3 weeks ago

Thanks for reaching out.

DefaultAzureCredential manages sessions for you.

Could you share the reason why you create a session instance when you create DefaultAzureCredential?

github-actions[bot] commented 3 weeks ago

Hi @arvind-sarvam. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

arvind-sarvam commented 3 weeks ago

Hi, we saw some connection leak inside the library in the session created per Request. It happens sort of randomly. For many requests, there is no problem, suddenly we see a ClientSession that is not closed. This also gave a lot of : "Unclosed client session" errors.

Due to the above, I chose to pass a session to DefaultAzureCredential. Another added benefit from this is performance gain for requests since this is long lived in nature.

On Wed, Aug 21, 2024, 9:33 PM github-actions[bot] @.***> wrote:

Hi @arvind-sarvam https://github.com/arvind-sarvam. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

— Reply to this email directly, view it on GitHub https://github.com/Azure/azure-sdk-for-python/issues/36959#issuecomment-2302448392, or unsubscribe https://github.com/notifications/unsubscribe-auth/BJD4O6CZ3EHVT26MWVHGLALZSS24NAVCNFSM6AAAAABM2KCZY6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMBSGQ2DQMZZGI . You are receiving this because you were mentioned.Message ID: @.***>

xiangyan99 commented 3 weeks ago

Thanks for the information.

Do you want to manage the lifecycle of the session object or not?

If you want to manage the lifetime of the session instance, you need to add session_owner=False when you create the client.

self.credential = DefaultAzureCredential(session=aiohttp.ClientSession(), session_owner=False)

and you need to close the session by yourself.

github-actions[bot] commented 3 weeks ago

Hi @arvind-sarvam. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

github-actions[bot] commented 2 weeks ago

Hi @arvind-sarvam, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!