aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org

asyncio SSL transport error log after session.close() called from atexit #7328

Closed alexhsamuel closed 1 year ago

alexhsamuel commented 1 year ago

Describe the bug

asyncio logs Fatal error on SSL transport at level ERROR on Python shutdown, after session.close() is called from an atexit handler.

To Reproduce

# Client wrapper.  Would be in its own module.  Use of aiohttp is
# implementation detail.

import aiohttp
import asyncio
import atexit
import functools

@functools.cache
def _get_session():
    session = aiohttp.ClientSession()

    async def _close(session):
        await session.close()
        assert session.closed

    atexit.register(lambda: asyncio.run(_close(session)))

    return session

class Client:
    async def get(self):
        async with _get_session().get("https://google.com") as rsp:
            return rsp.status

#-------------------------------------------------------------------------------
# Top-level app.

import asyncio
import logging

async def main():
    client = Client()
    print(await client.get())

if __name__ == "__main__":
    logging.basicConfig()
    asyncio.run(main())

Expected behavior

No logging on shutdown.

Logs/tracebacks

Program returns exit status 0 but logs at shutdown.

$ python app1.py
200
ERROR:asyncio:Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7f1b2ed58f70>
transport: <_SelectorSocketTransport closing fd=6>
Traceback (most recent call last):
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/selector_events.py", line 924, in write
    n = self._sock.send(data)
OSError: [Errno 9] Bad file descriptor

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/sslproto.py", line 690, in _process_write_backlog
    self._transport.write(chunk)
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/selector_events.py", line 930, in write
    self._fatal_error(exc, 'Fatal write error on socket transport')
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/selector_events.py", line 725, in _fatal_error
    self._force_close(exc)
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/selector_events.py", line 737, in _force_close
    self._loop.call_soon(self._call_connection_lost, exc)
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
    self._check_closed()
  File "/home/alex/.pyenv/versions/3.10.11/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

Python Version

$ python --version
Python 3.10.11

aiohttp Version

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.8.4
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: 
Author-email: 
License: Apache 2
Location: /home/alex/dev/apsis/env/lib/python3.10/site-packages
Requires: aiosignal, async-timeout, attrs, charset-normalizer, frozenlist, multidict, yarl
Required-by: apsis

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 5.2.0
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /home/alex/dev/apsis/env/lib/python3.10/site-packages
Requires: 
Required-by: aiohttp, sanic, yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.9.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /home/alex/dev/apsis/env/lib/python3.10/site-packages
Requires: idna, multidict
Required-by: aiohttp

OS

Arch Linux

Related component

Client

Additional context

My intent is to use aiohttp in a client library as implementation detail. Moving management of the ClientSession to the library's users is not an option. If there is a better way to shut down aiohttp cleanly at program exit without involving the caller, I'm not tied to atexit.

I've read #1925 and tried adding the all_is_lost mechanism described here after shutdown in my atexit function. This detects one transport and then hangs indefinitely; neither connection_lost nor eof_received is ever entered.

One additional, rather mysterious point: in my larger application (though not with the sample code above), I see different behavior depending on whether .pyc files are present/fresh or not. No joke—I've verified this pretty carefully.

Thank you in advance for your help!


Dreamsorcerer commented 1 year ago

I don't think what you're asking for is possible in a reasonable manner.

The well-engineered solution is to get the user to manage the lifecycle (if it is tied to something they are using). That doesn't mean you need to expose ClientSession to them directly, but your API (Client in your example) should implement __aenter__/__aexit__ or similar and pass the calls to ClientSession.
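
A minimal sketch of that shape, reusing the Client name from the reproduction above (the wrapper itself is illustrative, not an aiohttp API):

import aiohttp

class Client:
    """Wraps aiohttp as an implementation detail; callers never see ClientSession."""

    async def __aenter__(self) -> "Client":
        self._session = aiohttp.ClientSession()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self._session.close()

    async def get(self) -> int:
        async with self._session.get("https://google.com") as rsp:
            return rsp.status

A caller would then write async with Client() as client: ... inside their running loop, and the session is closed on that same loop when the block exits.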

Using atexit seems like a very wrong way to close a session for a library. A library shouldn't assume the lifecycle of an application, so it could be the case that an application is left running for a long time after they've stopped using your Client, but without the session getting closed (e.g. a web application that runs for weeks, but only uses your client briefly on startup).

alexhsamuel commented 1 year ago

The well-engineered solution is to get the user to manage the lifecycle

I can see how this is appealing from a low-level library creator's standpoint, but it is definitely not good engineering from the perspective of software higher up in the stack. Exposing all the specific resource management requirements of (transitive) libraries breaks encapsulation, complicates usability, and renders some software architectures impossible.

Practically no software component takes 100% explicit responsibility for managing all the resources it uses. (What if Python required you to pass in a logger and clean it up? What if Python required you to have an allocator handy every time you instantiate an object?) Allocating resources from a global source and having them released automatically at process shutdown is an extremely practical, time-honored, and widely-used mechanism for dealing with many resources.

May I please ask, what is the technical obstacle to this in aiohttp? I'm using a shared session as the docs instruct, I'm causing its close() method to be called as the docs instruct, and I've tried the various sleep/connection lost-counting workarounds in #1925. I'd be curious to know why there is still a task scheduled that writes to a socket.

Dreamsorcerer commented 1 year ago

Exposing all the specific resource management requirements of (transitive) libraries breaks encapsulation, complicates usability, and renders some software architectures impossible.

I'd argue that using atexit to cleanup library details is breaking encapsulation and complicating usability. If you want a system where the user doesn't handle the lifecycle of the application, then you want to build a framework, at which point you have full control of the application lifecycle and can easily close the session at application shutdown (e.g. like what aiohttp does for the server component, using web.run_app() to run the application).
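
For reference, a rough sketch of that server-side pattern, using aiohttp's cleanup_ctx to tie a shared ClientSession to the application's lifecycle (the handler and app-key names here are made up for illustration):

import aiohttp
from aiohttp import web

async def client_session_ctx(app: web.Application):
    # startup: create the shared session on the application's loop
    app["client_session"] = aiohttp.ClientSession()
    yield
    # cleanup: runs when web.run_app() shuts the application down
    await app["client_session"].close()

def make_app() -> web.Application:
    app = web.Application()
    app.cleanup_ctx.append(client_session_ctx)
    return app

if __name__ == "__main__":
    web.run_app(make_app())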

What if Python required you to pass in a logger and clean it up?

You seem to be talking about memory allocation, which is handled by garbage collection, not OS resources. If you use a FileHandler, for example, in your logging, you are expected to close() it. For example, see the end of this example: https://docs.python.org/3/howto/logging-cookbook.html#sending-logging-messages-to-email-with-buffering
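
A trivial sketch of that expectation (not taken from the linked cookbook example):

import logging

handler = logging.FileHandler("app.log")  # holds an open file descriptor
log = logging.getLogger("example")
log.addHandler(handler)
log.error("something happened")
handler.close()  # the OS resource is released explicitly, not by the GC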

May I please ask, what is the technical obstacle to this in aiohttp?

Nothing. The problem is the architecture of your library conflicting with the architecture of asyncio. The primary issue in your example is that your second call to asyncio.run() will make the .close() call in a new loop, not the loop the session was created in. This will just error or fail to do anything useful.

Again, a well-engineered solution would ensure that the session is created and closed within a single asyncio.run() call. The design of asyncio is that your entire async application should be run and completed in that one call. You could change your example to use asyncio.new_event_loop() and then use loop.run_until_complete() to run it in the same loop (or maybe do something with asyncio.Runner). But, that involves messing with the low-level API, which is not recommended. It would also require the user to conform to this weirdness too, which would be far more complex and more of a burden than simply asking them to add async with to their code (like just about every other async library that uses OS resources).
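
In its most reduced form, the pattern being described is just the following (ClientSession shown directly rather than behind the Client wrapper, for brevity):

import asyncio
import aiohttp

async def main():
    # The session is created and closed inside the same asyncio.run() call,
    # i.e. on the same event loop, so nothing is left over at shutdown.
    async with aiohttp.ClientSession() as session:
        async with session.get("https://google.com") as rsp:
            print(rsp.status)

if __name__ == "__main__":
    asyncio.run(main())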

alexhsamuel commented 1 year ago

I respectfully disagree. An async context manager is only one of many legitimate models for resource management; there are others, including free-on-shutdown. I am not willing to impose this model on all libraries that use HTTP or might possibly use it in the future, and on their own transitive users.

I also respectfully ask that you stop implying that anything other than your preferred architecture is poorly engineered. This is simply not the case.

The problem is the architecture of your library conflicting with the architecture of asyncio.

Not really; it conflicts with aiohttp. I'll choose a different HTTP library. Thanks for your guidance, in any case.

Dreamsorcerer commented 1 year ago

including free-on-shutdown.

Calling .close() at the end of your application is a valid approach, but you need to expose that method to the user to allow them to use that approach.

I am not willing to impose this model on all libraries that use HTTP or might possibly use it in the future, and on their own transitive users.

You are actually removing control of the lifecycle from the user, which is restricting valid software architectures the user can use, instead of giving them freedom. If an async context manager is a valid way to handle the life cycle, why are you so insistent that a user should not be allowed to do this with your library?

The problem is the architecture of your library conflicting with the architecture of asyncio.

Not really; it conflicts with aiohttp. I'll choose a different HTTP library. Thanks for your guidance, in any case.

No, it is asyncio. All other async HTTP libraries will still have a .close() (or __aexit__()). All of them will have the exact same issue with your example. Same for DB libraries etc.

If you can find any popular async library, or anything in stdlib, that does not work like this, I'm happy to be proven wrong and learn about new architectures, but I'm pretty sure what you are asking for does not work well with the architecture of asyncio.

I'm curious why you're so bothered by this in the async version, though; I noticed in your PR that you never bothered to close the requests session properly in the original code.

alexhsamuel commented 1 year ago

If you can find any popular async library, or anything in stdlib, that does not work like this, I'm happy to be proven wrong and learn about new architectures, but I'm pretty sure what you are asking for does not work well with the architecture of asyncio.

HTTPX works fine for me; it was a simple replacement.

I admit that graceful shutdown of external resources on process termination is difficult, especially when faced with poorly-defined shutdown order. It is, however, possible; I've done it myself in various contexts. Part of it is the fine art of choosing which errors are safely ignorable at shutdown (only).

Dreamsorcerer commented 1 year ago

HTTPX works fine for me; it was a simple replacement.

I can assure you that, with the same example, you are not closing the httpx session correctly. It is using asyncio sockets/transports, and those are tied to an event loop. Some implementation detail may mean that you don't see errors when calling .close(), but it's definitely not doing what you asked for.

Part of it is the fine art of choosing which errors are safely ignorable at shutdown (only).

Many of us like to close things cleanly and not have any errors. This requires a library to expose controls for its lifecycle, which is why I keep suggesting you do this. Many users who don't care about this will just ignore it and not close it explicitly (as in your original requests code); aiohttp (and httpx etc.) should still run adequately.

Exposing the controls gives users the freedom to manage the lifecycle however they like, or not at all.

alexhsamuel commented 1 year ago

If I write a library that uses a library that uses a library that may one day decide to use aiohttp for something (I donno, log metrics), then all these libraries have to be async context managers. It follows pretty quickly that every library has to be an async context manager. I don't wish to go there. I don't consider abruptly closing a socket on process shutdown necessarily to be a problem; Linux, TCP, TLS, and every server handle this just fine.

We're unlikely to converge, so I'm signing off. Thanks again for responding, anyway.

Dreamsorcerer commented 1 year ago

library that may one day decide to use aiohttp

Or any async HTTP library, then yes. But, that library would also need to change its API to become async, so it's not like it'll just suddenly happen without you noticing. asyncio and sync libraries are sort of like separate ecosystems that can't be mixed easily.

I don't consider abruptly closing a socket on process shutdown necessarily to be a problem

Then don't bother trying to close the session at all? I'm still not sure why you refuse to let any of your users be concerned about this though. Again, you're only restricting users, which seems to be the opposite of what you were claiming originally.

alexhsamuel commented 1 year ago

Then don't bother trying to close the session at all?

Then I get other asyncio shutdown errors.

I am a user of my library, I own main(), and I want to release resources automatically at shutdown without errors on the console, instead of making everything in the stack an async context manager. This doesn't seem to be possible with aiohttp.

Dreamsorcerer commented 1 year ago

Then I get other asyncio shutdown warnings.

The warnings are related to not closing the session properly. If you don't care, then ignore the warnings...

I am a user of my library, I own main(), and I want to release resources automatically at shutdown instead of making everything in the stack an async context manager. This doesn't seem to be possible with aiohttp.

If you are the user, then you can call .close() at the end of your application. Something like:

async def main():
    client = ...
    try:
        ...
    finally:
        await client.close()

If you were insistent on doing it with atexit, you could again use the low-level loop.run_until_complete(). The options are there to architect it however you like, if the library exposes the methods to do so. The approach you're currently using, unfortunately, makes all of these options impossible for the user.

Dreamsorcerer commented 1 year ago

One last thought, you might be able to do it like this (though I still strongly discourage it):

async def _get_session():
    loop = asyncio.get_running_loop()
    ...
    atexit.register(lambda: loop.run_until_complete(session.close()))

The closure should ensure you are using the same loop to run the close() method. atexit should only occur outside of a running loop, so it should work OK.
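
Filled out against the original reproduction (keeping the synchronous functools.cache wrapper), that sketch might look like this; it still assumes the loop is merely dropped, not closed, before the atexit handler runs:

import asyncio
import atexit
import functools

import aiohttp

@functools.cache
def _get_session() -> aiohttp.ClientSession:
    # Must be called while a loop is running (e.g. from Client.get()).
    loop = asyncio.get_running_loop()
    session = aiohttp.ClientSession()
    # The lambda captures the creating loop, so close() runs on that same loop.
    atexit.register(lambda: loop.run_until_complete(session.close()))
    return session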

alexhsamuel commented 1 year ago

Thanks. This works only if the loop was dropped without closing, not if it was already closed. So, it works with something like,

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())

but not with

if __name__ == "__main__":
    asyncio.run(main())  # this closes the loop :(

I did some more reading; my understanding of the issue is that the fds are registered to a particular loop, so any other loop won't get wakeups for them, and therefore can't really do much with them. Is that correct?

Seems to me what we really need is lifecycle callbacks for event loops. I don't see any standard way to do it, but here's a monkeypatch solution. I'm sure there are all sorts of problems with it, so consider it a sketch.

# Installs a custom event loop policy to add `on_close()` lifecycle functions to
# an event loop.

import asyncio

def _install():
    """
    Monkeypatches the event loop policy to augment new event loops with
    `on_close` behavior.
    """
    policy = asyncio.get_event_loop_policy()
    old_new_event_loop = policy.new_event_loop

    def new_event_loop():
        """
        Monkeypatches a new event loop with an `on_close()` method to
        register awaitables to call on close (in reverse order), and wraps its
        `close()` method to call these.
        """
        loop = old_new_event_loop()
        loop.__on_close = []
        old_close = loop.close

        def on_close(aw):
            loop.__on_close.append(aw)

        def close():
            for aw in reversed(loop.__on_close):
                loop.run_until_complete(aw)
            return old_close()

        loop.on_close = on_close
        loop.close = close
        return loop

    policy.new_event_loop = new_event_loop
    asyncio.set_event_loop_policy(policy)

_install()

#-------------------------------------------------------------------------------
# Client wrapper.  Would be in its own module.  Use of aiohttp is
# implementation detail.

import aiohttp
import functools

@functools.cache
def _get_session():
    session = aiohttp.ClientSession()
    asyncio.get_event_loop().on_close(session.close())
    return session

class Client:
    async def get(self):
        async with _get_session().get("https://google.com") as rsp:
            return rsp.status

#-------------------------------------------------------------------------------
# Top-level app.

import logging

async def main():
    client = Client()
    print(await client.get())

if __name__ == "__main__":
    logging.basicConfig()
    asyncio.run(main())

Dreamsorcerer commented 1 year ago

Thanks. This works only if the loop was dropped without closing, not if it was already closed.

Yes, that'd be a problem; I wasn't sure if it would definitely work or not.

Seems to me what we really need is lifecycle callbacks for event loops. I don't see any standard way to do it, but here's a monkey solution. I'm sure there are all sorts of problems with it, so consider it a sketch.

I'm not sure there's any effort to add anything like this to asyncio, but you're welcome to ask in cpython if there's interest. Monkeypatching asyncio is definitely a lot more likely to cause issues for end users than the other solutions I've suggested (async with etc.).