aio-libs / aiohttp

Asynchronous HTTP client/server framework for asyncio and Python
https://docs.aiohttp.org

aiohttp.ClientSession() does not send TCP FIN #4685

Open vmisupov opened 4 years ago

vmisupov commented 4 years ago

🐞 Describe the bug I tried to create an async web client and found that netstat on my OSX 10.14 machine shows many connections in the TIME_WAIT state. I started Wireshark and saw that TCP sessions do not finish correctly: the client should send a TCP FIN and get a TCP ACK from the server, but the client never sends the FIN.

💡 To Reproduce Run the Python code below; when the script finishes, check the network connections to your local web server (192.168.1.1 in my case) in the netstat output.

xeon$ netstat -na -p tcp | grep 192.168.1.1
tcp4 0 0 192.168.107.29.58842 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58843 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58844 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58845 192.168.1.1.80 TIME_WAIT
tcp4 0 0 192.168.107.29.58846 192.168.1.1.80 TIME_WAIT


import asyncio
import random
import time
import aiohttp

STATIC_ADDR='http://192.168.1.1/'

# make one HTTP request and print result to console
async def request_webserv(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            print(resp.status)
            print(await resp.text())

# worker: repeatedly take an item from the queue and run request_webserv
async def worker(name, queue):
    while True:
        # Get a "work item" out of the queue.
        url = await queue.get()
        await request_webserv(url)
        queue.task_done()

# create a queue, fill it with 20 identical links, then start 5 workers
# that perform the 20 HTTP requests between them
async def main():
    queue = asyncio.Queue(maxsize=100)

    for _ in range(20):
        queue.put_nowait(STATIC_ADDR)

    # Create worker tasks to process the queue concurrently.
    tasks = []
    for i in range(5):
        task = asyncio.create_task(worker(f'worker-{i}', queue))
        tasks.append(task)

    started_at = time.monotonic()
    await queue.join()
    total_spent_for = time.monotonic() - started_at

    # Cancel our worker tasks.
    for task in tasks:
        task.cancel()
    # Wait until all worker tasks are cancelled.
    await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(main())


💡 Expected behavior

I expect the client to close each TCP connection cleanly, sending a FIN and receiving an ACK from the server.

📋 Logs/tracebacks

📋 Your version of the Python

$ python --version
Python 3.7.6

📋 Your version of the aiohttp/yarl/multidict distributions

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.6.2
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author: Nikolay Kim
Author-email: fafhrd91@gmail.com
License: Apache 2
Location: /usr/local/lib/python3.7/site-packages
Requires: async-timeout, attrs, chardet, yarl, multidict
Required-by: 
$ python -m pip show multidict
Name: multidict
Version: 4.7.5
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /usr/local/lib/python3.7/site-packages
Requires: 
Required-by: yarl, aiohttp
$ python -m pip show yarl
Name: yarl
Version: 1.4.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /usr/local/lib/python3.7/site-packages
Requires: multidict, idna
Required-by: aiohttp

📋 Additional context

client library
socketpair commented 4 years ago

In general, FIN is sent only when the last reference on a socket file descriptor is voluntarily closed by close(). In other conditions, RST is sent. Try gc.collect() at the end of the program. If it helps, then .close() is missing somewhere. If still not, then there is something heavier, like socket object leakage.
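The FIN-versus-RST distinction described above can be demonstrated with plain sockets, independent of aiohttp. A minimal sketch: by default, `close()` performs an orderly shutdown (the kernel sends FIN), while enabling `SO_LINGER` with a zero timeout makes `close()` abort the connection with RST instead.

```python
import socket
import struct

def make_rst_on_close(sock: socket.socket) -> None:
    # struct linger { int l_onoff; int l_linger; } -> onoff=1, linger=0:
    # close() now aborts the connection with RST instead of sending FIN.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
make_rst_on_close(sock)
# Read the option back to confirm the linger setting took effect.
onoff, linger = struct.unpack(
    "ii", sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
print(onoff, linger)  # 1 0
sock.close()
```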

vmisupov commented 4 years ago

gc.collect() did not help. I additionally checked on Ubuntu 16.04 LTS; the behaviour is the same.

netstat -an -p tcp | grep .80
tcp4       0      0  192.168.1.132.56201    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56198    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56197    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56199    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56200    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56204    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56203    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56202    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56205    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56207    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56206    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56209    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56208    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56210    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56213    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56212    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56214    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56215    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56216    192.168.1.1.80         TIME_WAIT  
tcp4       0      0  192.168.1.132.56217    192.168.1.1.80         TIME_WAIT 
vmisupov commented 4 years ago

> In general, FIN is sent only when the last reference on a socket file descriptor is voluntarily closed by close(). In other conditions, RST is sent. Try gc.collect() at the end of the program. If it helps, then .close() is missing somewhere. If still not, then there is something heavier, like socket object leakage.

I checked on Ubuntu and confirmed that gc.collect() does not work. Could this behaviour of the session be intentional? I looked at the ClientSession code: it references class TCPConnector, which has a parameter force_close=False (which, judging by its description, controls closing the connection after each request).
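For reference, the `force_close` parameter mentioned above can be passed through a custom `TCPConnector`. A minimal sketch (the URL is only illustrative, and `fetch_once` is a hypothetical helper name):

```python
import asyncio
import aiohttp

async def fetch_once(url: str) -> int:
    # force_close=True asks the connector to close each underlying TCP
    # connection after its response instead of pooling it for keep-alive.
    connector = aiohttp.TCPConnector(force_close=True)
    async with aiohttp.ClientSession(connector=connector) as session:
        async with session.get(url) as resp:
            return resp.status
```

This does not change how a connection is closed (FIN vs RST); it only prevents connections from being kept alive between requests.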

tonyroberts commented 10 months ago

I was looking into what I think is the same issue and I noticed something. In asyncio.proactor_events._ProactorBasePipeTransport, socket.shutdown() is called in _call_connection_lost, but in asyncio.selector_events._SelectorTransport the socket is closed without shutdown() being called first.

Could this explain why the aiohttp client is not sending FIN on Linux/macOS, even though the connection is closed? I think that close() just sends an RST packet, whereas shutdown() sends a FIN packet.

_ProactorBasePipeTransport

    def _call_connection_lost(self, exc):
        if self._called_connection_lost:
            return
        try:
            self._protocol.connection_lost(exc)
        finally:
            # XXX If there is a pending overlapped read on the other
            # end then it may fail with ERROR_NETNAME_DELETED if we
            # just close our end.  First calling shutdown() seems to
            # cure it, but maybe using DisconnectEx() would be better.
            if hasattr(self._sock, 'shutdown') and self._sock.fileno() != -1:
                self._sock.shutdown(socket.SHUT_RDWR)
            self._sock.close()
            self._sock = None
            server = self._server
            if server is not None:
                server._detach()
                self._server = None
            self._called_connection_lost = True

_SelectorTransport

    def _call_connection_lost(self, exc):
        try:
            if self._protocol_connected:
                self._protocol.connection_lost(exc)
        finally:
            self._sock.close()
            self._sock = None
            self._protocol = None
            self._loop = None
            server = self._server
            if server is not None:
                server._detach()
                self._server = None

I've tried to fix the problem I've been having by passing a custom connector to aiohttp.ClientSession that always calls shutdown() on the underlying socket before it is closed. That seems to have worked in my case, and I now have no connections left in the TIME_WAIT state. However, I'm not really familiar enough with the internals of aiohttp to know exactly where a similar fix should be applied, or whether it is something that needs to be fixed in asyncio, so I thought I'd leave this comment in case it is of any help.
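The shutdown-before-close idea described above can be illustrated at the plain-socket level. A sketch (`close_with_fin` is a hypothetical helper name, not an aiohttp or asyncio API): shutdown() transmits FIN to the peer before close() releases the file descriptor, so the peer sees an orderly end-of-stream.

```python
import socket

def close_with_fin(sock: socket.socket) -> None:
    # Orderly shutdown: the peer receives FIN and reads EOF.
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # the peer may already have disconnected
    sock.close()

a, b = socket.socketpair()
close_with_fin(a)
print(b.recv(16))  # b'' -> the peer observed an orderly end-of-stream
b.close()
```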

x0day commented 2 months ago

@bdraco could you check this issue?

I encountered some cleanup issues when using aiohttp and uvloop in some high-concurrency scenarios. I also do the close work after the response, like this:

session = aiohttp.ClientSession()
resp = await session.get(...)
try:
    ...
finally:
    resp.release()
    await resp.wait_for_close()
    await session.close()


When I started troubleshooting, I found that there were a lot of TIME_WAIT connections on the server, just like the ones mentioned above. I am not sure whether the problem I encountered is related to the one above, but the original problem can still be reproduced now.

bdraco commented 2 months ago

@x0day

It shouldn't be possible for file descriptors to get reused at aiohttp's level unless there is a bug in cpython or uvloop. That assumes nothing in aiohttp is bypassing the normal socket APIs (and I have not found anything that does).

We will need a minimal reproducer that could be turned into a test to better understand this issue.

This seems like a different issue.

Dreamsorcerer commented 2 months ago

I can see the TIME_WAIT as described above, but adding sock.shutdown() into _call_connection_lost() doesn't seem to resolve it. I'm not really familiar enough with the socket code to know how it should work.