psf / requests

A simple, yet elegant, HTTP library.
https://requests.readthedocs.io/en/latest/
Apache License 2.0
52.15k stars 9.32k forks

requests.Session doesn't properly handle closed keep-alive sessions #4664

Open eriol opened 6 years ago

eriol commented 6 years ago

Hello, this was reported for requests 2.18.4 on Debian BTS[¹] by Jonathan Lynch, since it's not Debian specific I'm forwarding here:

When a server reaps a keep-alive session it sends a FIN packet to the client. Normally, requests handles this fine and rebuilds the session on the next request. However, there is an edge case involving network latency that is not properly handled:

If python sends a request at roughly the same time as the server closes the session, then the server will send a RST (as the session is closed). Python receives this RST on what it thought was a valid session and throws an error:

requests.exceptions.ConnectionError: ('Connection aborted.',
RemoteDisconnected('Remote end closed connection without response',))

The reason I consider this a bug is because python received the FIN packet before it received the RST. As a result, it shouldn't be surprised when the connection is subsequently aborted. It is an edge case, but the client has enough information available to it that it could have handled it correctly.

The workaround is to set max_retries on the Session via an HTTPAdapter, but I believe the correct behavior when the FIN is received is to rebuild the session and re-send any in-flight requests (rather than throwing an error). Requests correctly handles the FIN packet if there are no in-flight requests, but if there are in-flight requests it ignores the FIN and instead throws an error.
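Concretely, the max_retries workaround described above might look like the following sketch (the retry count and URL prefixes are illustrative values, not recommendations):

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
# Passing an int enables urllib3's retry machinery; requests otherwise
# defaults to no retries at all. 3 is an illustrative budget.
adapter = HTTPAdapter(max_retries=3)
session.mount("https://", adapter)
session.mount("http://", adapter)
```

With this in place, a request that dies on a stale keep-alive connection is retried on a fresh connection instead of surfacing a ConnectionError immediately.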

I will ask Jonathan to continue the discussion here. Thanks!

[¹] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=899406

sigmavirus24 commented 6 years ago

Hi there @eriol, Thanks for passing this on. Can you share links to any custom patches Debian has written and is applying to Requests at this time?

Jonathan, can you share your system information (as requested from all bug reporters) as well as the full traceback of what you're trying to do?

The short answer may be that we can't help in this particular case since we're so far removed from the networking stack, unfortunately.

jimethn commented 6 years ago

Hi @sigmavirus24 , I was able to reproduce this on multiple systems. It first surfaced in our dockerized automated test environment (not sure what OS the containers are based on), and I was able to reproduce it running the same code locally on my Mac. I was eventually able to simplify the repro down to:

import requests
from time import sleep

import logging
logging.basicConfig(level=logging.DEBUG)

s = requests.Session()
s.verify = False  # self-signed cert

while True:
    s.get('https://the-server:8443')
    sleep(5)

The (nodejs) server being hit by the code prunes keepalives older than 5 seconds, so hitting it from Python at 5-second intervals with only ~30ms of network latency consistently reproduced the issue within a few loops.

Setting the sleep to either 4 seconds or 6 seconds removed the issue: at 4 seconds, the session was always refreshed before the server's prune interval, and at 6 seconds the server always pruned the session before the next request, causing python to (correctly) rebuild it. It was when the server pruned the session at the "same time" (as defined by network latency) as python reused it that trouble arose.

sigmavirus24 commented 6 years ago

Right, so that makes me suspect this is an issue in urllib3 more specifically than in Requests. It would still be helpful to have the traceback from that exception if you can share it, @jimethn.

jimethn commented 6 years ago

Sorry @sigmavirus24 , I think you are right in that the underlying urllib3 is responsible. Here is the stack trace, as requested:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.6/site-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.6/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 521, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

eriol commented 6 years ago

Hello @sigmavirus24, many thanks for the fast reply and sorry if I did not manage to reply earlier. Since version 2.18.1-1 I dropped all the patches for requests, but I still have some patches on urllib3:

https://sources.debian.org/patches/python-urllib3/1.22-1/

the biggest one is to not use the vendored six module.

bluworld commented 6 years ago

@eriol is that patch the solution to this issue? If not, what did you do to get the code to work?

eriol commented 6 years ago

@bluworld unfortunately no, the mentioned patches are unrelated to this issue.

reederz commented 6 years ago

Maybe this is not the best place to ask, but do you know if anybody has submitted the issue to urllib3? I personally couldn't find anything. I'm being hit by this issue quite a lot with the Apache2 web server, where the default keep-alive timeout is 5 seconds.

kharmabum commented 4 years ago

For what it's worth, I don't think there's any workaround to this besides catching the ConnectionError explicitly: urllib3.Retry has never retried on a ConnectionError (as far as I can tell), cf. https://github.com/urllib3/urllib3/blob/master/src/urllib3/util/retry.py#L9 cc @eriol
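Catching the ConnectionError explicitly, as described, might look like this minimal sketch; `get_with_retry` and its parameters are illustrative names, not part of the requests API:

```python
import requests

def get_with_retry(session, url, attempts=2, **kwargs):
    """Retry a GET when the pooled connection is torn down mid-flight.
    Illustrative helper, not part of requests itself."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return session.get(url, **kwargs)
        except requests.exceptions.ConnectionError as exc:
            # The stale keep-alive connection has been discarded from the
            # pool, so the next attempt opens a fresh connection.
            last_exc = exc
    raise last_exc
```

Only safe as-is for idempotent requests; a POST that died mid-flight may already have been processed by the server.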

keuko commented 4 years ago

Hi,

I think I found a solution and created a pull request for urllib3 -> urllib3/urllib3#1911. After it is merged (I hope), requests should raise a MaxRetryError as it should, because requests sets max_retries=0 on urllib3's Retry object by default. Whether it would be better to leave urllib3's default value, which is 3, or to pass 1, should be discussed.

More info -> https://bugs.python.org/issue3566 , https://hg.python.org/cpython/rev/eba80326ba53 , https://bugs.python.org/issue41345
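Without waiting for any upstream change, a caller can opt back into urllib3's retry machinery by passing a Retry object instead of accepting requests' default of zero retries. A sketch, with illustrative counts and backoff:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# requests disables retries by default; supplying a Retry object
# re-enables them with explicit per-category budgets (values here
# are illustrative, not recommendations).
retry = Retry(total=3, connect=3, read=2, backoff_factor=0.1)
adapter = HTTPAdapter(max_retries=retry)
session = requests.Session()
session.mount("https://", adapter)
session.mount("http://", adapter)
```

This keeps the retry policy visible at the call site rather than relying on whichever default urllib3 or requests happens to ship.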

jakermx commented 3 years ago

It is because the OS doesn't have TCP keep-alive enabled, causing the server to drop the connection.

I did this in Linux

import socket
from urllib3.connection import HTTPConnection

HTTPConnection.default_socket_options = HTTPConnection.default_socket_options + [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),  # enable TCP keep-alive
    (socket.SOL_TCP, socket.TCP_KEEPIDLE, 45),    # seconds idle before the first probe
    (socket.SOL_TCP, socket.TCP_KEEPINTVL, 10),   # seconds between probes
    (socket.SOL_TCP, socket.TCP_KEEPCNT, 6),      # failed probes before dropping the connection
]

import requests
session = requests.Session()

And no more ConnectionReset, ConnectionTimeout, or even ConnectionReadTimeout errors.

malthe commented 3 years ago

@jakermx yes – but your TCP Keep-Alive settings can only reduce the issue, not completely do away with it (there is always a window, however small, where the connection might be closed).

keuko commented 3 years ago

@malthe Well, so, is this issue somehow resolvable? In OpenStack, where tons of requests are sent to the API, from time to time an issue related to this shows up.

I'm not saying it's a regular bug, but how can this be fixed? Is it possible?

Even if this issue shows up in OpenStack only occasionally, from time to time something just fails.

malthe commented 3 years ago

@keuko short of writing a custom HTTPAdapter (overriding the send method) – I don't know.

I use code such as the following and attach the adapter using Session.mount (see https://requests.readthedocs.io/en/master/user/advanced/#transport-adapters).

import logging

from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError

ATTEMPTS = 3  # any small retry budget
LOGGER = logging.getLogger(__name__)

class CustomHTTPAdapter(HTTPAdapter):
    def send(self, *args, **kwargs):
        saved = None
        for attempt in range(ATTEMPTS):
            if attempt > 0:
                LOGGER.info("Retrying after connection error")
            try:
                return super().send(*args, **kwargs)
            except ConnectionError as exc:
                LOGGER.info("Connection error: %s", exc)
                saved = exc
                continue

        raise saved
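Mounting such an adapter with Session.mount looks roughly like this; the class is repeated in condensed form (without logging) so the sketch is self-contained, and `ATTEMPTS = 3` is an assumed value:

```python
import requests
from requests.adapters import HTTPAdapter
from requests.exceptions import ConnectionError

ATTEMPTS = 3  # assumed retry budget

class CustomHTTPAdapter(HTTPAdapter):
    """Condensed retrying adapter: re-send on ConnectionError."""
    def send(self, *args, **kwargs):
        saved = None
        for attempt in range(ATTEMPTS):
            try:
                return super().send(*args, **kwargs)
            except ConnectionError as exc:
                saved = exc
        raise saved

session = requests.Session()
adapter = CustomHTTPAdapter()
# Mount under both schemes so every request goes through the retrying adapter.
session.mount("https://", adapter)
session.mount("http://", adapter)
```

Any request made through this session now passes through CustomHTTPAdapter.send rather than the stock adapter.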

jakermx commented 3 years ago

I agree that the Session is not handling connections as it should, and I think that is the reason for the adapters... the RFC for "persistent" connections is very weird: it defines that the client should start sending TCP keep-alive packets after 2 hours of no data received. For a service-oriented server, keeping idle sessions around is not affordable. E.g. a video streaming server, Chaturbate or Netflix, say: users navigate through the options and then pick a video or room, but too many times we, as users, leave the video playing and fall asleep... so we consume resources, which in the end causes our provider to spend more, so they increase their prices, and so on.

So in this case, your software raises an exception instead of handling the connection as it should... it is because of the server side; maybe not the server software itself, but the load balancers, sandboxes or any other box in the middle.

That's why I SUGGESTED enabling a lower-layer connection guard that will ensure the session (well, the last request) stays open until the server says WTF and closes the connection properly, instead of just dropping it.

So... it is not the work of just the Requests team... it really depends on 7 teams, one per OSI layer; actually 4, heheh.

So I am just trying to let you know that we can be creative and find solutions, and to keep in mind that the module is called Requests: it handles requests, and relies on other, more specific modules underneath...

XD

sirkonst commented 1 year ago

@keuko short of writing a custom HTTPAdapter (overriding the send method) – I don't know.

I use code such as the following and attach the adapter using Session.mount (see https://requests.readthedocs.io/en/master/user/advanced/#transport-adapters).

class CustomHTTPAdapter(HTTPAdapter):
    def send(self, *args, **kwargs):
        for attempt in range(ATTEMPTS):
            if attempt > 0:
                LOGGER.info("Retrying after connection error")
            try:
                return super().send(*args, **kwargs)
            except ConnectionError as exc:
                LOGGER.info("Connection error: %s", exc)
                saved = exc
                continue

        raise saved

Be careful with this workaround, because ConnectionError is just a wrapper for MaxRetryError, and a MaxRetryError can contain a connection timeout error, a read timeout error, and others. If you want to set different retry counts for connection errors, read errors, or other error classes, this workaround cannot help you.
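If you do need to distinguish, one approach is to inspect what the ConnectionError wraps before deciding to retry. A heuristic sketch: the helper name is hypothetical, and it assumes requests placed the underlying urllib3 exception in args[0], as the tracebacks earlier in this thread show:

```python
import requests
from urllib3.exceptions import MaxRetryError, ProtocolError

def wraps_protocol_error(exc):
    """Hypothetical helper: True when a requests ConnectionError wraps a
    bare urllib3 ProtocolError (server dropped an in-flight request),
    rather than a MaxRetryError carrying a timeout or other cause."""
    cause = exc.args[0] if exc.args else None
    return isinstance(cause, ProtocolError)
```

A custom adapter could then retry only the stale keep-alive case and re-raise everything else unchanged.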

rishav667 commented 8 months ago

@sigmavirus24 can anyone help me with the below error?

Error occurred ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

I am using requests==2.28.2, and the error comes while logging into Tableau Server using the REST API. I am first creating the session and then doing the POST request.