boto / botocore

The low-level, core functionality of boto3 and the AWS CLI.
Apache License 2.0

ResponseStreamingError not retried with urllib3 2.x #3132

Closed by apmorton 6 months ago

apmorton commented 7 months ago

Describe the bug

When using download_fileobj to fetch an S3 object in an environment with urllib3 2.x, a TCP connection reset that occurs while the response body is streaming is not retried.

The same failure is retried correctly with urllib3 1.x.
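Because the behavior depends on the urllib3 major version, it can help to confirm which one is actually installed before reproducing. The following is a minimal sketch using only the standard library; the helper names `major_version` and `installed_urllib3_major` are made up for illustration and are not part of botocore or urllib3.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '2.0.7'."""
    return int(version.split(".", 1)[0])


def installed_urllib3_major() -> Optional[int]:
    """Return the major version of the installed urllib3, or None if absent."""
    try:
        return major_version(metadata.version("urllib3"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    major = installed_urllib3_major()
    if major is None:
        print("urllib3 is not installed")
    elif major >= 2:
        print(f"urllib3 {major}.x: the environment where the error is not retried")
    else:
        print(f"urllib3 {major}.x: the environment where retries work")
```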

Expected Behavior

The request should be retried according to the configured retry policy (in the reproduction below, standard mode with max_attempts=3).

Current Behavior

botocore.exceptions.ResponseStreamingError is raised to the caller after only a single request attempt.

Reproduction Steps

Given the following script that simulates an unreliable network connection:

import http.server
import logging
import socketserver
import threading
from http import HTTPStatus
from io import BytesIO

import boto3
import botocore.client
from boto3.s3.transfer import TransferConfig

def repro(endpoint: str) -> None:
    session = boto3.Session(aws_access_key_id='', aws_secret_access_key='')
    client = session.client(
        "s3",
        endpoint_url=endpoint,
        config=botocore.client.Config(
            max_pool_connections=1, retries=dict(mode='standard', max_attempts=3)
        ),
    )

    client.download_fileobj(
        Bucket='bucket',
        Key='bad',
        Fileobj=BytesIO(),
        Config=TransferConfig(use_threads=False),
    )

class Handler(http.server.BaseHTTPRequestHandler):
    def do_HEAD(self):
        self.send_response(HTTPStatus.OK)
        self.send_header('Content-Length', '10000')
        self.end_headers()

    def do_GET(self):
        self.send_response(HTTPStatus.OK)
        self.send_header('Content-Length', '10000')
        self.end_headers()
        # content length and body do not agree - simulating a connection drop
        self.wfile.write(b'\x00' * 1000)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)

    with socketserver.TCPServer(("localhost", 0), Handler) as httpd:
        threading.Thread(target=httpd.serve_forever, daemon=True).start()
        repro(f'http://localhost:{httpd.server_address[1]}')

and the following conda environment:

---
name: test
channels:
  - nodefaults
  - conda-forge
dependencies:
  - python 3.11
  - boto3 1.34.53
  - urllib3 2.0.7

Observe that only a single request is made and the following exception is raised:

botocore.exceptions.ResponseStreamingError: An error occurred while reading from response stream: ('Connection broken: IncompleteRead(1000 bytes read, 9000 more expected)', IncompleteRead(1000 bytes read, 9000 more expected))

With the following conda environment instead:

---
name: test
channels:
  - nodefaults
  - conda-forge
dependencies:
  - python 3.11
  - boto3 1.34.53
  - urllib3 <2

Observe that multiple request attempts are made and the following exception is correctly raised:

s3transfer.exceptions.RetriesExceededError: Max Retries Exceeded
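The output above suggests that with urllib3 2.x the truncated body surfaces as ResponseStreamingError, which the transfer layer does not treat as retryable; that reading is inferred from the traceback, not confirmed here. For callers who cannot change versions, one possible stopgap is to retry at the call site. The sketch below is a generic helper, not part of boto3 or s3transfer: `call_with_retries` and its parameters are hypothetical names, and the botocore usage is shown only in comments so the block runs without AWS libraries installed.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 0.1,
) -> T:
    """Call operation(), retrying on the given exception types.

    Sleeps base_delay * 2**(attempt - 1) seconds between attempts and
    re-raises the last exception once max_attempts is reached.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Hypothetical usage with the client from the reproduction script:
#
# from io import BytesIO
# from botocore.exceptions import ResponseStreamingError
#
# def download() -> bytes:
#     buf = BytesIO()  # fresh buffer per attempt so partial data is discarded
#     client.download_fileobj(Bucket="bucket", Key="bad", Fileobj=buf)
#     return buf.getvalue()
#
# data = call_with_retries(download, (ResponseStreamingError,))
```

Creating a new buffer inside the retried function matters: reusing the same file object across attempts would leave the partial bytes from the failed attempt in place.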

SDK version used

boto3 1.34.53

Environment details (OS name and version, etc.)

Python 3.11

nateprewitt commented 6 months ago

The fix should be available in the next release of s3transfer now that https://github.com/boto/s3transfer/pull/301 is merged. Thanks again, @apmorton!