boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

use epoll instead of select #1053

Closed liuqiang1999 closed 4 years ago

liuqiang1999 commented 7 years ago

Hi, I get the error "ValueError: filedescriptor out of range in select()" when uploading files in my application. It looks like this happens because boto3 uses select() instead of epoll for sockets. Can this be changed? Thanks.

stealthycoin commented 7 years ago

What command are you running and can you provide the output of adding --debug to it?

liuqiang1999 commented 7 years ago

Hi I don't know how to add --debug, but here is the exception info:

I looked at the log again; maybe I should ask this question in the "botocore" project instead? Thanks

Traceback (most recent call last):
  File "{mycode}.py", line 100, in func
    boto3.client('s3').upload_file(Filename=file, Bucket=bucket, Key=file)
  File "/usr/local/lib/python3.5/site-packages/boto3/s3/inject.py", line 106, in upload_file
    extra_args=ExtraArgs, callback=Callback)
  File "/usr/local/lib/python3.5/site-packages/boto3/s3/transfer.py", line 275, in upload_file
    future.result()
  File "/usr/local/lib/python3.5/site-packages/s3transfer/futures.py", line 73, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.5/site-packages/s3transfer/futures.py", line 233, in result
    raise self._exception
  File "/usr/local/lib/python3.5/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/usr/local/lib/python3.5/site-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/usr/local/lib/python3.5/site-packages/s3transfer/upload.py", line 679, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.5/site-packages/botocore/client.py", line 531, in _make_api_call
    operation_model, request_dict)
  File "/usr/local/lib/python3.5/site-packages/botocore/endpoint.py", line 141, in make_request
    return self._send_request(request_dict, operation_model)
  File "/usr/local/lib/python3.5/site-packages/botocore/endpoint.py", line 170, in _send_request
    success_response, exception):
  File "/usr/local/lib/python3.5/site-packages/botocore/endpoint.py", line 249, in _needs_retry
    caught_exception=caught_exception, request_dict=request_dict)
  File "/usr/local/lib/python3.5/site-packages/botocore/hooks.py", line 227, in emit
    return self._emit(event_name, kwargs)
  File "/usr/local/lib/python3.5/site-packages/botocore/hooks.py", line 210, in _emit
    response = handler(**kwargs)
  File "/usr/local/lib/python3.5/site-packages/botocore/retryhandler.py", line 183, in __call__
    if self._checker(attempts, response, caught_exception):
  File "/usr/local/lib/python3.5/site-packages/botocore/retryhandler.py", line 251, in __call__
    caught_exception)
  File "/usr/local/lib/python3.5/site-packages/botocore/retryhandler.py", line 269, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/usr/local/lib/python3.5/site-packages/botocore/retryhandler.py", line 317, in __call__
    caught_exception)
  File "/usr/local/lib/python3.5/site-packages/botocore/retryhandler.py", line 223, in __call__
    attempt_number, caught_exception)
  File "/usr/local/lib/python3.5/site-packages/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
  File "/usr/local/lib/python3.5/site-packages/botocore/endpoint.py", line 204, in _get_response
    proxies=self.proxies, timeout=self.timeout)
  File "/usr/local/lib/python3.5/site-packages/botocore/vendored/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.5/site-packages/botocore/vendored/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/usr/local/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/local/lib/python3.5/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 349, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib64/python3.5/http/client.py", line 1083, in request
    self._send_request(method, url, body, headers)
  File "/usr/local/lib/python3.5/site-packages/botocore/awsrequest.py", line 129, in _send_request
    self, method, url, body, headers, *args, **kwargs)
  File "/usr/lib64/python3.5/http/client.py", line 1128, in _send_request
    self.endheaders(body)
  File "/usr/lib64/python3.5/http/client.py", line 1079, in endheaders
    self._send_output(message_body)
  File "/usr/local/lib/python3.5/site-packages/botocore/awsrequest.py", line 162, in _send_output
    read, write, exc = select.select([self.sock], [], [self.sock], 1)
ValueError: filedescriptor out of range in select()

stablerg commented 7 years ago

Have there been any updates on this? I did not see an issue in botocore for it. I am running into this in my program after it runs for several hours, and I haven't found a workaround.

My client code is running the following inside of its own thread:

import boto3

session = boto3.session.Session(aws_access_key_id=AWS_ACCESS_KEY_ID,
                                aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
s3 = session.client('s3')
s3_path = local_path.replace(LOCAL_LOGS_DIR, 'data')
s3.upload_file(local_path, S3_BUCKET, s3_path)

I'm running inside the Python 3.6.1 Docker container with the following library versions:

boto3==1.4.4
botocore==1.5.68
requests==2.18.1
s3transfer==0.1.10
urllib3==1.21.1

Here's the stacktrace for my version:

File "./data_collector/extractor.py", line 483, in upload_file_to_s3
    s3_transfer.upload_file(local_path, settings.S3_BUCKET, s3_path)
  File "/usr/local/lib/python3.6/site-packages/boto3/s3/transfer.py", line 275, in upload_file
    future.result()
  File "/usr/local/lib/python3.6/site-packages/s3transfer/futures.py", line 73, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.6/site-packages/s3transfer/futures.py", line 233, in result
    raise self._exception
  File "/usr/local/lib/python3.6/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/usr/local/lib/python3.6/site-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/s3transfer/upload.py", line 679, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 253, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 544, in _make_api_call
    operation_model, request_dict)
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 141, in make_request
    return self._send_request(request_dict, operation_model)
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 170, in _send_request
    success_response, exception):
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 249, in _needs_retry
    caught_exception=caught_exception, request_dict=request_dict)
  File "/usr/local/lib/python3.6/site-packages/botocore/hooks.py", line 227, in emit
    return self._emit(event_name, kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/hooks.py", line 210, in _emit
    response = handler(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/retryhandler.py", line 183, in __call__
    if self._checker(attempts, response, caught_exception):
  File "/usr/local/lib/python3.6/site-packages/botocore/retryhandler.py", line 251, in __call__
    caught_exception)
  File "/usr/local/lib/python3.6/site-packages/botocore/retryhandler.py", line 269, in _should_retry
    return self._checker(attempt_number, response, caught_exception)
  File "/usr/local/lib/python3.6/site-packages/botocore/retryhandler.py", line 317, in __call__
    caught_exception)
  File "/usr/local/lib/python3.6/site-packages/botocore/retryhandler.py", line 223, in __call__
    attempt_number, caught_exception)
  File "/usr/local/lib/python3.6/site-packages/botocore/retryhandler.py", line 359, in _check_caught_exception
    raise caught_exception
  File "/usr/local/lib/python3.6/site-packages/botocore/endpoint.py", line 204, in _get_response
    proxies=self.proxies, timeout=self.timeout)
  File "/usr/local/lib/python3.6/site-packages/botocore/vendored/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/botocore/vendored/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/local/lib/python3.6/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 349, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.6/http/client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.6/site-packages/botocore/awsrequest.py", line 130, in _send_request
    self, method, url, body, headers, *args, **kwargs)
  File "/usr/local/lib/python3.6/http/client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/http/client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.6/site-packages/botocore/awsrequest.py", line 163, in _send_output
    read, write, exc = select.select([self.sock], [], [self.sock], 1)
ValueError: filedescriptor out of range in select()

sethmlarson commented 7 years ago

May I suggest vendoring/using selectors2 or an equivalent library?

(Disclosure: I'm the author of selectors2)
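
On Python 3.4 and later, the standard library's `selectors` module may serve as one such equivalent. The following is only a minimal sketch of the idea, not botocore's or urllib3's actual code; `wait_readable` is a hypothetical helper name introduced for illustration.

```python
import selectors
import socket


def wait_readable(sock, timeout):
    """Return True if `sock` becomes readable within `timeout` seconds.

    Hypothetical helper, not part of botocore. DefaultSelector uses the
    most efficient mechanism the platform offers (epoll, kqueue, poll, ...)
    and only falls back to select() when nothing better is available.
    """
    with selectors.DefaultSelector() as selector:
        selector.register(sock, selectors.EVENT_READ)
        # select() returns a list of ready (key, events) pairs; empty on timeout.
        return bool(selector.select(timeout))


if __name__ == "__main__":
    left, right = socket.socketpair()
    try:
        print(wait_readable(right, 0))  # nothing has been sent yet
        left.sendall(b"ping")
        print(wait_readable(right, 1))  # data is now waiting
    finally:
        left.close()
        right.close()
```

On Linux this normally resolves to epoll, which is not subject to the FD_SETSIZE restriction that select() has.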

liuqiang1999 commented 7 years ago

No, I don't have a solution. I agree the fix should be in botocore, by replacing its use of select. I simply reduced my file descriptor usage to avoid the problem.

stablerg commented 7 years ago

While digging into it more, I discovered my program was leaking threads, which led to the high file descriptor count and hitting the error. After fixing the thread leak, the problem went away.
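
For anyone diagnosing a similar leak, one rough way to watch the descriptor count from inside the process is to list /proc/self/fd. This is a minimal sketch that assumes a Linux-style /proc filesystem; `count_open_fds` is a hypothetical helper, not something boto3 provides.

```python
import os


def count_open_fds(fd_dir="/proc/self/fd"):
    """Return the number of descriptors open in this process.

    Returns None when `fd_dir` cannot be listed (for example on platforms
    without a Linux-style /proc). The listing itself briefly opens one
    descriptor, so compare readings rather than trusting the absolute value.
    """
    try:
        return len(os.listdir(fd_dir))
    except OSError:
        return None


if __name__ == "__main__":
    print("open descriptors:", count_open_fds())
```

Logging this value periodically alongside `threading.active_count()` can show whether the descriptor count grows together with the thread count.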

joguSD commented 7 years ago

This looks like an issue in the urllib3 library. It was fixed in this PR: https://github.com/shazow/urllib3/pull/1001

zxwx commented 7 years ago

overselect.py:

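import boto3

# Hold 1024 files open so that any descriptor opened afterwards (such as the
# upload socket) is numbered 1024 or higher.
files = [open("/tmp/{}".format(i), 'w') for i in range(1024)]
boto3.client('s3').upload_file('/tmp/hello.txt', 'mybucket', 'hello.txt')

$ python overselect.py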
Traceback (most recent call last):
  File "select.py", line 3, in <module>
    boto3.client('s3').upload_file('/tmp/hello.txt', 'mybucket', 'hello.txt')
  File "/usr/local/lib/python2.7/dist-packages/boto3/s3/inject.py", line 106, in upload_file
    extra_args=ExtraArgs, callback=Callback)
  File "/usr/local/lib/python2.7/dist-packages/boto3/s3/transfer.py", line 275, in upload_file
    future.result()
  File "/usr/local/lib/python2.7/dist-packages/s3transfer/futures.py", line 73, in result
    return self._coordinator.result()
  File "/usr/local/lib/python2.7/dist-packages/s3transfer/futures.py", line 233, in result
    raise self._exception
ValueError: filedescriptor out of range in select()

Looks like select is imported both in the vendored version of urllib3 and in botocore/awsrequest.py

DamZiobro commented 7 years ago

Linux allows a process to open 1024 files at the same time (http://docs.oracle.com/cd/E19450-01/820-6168/file-descriptor-requirements.html). That is why the overselect.py above does not work. Is it boto's job to work around OS limitations?

zxwx commented 7 years ago

@JordonPhillips @kyleknap Any updates?

@xmementoit You are mistaken; Linux does not have a fixed limit on the number of open files per process. In fact, the link you shared explains how to adjust some of the relevant system limits.
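
The per-process limit can also be inspected and, up to the hard limit, raised from inside Python with the standard `resource` module. This is a minimal sketch for Unix-like systems (Linux assumed); both function names are illustrative only.

```python
import resource


def describe_nofile_limit():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile_limit():
    """Raise the soft descriptor limit to the hard limit; return the new soft limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("before:", describe_nofile_limit())
    print("soft limit now:", raise_soft_nofile_limit())
```

Once the soft limit is above 1024, a process can be handed descriptors numbered 1024 or higher, which is exactly the situation in which select() fails.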

lqs commented 6 years ago

Python's select.select() raises ValueError if any fd value is greater than or equal to FD_SETSIZE (1024 on most Linux systems).
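
The range check can be seen without opening a thousand files, because CPython validates the descriptor number before making the system call. A minimal sketch follows; `select_accepts_fd` is a hypothetical helper name.

```python
import select


def select_accepts_fd(fd):
    """Return False if select.select() rejects `fd` as out of range."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        # Raised by CPython itself, before the system call, when the
        # descriptor number does not fit in an fd_set.
        return False
    except OSError:
        # The number is in range, but it is not a usable open descriptor.
        return True
    return True


if __name__ == "__main__":
    for fd in (0, 100000):
        print(fd, select_accepts_fd(fd))
```

The number 100000 is chosen only because it is well above any common FD_SETSIZE value; it does not need to refer to an open file for the check to trigger.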

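To avoid the select() call in awsrequest.py, you can apply the following workaround, which removes the add_expect_header handler. It needs to run before any session or client is created: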

from botocore import handlers

# On Python 3, filter() returns a one-shot iterator, so build a list instead.
handlers.BUILTIN_HANDLERS = [el for el in handlers.BUILTIN_HANDLERS
                             if el[1] is not handlers.add_expect_header]

tuulos commented 6 years ago

Any updates? We hit this issue too.

swetashre commented 4 years ago

Following up on this issue. Considering it's an old issue, is anyone still getting this error with the latest version of the SDK? If anyone is still getting the error, please open a new issue.

no-response[bot] commented 4 years ago

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.