box / box-python-sdk

Box SDK for Python
http://opensource.box.com/box-python-sdk/
Apache License 2.0

NewConnectionError [Errno 110] Connection timed out #739

Closed DJanyavula closed 1 year ago

DJanyavula commented 2 years ago

Description of the Issue

Some of my many upload requests to Box using the Python Box SDK are failing with the error below.

Steps to Reproduce

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Expected Behavior

Error Message, Including Stack Trace

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 175, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 710, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 358, in connect
    self.sock = conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x40cf0fbbd0>: Failed to establish a new connection: [Errno 110] Connection timed out

Screenshots

Versions Used

Python SDK: Python:

mwwoda commented 2 years ago

Hi @DJanyavula Thanks for reaching out to us!

Can you provide a little more information about the problem so we can investigate it further?

Thanks

arifty commented 2 years ago

Hi @mwwoda ,

I will try to answer your questions, on behalf of my colleague @DJanyavula .

Packages versions used: Python 3.7 boxsdk 2.14.0

We were using these Box API calls to upload 11 files over a time span of 15 minutes. That is not many requests, and we have not exceeded the limit on API calls per second. When we did exceed the limit in the past, we used to get an HTTP error log saying "exceeded api limit".

We have been using this function for a few months now, and for the past 2 weeks it has always given this time-out error. When we use it for 2-3 file uploads to Box, it works without issues.

You can replicate the scenario with the code below. Just replace the Box API token and the Excel file and folder names with the actual paths and file names.

Sample code snippet below:

```python
"""This python module uploads the local files to given box folder"""
import json
import os

from boxsdk import Client, JWTAuth


class BoxReader:
    """A reader class to connect to Box, read and write files from and to Box
    with local sources/targets or s3 sources/targets"""

    def __init__(self, box_api_token: str = "/path/to/box_api_key.json"):
        """
        An interface for connecting to Box. The credentials will be stored and fetched from Cerberus
        """
        # box_api_token is treated here as the path to the JSON config file
        with open(box_api_token) as box_key_file:
            box_key = json.load(box_key_file)
        private_key_path = "/tmp/private_key.pem"
        with open(private_key_path, "w") as private_key_file:
            private_key_file.write(box_key["boxAppSettings"]["appAuth"]["privateKey"])

        # Use the JWT credentials to create a box application client
        auth = JWTAuth(
            client_id=box_key["boxAppSettings"]["clientID"],
            client_secret=box_key["boxAppSettings"]["clientSecret"],
            enterprise_id=box_key["enterpriseID"],
            jwt_key_id=box_key["boxAppSettings"]["appAuth"]["publicKeyID"],
            rsa_private_key_file_sys_path=private_key_path,
            rsa_private_key_passphrase=str.encode(
                box_key["boxAppSettings"]["appAuth"]["passphrase"]
            ),
        )
        self.box_client = Client(auth)

    def get_list_files(self, folder_id: int):
        """Get list of all files of a certain folder_id

        Args:
            folder_id (int): Folder ID for box for which files needs to be listed.
        """
        items = self.box_client.folder(folder_id=folder_id).get_items()
        list_files = [(item.id, item.name) for item in items if item.type == "file"]
        return list_files

    def upload_file_to_box(self, temp_file, folder_id, file_name: str) -> str:
        """Uploads local file to specified box folder with specific file name

        Args:
            temp_file (NamedTemporaryFile): Local temp file with full path
            folder_id (int): Folder id in box to store the file
            file_name (str): specific file name

        Returns:
            str: name of the file path
        """
        input_file_path = os.path.join(os.path.dirname(temp_file.name), file_name)
        os.rename(temp_file.name, input_file_path)

        # Map file name -> file id for the files already in the folder
        files_dict = dict(self.get_list_files(folder_id=folder_id))
        files_dict = {value: key for key, value in files_dict.items()}

        # Delete an existing file with the same name before uploading
        if file_name in files_dict:
            self.box_client.file(files_dict[file_name]).delete()
        self.box_client.folder(folder_id).upload(input_file_path)
        return input_file_path


if __name__ == "__main__":
    # upload_file_to_box reads temp_file.name, so pass a file object, not a path string
    with open("/local/path/to/temp_output_xxx.xlsx", "rb") as temp_file:
        uploaded_files_path = BoxReader(
            box_api_token="/path/to/box_api_key.json"
        ).upload_file_to_box(
            temp_file=temp_file,
            folder_id=1111111,
            file_name="output_xxx.xlsx",
        )
    print(f"uploaded_files_path: {uploaded_files_path}")
```

arifty commented 2 years ago

Hi @mwwoda ,

Is there any update on this? Just FYI, we have upgraded boxsdk to version 3.3.0, and now, instead of raising a timeout error, the API call hangs without finishing. It even kept running for 2 days until I had to cancel the execution.

Code: The Box API call is made through the "client.folder.upload()" function in the boxsdk package -> https://github.com/box/box-python-sdk/blob/fe00a9eb3434ee14bc4f01332d54c0272ed5f2d3/boxsdk/object/folder.py#L90:~:text=def-,upload,-(

Frequency: It happens when I execute sequential Box API calls for PUT requests. The first 3-4 requests succeed without any issue, and after that the next call hangs.

Impact: This is even more dangerous than a timeout error, because it blocks the application run completely. Could you please check why it is happening, and whether it happens for other users as well?

Thanks for considering this bit more urgently.
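
Until the cause of the hang is known, one way to keep a blocked upload from stalling the whole run is to put a deadline around the call. The following is only a minimal sketch using the Python standard library; `call_with_deadline` and `deadline_seconds` are hypothetical names introduced for illustration and are not part of boxsdk. It does not cancel the blocked call or fix the underlying problem; it only lets the caller stop waiting.

```python
import queue
import threading


def call_with_deadline(func, *args, deadline_seconds: float = 300.0, **kwargs):
    """Run func in a daemon thread; raise TimeoutError if it has not finished in time."""
    results: queue.Queue = queue.Queue(maxsize=1)

    def runner() -> None:
        try:
            results.put((True, func(*args, **kwargs)))
        except BaseException as exc:  # hand any failure back to the caller
            results.put((False, exc))

    # A daemon thread does not keep the interpreter alive if it never returns.
    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    try:
        succeeded, value = results.get(timeout=deadline_seconds)
    except queue.Empty:
        # The worker is still blocked; it is abandoned here, not cancelled.
        raise TimeoutError(
            f"call did not finish within {deadline_seconds} s"
        ) from None
    if succeeded:
        return value
    raise value
```

With the sample code above, the upload line could be wrapped as `call_with_deadline(self.box_client.folder(folder_id).upload, input_file_path, deadline_seconds=600)`. The abandoned thread may still hold its connection open, so this is a stopgap for diagnosing the issue rather than a remedy.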

lukaszsocha2 commented 2 years ago

Hi @arifty, today I used your code to upload 100 files one by one, and so far I have not been able to replicate the issue you described. I'll continue investigating this tomorrow. @lukaszsocha2

lukaszsocha2 commented 2 years ago

Hi @arifty, today I also wasn't able to replicate this issue on my side, so I prepared a simple bash script which uploads all files from a local folder to a Box folder using the curl command. If this script blocks in the same way as the SDK does, then we can exclude the SDK code as the root of the problem and look for it in either the network layer or the upload service.

path_to_folder_with_files="/Absolute/path/to/your/local/folder"
developer_token="..."
folder_id="box folder id"

for path in "$path_to_folder_with_files"/*
do
  filename="$(basename -- "$path")"
  curl -i -X POST "https://upload.box.com/api/2.0/files/content" \
                               -H "Authorization: Bearer $developer_token" \
                               -H "Content-Type: multipart/form-data" \
                               -F attributes="{\"name\":\"$filename\", \"parent\":{\"id\":\"$folder_id\"}}" \
                               -F file=@"$path"
done

To make it work, replace the values of the path_to_folder_with_files, developer_token and folder_id variables. Also, can you tell me the size of the files you are trying to upload? Is there any pattern, such as the execution blocking on small or big files? I'm waiting for your answer to continue the investigation. Best, @lukaszsocha2
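
Since the original traceback fails inside `sock.connect` with `[Errno 110] Connection timed out`, a check one layer below curl may also help separate network problems from SDK problems. The sketch below only attempts a plain TCP connection and reports how long it took or why it failed. It uses the Python standard library only; `probe_tcp` is a hypothetical helper name, and the default port 443 is an assumption based on the HTTPS URL in the curl command above.

```python
import socket
import time


def probe_tcp(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Try one TCP connect and report how long it took or why it failed."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return {
                "ok": True,
                "seconds": time.monotonic() - started,
                "error": None,
            }
    except OSError as exc:  # covers timeouts, refusals and DNS failures
        return {
            "ok": False,
            "seconds": time.monotonic() - started,
            "error": repr(exc),
        }
```

Calling `probe_tcp("upload.box.com")` a few times from the machine that runs the uploads, ideally around the time the failures occur, would show whether connections time out even without the SDK or curl involved.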

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not been updated in the last 30 days. It will be closed if no further activity occurs within the next 7 days. Feel free to reach out or mention Box SDK team member for further help and resources if they are needed.

stale[bot] commented 1 year ago

This issue has been automatically closed due to maximum period of being stale. Thank you for your contribution to Box Python SDK and feel free to open another PR/issue at any time.