box / boxcli

A command line interface for interacting with the Box API.
https://developer.box.com
Apache License 2.0

box folders:upload intermittently errors with ESOCKETTIMEDOUT on Node 14.19.3 on MacOS #343

Closed liukang18 closed 1 year ago

liukang18 commented 2 years ago

Description of the Issue

Trying to use the following boxcli command:

box folders:upload --bulk-file-path=/path/to/BOX_BULK_PATHS.csv --parent-folder=0000000 -v

And will intermittently receive the following error:

1 entries failed!

Entry 1 ( path=/Path/to/folder/file) failed with error: BoxCLIError: Could not upload file /Path/to/folder/file
Caused by: Error: ESOCKETTIMEDOUT
    at ClientRequest.<anonymous> (/usr/local/lib/@box/cli/node_modules/request/request.js:816:19)
    at Object.onceWrapper (events.js:519:28)
    at ClientRequest.emit (events.js:400:28)
    at TLSSocket.emitRequestTimeout (_http_client.js:790:9)
    at Object.onceWrapper (events.js:519:28)
    at TLSSocket.emit (events.js:412:35)
    at TLSSocket.Socket._onTimeout (net.js:495:8)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)

Note that the failure occurs on different files in the folder from run to run, regardless of type (.pdf, .png, .xlsx, etc.). Sometimes the folder uploads without any issue.

Previous reports of this error attribute it to Node 15+ and suggest downgrading to Node 14:

https://github.com/box/boxcli/issues/298

https://github.com/box/boxcli/issues/238

However, according to box --help (the CLI was installed from the .pkg file on macOS), boxcli is already running on Node 14.19.3, yet I still hit the error.

Steps to Reproduce

Run the command: box folders:upload --bulk-file-path=/path/to/BOX_BULK_PATHS.csv --parent-folder=0000000 -v

Note that the error also occurs intermittently when uploading a single path:

box folders:upload /path/to/folder --parent-folder=0000000 -v

Expected Behavior

I expect the folder to upload successfully on every call to box folders:upload, and every path in the bulk CSV file to upload successfully.

Error Message, Including Stack Trace

1 entries failed!

Entry 1 ( path=/Path/to/folder/file) failed with error: BoxCLIError: Could not upload file /Path/to/folder/file
Caused by: Error: ESOCKETTIMEDOUT
    at ClientRequest.<anonymous> (/usr/local/lib/@box/cli/node_modules/request/request.js:816:19)
    at Object.onceWrapper (events.js:519:28)
    at ClientRequest.emit (events.js:400:28)
    at TLSSocket.emitRequestTimeout (_http_client.js:790:9)
    at Object.onceWrapper (events.js:519:28)
    at TLSSocket.emit (events.js:412:35)
    at TLSSocket.Socket._onTimeout (net.js:495:8)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)

Versions Used

Box CLI: @box/cli/3.1.0 darwin-x64 node-v14.19.3 (installed using the .pkg installation method)
Operating System: macOS Big Sur version 11.2.1

If you need any further information, please let me know. Thank you.

arjankowski commented 2 years ago

Hi @liukang18,

Does this error still occur? We are concerned that, since it occurs intermittently, it could be a backend or network issue. To check this, I have prepared a bash script that simply uploads every file in the folder you specify to the destination folder.

#!/bin/bash

DEVELOPMENT_TOKEN="<DEVELOPER_TOKEN>"                       # Your developer token
SOURCE_FOLDER_PATH="<PATH_TO_FOLDER_WITH_FILES>"            # Path to the folder, ending with "/", e.g. /Users/user_name/folder_to_upload/
DESTINATION_FOLDER_ID="<FOLDER_ID>"                         # ID of the Box folder to upload the files to, e.g. "0"

# Quote every expansion so that paths containing spaces are handled correctly.
for FILE_PATH in "${SOURCE_FOLDER_PATH}"*; do
    # Skip subdirectories and anything else that is not a regular file.
    [ -f "$FILE_PATH" ] || continue

    FILE_NAME=$(basename "$FILE_PATH")
    echo "Uploading file $FILE_NAME..."

    curl --location --request POST "https://upload.box.com/api/2.0/files/content" \
    --header "Content-Type: multipart/form-data" \
    --header "Authorization: Bearer ${DEVELOPMENT_TOKEN}" \
    --form "attributes={ \"name\": \"${FILE_NAME}\", \"parent\": {\"id\": \"${DESTINATION_FOLDER_ID}\" } }" \
    --form "file=@\"${FILE_PATH}\""
done

Please save it as, e.g., upload.sh and make it executable with chmod +x upload.sh. Then set DEVELOPMENT_TOKEN, SOURCE_FOLDER_PATH and DESTINATION_FOLDER_ID and run it.
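If it helps to tell a timeout apart from a server-side error in the results, a variant along the following lines could record the HTTP status of each upload. This is only a sketch under assumptions: the function names classify_status and upload_and_log are made up for illustration, and the 300-second --max-time is an arbitrary value, not a recommendation from Box.

```shell
#!/bin/bash

# Map an HTTP status code (as printed by curl's %{http_code}) to a short label.
classify_status() {
    case "$1" in
        000) echo "no-response" ;;   # curl prints 000 when no HTTP response arrived, e.g. on a timeout
        2??) echo "ok" ;;
        429) echo "rate-limited" ;;
        5??) echo "server-error" ;;
        *)   echo "other" ;;
    esac
}

# Hypothetical helper: upload one file and print "<name> <status> <label>".
# Usage: upload_and_log TOKEN FOLDER_ID FILE_PATH
upload_and_log() {
    local token="$1" folder_id="$2" file_path="$3"
    local file_name status
    file_name=$(basename "$file_path")
    # --write-out prints only the status code; the response body is discarded.
    # "|| true" keeps the script going when curl itself exits non-zero (e.g. on timeout).
    status=$(curl --silent --output /dev/null --max-time 300 \
        --write-out '%{http_code}' \
        --request POST "https://upload.box.com/api/2.0/files/content" \
        --header "Authorization: Bearer ${token}" \
        --form "attributes={ \"name\": \"${file_name}\", \"parent\": {\"id\": \"${folder_id}\" } }" \
        --form "file=@\"${file_path}\"") || true
    echo "${file_name} ${status} $(classify_status "$status")"
}
```

Calling upload_and_log inside the loop in place of the plain curl command and redirecting the output to a file would give one line per file, which makes it easier to see whether failures are timeouts (no-response) or errors returned by the server.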

Let me know the results when you are ready.

Regards, Artur

liukang18 commented 2 years ago

Hello @arjankowski

Apologies for the delay in responding.

I used the upload.sh script you provided and saved the log from a bulk test on a folder containing ~2400 DICOM imaging files.

I also ran the same upload using the 'box folders:upload' CLI command.

I have attached both log files, in addition to the upload.sh file (with sensitive information removed).

Both demonstrated a time-out/upload failure.

I am using OAuth 2.0 for my app. Should I switch to another authentication method, or is there a way to increase the timeout?

If you need any further information, let me know.

Thank you,

David

box_cli_folders_upload.1.log.txt box_curl_upload.1.log.txt upload_for_box_support.sh.txt

mwwoda commented 2 years ago

Hi @liukang18

I don't think the authentication method should have anything to do with the timeout of requests.

You can read how to increase the timeout for uploads here: https://github.com/box/boxcli/blob/main/docs/configuration.md#configure-how-client-retries-calls-and-handles-timeouts

liukang18 commented 2 years ago

Hello @mwwoda,

Thank you so much for pointing me to the relevant doc.

If you don't mind, could I get your advice on the optimal timeout setting, and on how the timeout is applied?

For instance, if I run 'box folders:upload' with the "--bulk-file-path" option, does the timeout apply to each row/call of the CSV file, or to the entire CSV file?

I also ask about the optimal timeout because we are dealing with hierarchical folder structures containing thousands of moderately sized (i.e. 1-2 MB) files, so an upload takes around 1.5-2 hours (when the CLI upload command runs successfully with default settings).

What would be the downside of setting the timeout to something like 5 hours, or is there an upper limit on the timeout duration?

Thank you.

mwwoda commented 2 years ago

The timeout should be applied to each file upload separately.

The optimal value also depends on your environment. I think increasing it to, let's say, somewhere between 5 and 15 minutes should be fine if a too-short timeout is the problem in your case.

The disadvantage of increasing the timeout is that you will wait longer before receiving, e.g., a connection timeout error.

I'm not sure what the upper limit is. We still use the request Node library, so it depends on whether that library can handle a timeout value as long as several hours.
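As a small worked example of the units involved: the upload timeout key used later in this thread, uploadRequestTimeoutMS, is named in milliseconds, so the 5-15 minute range suggested above has to be converted. The helper name minutes_to_ms is made up for illustration.

```shell
#!/bin/bash

# Convert a timeout in whole minutes to milliseconds.
minutes_to_ms() {
    echo $(( $1 * 60 * 1000 ))
}

minutes_to_ms 5     # prints 300000
minutes_to_ms 15    # prints 900000
```

So a value between 300000 and 900000 would correspond to the suggested range.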

liukang18 commented 2 years ago

Hello @arjankowski and @mwwoda

So I think I am making progress. By taking @mwwoda's suggestion and increasing the timeout, I was able to successfully upload the 2400 DICOM imaging files using the 'box folders:upload' command, with the following options in the settings.json file:

{
  "numMaxRetries": 5,
  "retryIntervalMS": 3000,
  "uploadRequestTimeoutMS": 300000
}

However, when I use the command to upload a full hierarchical folder path (with numerous files and sub-directories), the upload reaches 2355/3430 files before throwing a 500 error:

BoxCLIError: Could not upload file /Volumes/Disk1/MINDSET_devops/DS_Env/BOX_CLI_TEST/CURRENT_TEST_07072022/TEST_FOLDERS/<CLIENT_NAME>/Analysis/dti_dir_001/IMG01694.dcm
Caused by: Error: 500 - Internal Server Error
    at APIRequest._handleResponse (/usr/local/lib/@box/cli/node_modules/box-node-sdk/lib/api-request.js:161:19)
    at Request.self.callback (/usr/local/lib/@box/cli/node_modules/request/request.js:185:22)
    at Request.emit (events.js:400:28)
    at Request.<anonymous> (/usr/local/lib/@box/cli/node_modules/request/request.js:1161:10)
    at Request.emit (events.js:400:28)
    at IncomingMessage.<anonymous> (/usr/local/lib/@box/cli/node_modules/request/request.js:1083:12)
    at Object.onceWrapper (events.js:519:28)
    at IncomingMessage.emit (events.js:412:35)
    at endReadableNT (internal/streams/readable.js:1333:12)
    at processTicksAndRejections (internal/process/task_queues.js:82:21)

Is there a limitation that Box has for large upload requests to reduce load on the servers? Is there some way that I could delay/offset the upload request so that it does not reach this limit (if one is present)?

I have also attached the log file of the upload command.

Thank you.
box_cli_full_folders_upload.1.log.txt
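
On the question of delaying or offsetting uploads: nothing in this thread confirms whether Box enforces such a limit, so the following is only a client-side sketch of one possible workaround, namely retrying a failed command with an exponentially growing pause. The function retry_with_backoff is hypothetical and not part of boxcli.

```shell
#!/bin/bash

# Hypothetical helper: run a command, retrying with exponential backoff on failure.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS command [args...]
retry_with_backoff() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while true; do
        # Run the command exactly as given; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        # Double the pause before the next attempt.
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}
```

For example, retry_with_backoff 5 10 box folders:upload /path/to/folder --parent-folder=0000000 -v would try up to five times, pausing 10, 20, 40 and then 80 seconds between attempts. Whether re-running the command on a partially uploaded folder is safe depends on how the CLI treats files that already exist at the destination, which this thread does not cover.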

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not been updated in the last 30 days. It will be closed if no further activity occurs within the next 7 days. Feel free to reach out or mention Box SDK team member for further help and resources if they are needed.

stale[bot] commented 1 year ago

This issue has been automatically closed due to maximum period of being stale. Thank you for your contribution to Box CLI and feel free to open another PR/issue at any time.