ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License

S3TransferError 400 BadRequest with reason RequestTimeout #3531

Closed GitHK closed 11 months ago

GitHK commented 1 year ago

The logs show that the upload was retried a few times, with different RequestId entries, over a period of ~8 minutes.

See the traceback for details: https://monitoring.osparc.io/graylog/dashboards/636a4c758211d89dbcb69f25

sanderegg commented 1 year ago

To see how often the issue occurs, run the following Graylog search: https://monitoring.osparc.io/graylog/search?q=%22Your+socket+connection+to+the+server+was+not+read+from+or+written+to+within+the+timeout+period%22&rangetype=relative&from=1209600

sanderegg commented 1 year ago

Added more informative logs in https://github.com/ITISFoundation/osparc-simcore/pull/3717

sanderegg commented 1 year ago

https://monitoring.osparc.io/graylog/search?q=message%3A%22Your+socket+connection+to+the+server+was+not+read+from+or+written+to+within+the+timeout+period%22+AND+%22file_size%3D%22&rangetype=relative&from=2419200

This shows the latest occurrence of that issue.

sanderegg commented 1 year ago

https://monitoring.osparc.io/graylog/search?q=message%3A%22Your+socket+connection+to+the+server+was+not+read+from+or+written+to+within+the+timeout+period%22+AND+%22file_size%3D%22&rangetype=relative&from=2419200

This shows the occurrences.

sanderegg commented 1 year ago

After checking it can be said:

GitHK commented 1 year ago

@sanderegg I have no longer seen this issue pop up after merging #3737. Should we maybe close this?

sanderegg commented 1 year ago

@GitHK OK, that is very good. Feel free to close.

sanderegg commented 1 year ago

New occurrences happened. This Graylog search shows them: https://monitoring.osparc.io/graylog/search?q=%22last_chunk_size%3D%22&rangetype=relative&from=2592000

sanderegg commented 1 year ago

Issues are still occurring, see https://monitoring.osparc.io/graylog/search?q=%22last_chunk_size%3D%22&rangetype=relative&from=2592000

mrnicegyu11 commented 1 year ago

Just as an update: This is still happening.

The issue is known to AWS and has been reported at least on the GitHub repository of the AWS SDK for JavaScript. One promising lead is that the Content-Length might be slightly off [https://github.com/aws/aws-sdk-js/issues/281#issuecomment-194026499]. People claim that retrying the PUT call fixes this [https://github.com/aws/aws-sdk-js/issues/281#issuecomment-313800925], but in our case it doesn't.
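To make the Content-Length lead concrete, here is a minimal Python sketch of the kind of check it suggests: the declared size of each upload part must match the number of bytes actually sent, otherwise the server keeps waiting for the missing bytes until it times out. The helper names `part_sizes` and `read_exact` are hypothetical and are not taken from osparc-simcore or any AWS SDK; the sketch only assumes a file-like object whose `read(n)` returns `n` bytes unless the end of the data is reached.

```python
import io


def part_sizes(file_size: int, chunk_size: int) -> list[int]:
    """Split file_size bytes into parts of chunk_size bytes.

    The last part carries the remainder (the "last chunk size").
    Hypothetical helper, for illustration only.
    """
    if file_size <= 0 or chunk_size <= 0:
        raise ValueError("file_size and chunk_size must be positive")
    full_parts, last_chunk_size = divmod(file_size, chunk_size)
    sizes = [chunk_size] * full_parts
    if last_chunk_size:
        sizes.append(last_chunk_size)
    return sizes


def read_exact(stream, expected: int) -> bytes:
    """Read exactly `expected` bytes, or fail before anything is sent.

    Raising here avoids a PUT whose declared Content-Length is larger
    than the body that follows it.
    """
    data = stream.read(expected)
    if len(data) != expected:
        raise IOError(
            f"declared Content-Length {expected}, but only {len(data)} bytes available"
        )
    return data


# Usage: walk a 25-byte payload in 10-byte parts.
payload = io.BytesIO(b"x" * 25)
for size in part_sizes(25, 10):
    body = read_exact(payload, size)
    # A real uploader would now PUT `body` with Content-Length set to `size`.
```

A check like this would only help if the mismatch originates on the client side. It does not rule out other causes, such as a stalled connection.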

Let me summarize the state of things from where I am sitting:

GitHK commented 11 months ago

I will close this one for now: since https://github.com/ITISFoundation/osparc-simcore/pull/4996 was merged, I have not seen any issues regarding this. Feel free to reopen it if it pops up again.