GoogleCloudPlatform / gsutil

A command line tool for interacting with cloud storage services.
Apache License 2.0
864 stars 331 forks source link

Timeout during download not clear why INFO {TIMESTAMP} retry_util.py] Retrying request, attempt #X... #1764

Closed covalspace closed 3 months ago

covalspace commented 3 months ago
(base) me@stuff Desktop % gsutil -D cp gs://public-datasets-deepmind-alphafold-v4/proteomes/proteome-tax_id-1837183-0_v4.tar ./proteome-tax_id-1837183-0_v4.tar

***************************** WARNING *****************************
*** You are running gsutil with debug output enabled.
*** Be aware that debug output includes authentication credentials.
*** Make sure to remove the value of the Authorization header for
*** each HTTP request printed to the console prior to posting to
*** a public medium such as a forum post or Stack Overflow.
***************************** WARNING *****************************
gsutil version: 5.27
checksum: 5cf9fcad0f47bc86542d009bbe69f297 (OK)
boto version: 2.49.0
python version: 3.11.3 (v3.11.3:f3909b8bc8, Apr  4 2023, 20:12:10) [Clang 13.0.0 (clang-1300.0.29.30)]
OS: Darwin 22.4.0
multiprocessing available: True
using cloud sdk: True
pass cloud sdk credentials to gsutil: True
config path(s): /Users/me/.boto, /Users/me/.config/gcloud/legacy_credentials/covalspace@gmail.com/.boto
gsutil path: /Users/me/google-cloud-sdk/bin/gsutil
compiled crcmod: True
installed via package manager: False
editable install: False
shim enabled: False
Command being run: /Users/me/google-cloud-sdk/platform/gsutil/gsutil -o GSUtil:default_project_id=sunny-emissary-417123 -D cp gs://public-datasets-deepmind-alphafold-v4/proteomes/proteome-tax_id-1837183-0_v4.tar ./proteome-tax_id-1837183-0_v4.tar
config_file_list: ['/Users/me/.boto', '/Users/me/.config/gcloud/legacy_credentials/covalspace@gmail.com/.boto']
config: [('working_dir', '/mnt/pyami'), ('debug', '0'), ('https_validate_certificates', 'True'), ('working_dir', '/mnt/pyami'), ('debug', '0'), ('content_language', 'en'), ('default_api_version', '2'), ('default_project_id', 'sunny-emissary-417123')]
DEBUG 0420 20:06:14.068563 multiprocess_file_storage.py] Read credential file
INFO 0420 20:06:14.069764 base_api.py] Calling method storage.objects.get with StorageObjectsGetRequest: <StorageObjectsGetRequest
 bucket: 'public-datasets-deepmind-alphafold-v4'
 object: 'proteomes/proteome-tax_id-1837183-0_v4.tar'
 projection: ProjectionValueValuesEnum(noAcl, 1)>
INFO 0420 20:06:14.070171 base_api.py] Making http GET to https://storage.googleapis.com/storage/v1/b/public-datasets-deepmind-alphafold-v4/o/proteomes%2Fproteome-tax_id-1837183-0_v4.tar?alt=json&fields=md5Hash%2Cgeneration%2CmediaLink%2CcontentEncoding%2CcustomerEncryption%2Cetag%2CcontentType%2Ccrc32c%2Csize%2Cname&projection=noAcl
INFO 0420 20:06:14.070345 base_api.py] Headers: {'accept': 'application/json',
 'accept-encoding': 'gzip, deflate',
 'content-length': '0',
 'user-agent': 'apitools Python/3.11.3 gsutil/5.27 (darwin) analytics/enabled '
               'interactive/True command/cp google-cloud-sdk/470.0.0'}
INFO 0420 20:06:14.070921 base_api.py] Body: (none)
INFO 0420 20:07:14.117877 retry_util.py] Retrying request, attempt #1...
DEBUG 0420 20:07:14.130867 http_wrapper.py] Caught socket error, retrying: timed out
DEBUG 0420 20:07:14.140284 http_wrapper.py] Retrying request to url https://storage.googleapis.com/storage/v1/b/public-datasets-deepmind-alphafold-v4/o/proteomes%2Fproteome-tax_id-1837183-0_v4.tar?alt=json&fields=md5Hash%2Cgeneration%2CmediaLink%2CcontentEncoding%2CcustomerEncryption%2Cetag%2CcontentType%2Ccrc32c%2Csize%2Cname&projection=noAcl after exception timed out
connect: (storage.googleapis.com, 443)
send: b'GET /storage/v1/b/public-datasets-deepmind-alphafold-v4/o/proteomes%2Fproteome-tax_id-1837183-0_v4.tar?alt=json&fields=md5Hash%2Cgeneration%2CmediaLink%2CcontentEncoding%2CcustomerEncryption%2Cetag%2CcontentType%2Ccrc32c%2Csize%2Cname&projection=noAcl HTTP/1.1\r\nHost: storage.googleapis.com\r\ncontent-length: 0\r\nuser-agent: apitools Python/3.11.3 gsutil/5.27 (darwin) analytics/enabled interactive/True command/cp google-cloud-sdk/470.0.0\r\naccept: application/json\r\naccept-encoding: gzip, deflate\r\nauthorization: Bearer ya29.a0Ad52N38i05ILJYBOJLaioNQwODn3R_qminxOz1k4XWMwKWcCNtM4SSDGLe_I0q1EphwZldiYZdhN5BD_NxiXmq_l4BeLA6L81A0clvBJk-ihqDG_LCzwDZGPAMmFbYVn3mpX1WkGIOLQY0ozLCZ16D7TD4pus9BHAVjg_FrRzSJRaCgYKASASARESFQHGX2Mihj6KHlFY_-Tix6L51ti6nw0179\r\n\r\n'
Zero length chunk ignored
reply: 'HTTP/1.1 200 OK\r\n'
header: X-GUploader-UploadID: ABPtcPqZobA6YxAEYbdzyE4ZaZr6Ji5tBoC-OY1YDu1hjcJ6Z2xIC0x6KThND3zLrAGLYKvAajd1nnrpvw
header: ETag: CMfIm8G1jPsCEAE=
header: Content-Type: application/json; charset=UTF-8
header: Date: Sun, 21 Apr 2024 01:07:15 GMT
header: Vary: Origin
header: Vary: X-Origin
header: Cache-Control: private, max-age=0, must-revalidate, no-transform
header: Expires: Sun, 21 Apr 2024 01:07:15 GMT
header: Content-Length: 449
header: Server: UploadServer
header: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
INFO 0420 20:07:15.952117 base_api.py] Response of type Object: <Object
 acl: []
 contentType: 'application/octet-stream'
 crc32c: 'PXqh5g=='
 etag: 'CMfIm8G1jPsCEAE='
 generation: 1667286308414535
 md5Hash: 'gZhXJHKNpiiZ3iV2n0YtMQ=='
 mediaLink: 'https://storage.googleapis.com/download/storage/v1/b/public-datasets-deepmind-alphafold-v4/o/proteomes%2Fproteome-tax_id-1837183-0_v4.tar?generation=1667286308414535&alt=media'
 name: 'proteomes/proteome-tax_id-1837183-0_v4.tar'
 size: 161280>
process count: 1
thread count: 1
process count: 1
thread count: 1
Copying gs://public-datasets-deepmind-alphafold-v4/proteomes/proteome-tax_id-1837183-0_v4.tar...
/ [0 files][    0.0 B/157.5 KiB]                                - [0 files][    0.0 B/157.5 KiB]                                Retrying download from byte 0 after exception: b'timed out'. Trace: Traceback (most recent call last):
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 1284, in _PerformResumableDownload
    return self._PerformDownload(bucket_name,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 1383, in _PerformDownload
    apitools_download.GetRange(additional_headers=additional_headers,
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/transfer.py", line 485, in GetRange
    response = self.__GetChunk(progress, end_byte,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/transfer.py", line 418, in __GetChunk
    return http_wrapper.MakeRequest(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/http_wrapper.py", line 359, in MakeRequest
    retry_func(ExceptionRetryArgs(http, http_request, e, retry,
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/utils/retry_util.py", line 84, in RetriesInDataTransferHandler
    http_wrapper.RethrowExceptionHandler(retry_args)
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/http_wrapper.py", line 348, in MakeRequest
    return _MakeRequestNoRetry(
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/http_wrapper.py", line 397, in _MakeRequestNoRetry
    info, content = http.request(
                    ^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/gcs_json_media.py", line 544, in NewRequest
    return request_orig(uri, method=method, body=body,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/vendored/oauth2client/oauth2client/transport.py", line 173, in new_request
    resp, content = request(orig_request_method, uri, method, body,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/vendored/oauth2client/oauth2client/transport.py", line 280, in request
    return http_callable(uri, method=method, body=body, headers=headers,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/httplib2/python3/httplib2/__init__.py", line 1701, in request
    (response, content) = self._request(
                          ^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/gcs_json_media.py", line 452, in OverrideRequest
    (response, content) = self._conn_request(conn, request_uri, method, body,
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/me/google-cloud-sdk/platform/gsutil/gslib/gcs_json_media.py", line 643, in _conn_request
    conn.connect()
  File "/Users/me/google-cloud-sdk/platform/gsutil/third_party/httplib2/python3/httplib2/__init__.py", line 1133, in connect
    sock.connect((self.host, self.port))
TimeoutError: timed out

connect: (storage.googleapis.com, 443)
send: b'GET /download/storage/v1/b/public-datasets-deepmind-alphafold-v4/o/proteomes%2Fproteome-tax_id-1837183-0_v4.tar?generation=1667286308414535&alt=media HTTP/1.1\r\nHost: storage.googleapis.com\r\nAccept-Encoding: identity\r\ncontent-length: 0\r\nrange: bytes=0-161279\r\nuser-agent: apitools Python/3.11.3 gsutil/5.27 (darwin) analytics/enabled interactive/True command/cp google-cloud-sdk/470.0.0\r\nauthorization: Bearer ya29.a0Ad52N38i05ILJYBOJLaioNQwODn3R_qminxOz1k4XWMwKWcCNtM4SSDGLe_I0q1EphwZldiYZdhN5BD_NxiXmq_l4BeLA6L81A0clvBJk-ihqDG_LCzwDZGPAMmFbYVn3mpX1WkGIOLQY0ozLCZ16D7TD4pus9BHAVjg_FrRzSJRaCgYKASASARESFQHGX2Mihj6KHlFY_-Tix6L51ti6nw0179\r\n\r\n'
Zero length chunk ignored
reply: 'HTTP/1.1 206 Partial Content\r\n'
header: X-GUploader-UploadID: ABPtcPrmWvOXaVEKcT7mR17VBIMS0fx2YOXRVVu-YMHKrZwcnYxr7A8naMhJmHQoSf1SnvT_f9s
header: Content-Range: bytes 0-161279/161280
header: ETag: CMfIm8G1jPsCEAE=
header: Content-Type: application/octet-stream
header: X-Goog-Stored-Content-Encoding: identity
header: X-Goog-Stored-Content-Length: 161280
header: X-Goog-Generation: 1667286308414535
header: X-Goog-Metageneration: 1
header: X-Goog-Storage-Class: STANDARD
header: Last-Modified: Tue, 01 Nov 2022 07:05:08 GMT
header: Pragma: no-cache
header: Expires: Mon, 01 Jan 1990 00:00:00 GMT
header: Cache-Control: no-cache, no-store, max-age=0, must-revalidate
header: Content-Disposition: attachment
header: Date: Sun, 21 Apr 2024 01:08:18 GMT
header: Vary: Origin
header: Vary: X-Origin
header: Content-Length: 161280
header: Server: UploadServer
header: Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
\ [0 files][157.5 KiB/157.5 KiB]   64.8 KiB/s                   Comparing local vs cloud md5-checksum for ./proteome-tax_id-1837183-0_v4.tar. (b'gZhXJHKNpiiZ3iV2n0YtMQ=='/b'gZhXJHKNpiiZ3iV2n0YtMQ==')
| [1 files][157.5 KiB/157.5 KiB]   64.8 KiB/s                                   
Operation completed over 1 objects/157.5 KiB.                                    
total_bytes_transferred: 161280
Total bytes copied=161280, total elapsed time=62.506 secs (2.52 KiBps)
DEBUG 0420 20:08:18.473183 metrics.py] [Metric(endpoint='https://ssl.google-analytics.com/collect', method='POST', body='cd1=cp&cd10=1713661698472&cd2=D%2Co&cd4=content_language%3Aen%2Cdebug%3A0%2Cdefault_api_version%3A2%2Chttps_validate_certificates%3ATrue&cd5=cp&cid=a92d376c9ab647f2903f00f1937f63d4&cm1=124788&cm2=2&cm3=0&ea=cp&ec=Command&el=5.27&ev=0&t=event&tid=UA-36037335-16&v=1', user_agent='Darwin/22.4.0'), Metric(endpoint='https://ssl.google-analytics.com/collect', method='POST', body='cd1=cp&cd10=1713661698472&cd2=D%2Co&cd4=content_language%3Aen%2Cdebug%3A0%2Cdefault_api_version%3A2%2Chttps_validate_certificates%3ATrue&cd5=cp&cid=a92d376c9ab647f2903f00f1937f63d4&cm1=124788&cm2=2&cm3=0&ea=TimeoutError&ec=RetryableError&el=5.27&ev=0&t=event&tid=UA-36037335-16&v=1', user_agent='Darwin/22.4.0'), Metric(endpoint='https://ssl.google-analytics.com/collect', method='POST', body='cd1=cp&cd10=1713661698472&cd2=D%2Co&cd4=content_language%3Aen%2Cdebug%3A0%2Cdefault_api_version%3A2%2Chttps_validate_certificates%3ATrue&cd5=cp&cd7=none&cd8=cloud&cd9=file%2Cgs&cid=a92d376c9ab647f2903f00f1937f63d4&cm1=62505&cm10=2&cm12=2583.0413438911964&cm13=2583.0413438911964&cm3=0&cm4=1&cm5=1&cm6=1&cm7=161280&cm8=2580.2365870254625&cm9=0&ea=CloudToFile&ec=PerformanceSummary&el=5.27&ev=0&t=event&tid=UA-36037335-16&v=1', user_agent='Darwin/22.4.0')]
DEBUG 0420 20:08:18.476758 metrics.py] Metrics reporting process started...
covalspace commented 3 months ago

I believe this was a memory issue. I killed a VM and the issue went away like magic.