facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

HTTP Error 403 when downloading some Atlas tarballs via aria2c #508

Closed by naailkhan28 1 year ago

naailkhan28 commented 1 year ago

Bug description

When downloading the Atlas tarballs binned by pLDDT and pTM, some of them return an HTTP 403 (Forbidden) response.

Reproduction steps

aria2c --dir=atlas/tarballs https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz -l log1.txt

Logs

Command line output:

03/23 11:42:41 [ERROR] CUID#7 - Download aborted. URI=https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.80_.90_01.tar.gz
Exception: [AbstractCommand.cc:351] errorCode=22 URI=https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.80_.90_01.tar.gz
  -> [HttpSkipResponseCommand.cc:239] errorCode=22 The response status is not successful. status=403

Here's the full output from the aria2c log:

2023-03-23 11:37:52.979074 [DEBUG] [AbstractCommand.cc:181] CUID#7 - socket: read:1, write:0, hup:0, err:0
2023-03-23 11:37:52.979083 [DEBUG] [AbstractCommand.cc:181] CUID#7 - socket: read:1, write:0, hup:0, err:0
2023-03-23 11:37:52.979104 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: SSL 3.3 Application Data packet received. Epoch 2, length: 22
2023-03-23 11:37:52.979110 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Expected Packet Application Data(23)
2023-03-23 11:37:52.979113 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Received Packet Application Data(23) with length: 22
2023-03-23 11:37:52.979142 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Decrypted Packet[2] Application Data(23) with length: 5
2023-03-23 11:37:52.979157 [INFO] [DownloadEngine.cc:315] Pool socket for 18.244.140.22(443)
2023-03-23 11:37:52.979249 [ERROR] [AbstractCommand.cc:349] CUID#7 - Download aborted. URI=https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz
Exception: [AbstractCommand.cc:351] errorCode=22 URI=https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz
  -> [HttpSkipResponseCommand.cc:239] errorCode=22 The response status is not successful. status=403
2023-03-23 11:37:52.979288 [DEBUG] [AbstractCommand.cc:479] CUID#7 - Aborting download
2023-03-23 11:37:52.979293 [DEBUG] [AbstractCommand.cc:423] CUID#7 - Not trying next request. No reserved/pooled request is remaining and total length is still unknown.
2023-03-23 11:37:52.979308 [DEBUG] [RequestGroup.cc:983] GID#239a57e74268b09b - Request queue check
2023-03-23 11:37:52.979318 [NOTICE] [RequestGroupMan.cc:424] Download GID#239a57e74268b09b not complete:
2023-03-23 11:37:52.979336 [DEBUG] [RequestGroup.cc:1173] GID#239a57e74268b09b - Creating DownloadResult.
2023-03-23 11:37:52.979348 [DEBUG] [RequestGroupMan.cc:481] 1 RequestGroup(s) deleted.
2023-03-23 11:37:52.979414 [DEBUG] [Platform.cc:87] GnuTLS: <3> ASSERT: ../../lib/buffers.c[_gnutls_io_write_flush]:696
2023-03-23 11:37:52.979423 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC: Sending Alert[1|0] - Close notify
2023-03-23 11:37:52.979432 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Preparing Packet Alert(21) with length: 2 and min pad: 0
2023-03-23 11:37:52.979466 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Sent Packet[2] Alert(21) in epoch 2 and length: 24
2023-03-23 11:37:52.979474 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Start of epoch cleanup
2023-03-23 11:37:52.979478 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: End of epoch cleanup
2023-03-23 11:37:52.979482 [DEBUG] [Platform.cc:87] GnuTLS: <5> REC[0x55cd5701e940]: Epoch #2 freed
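
For anyone with a longer log covering many tarballs, the failing URLs can be pulled out automatically. The following is a minimal sketch, not part of aria2c or the esm repository. It assumes the log keeps the layout shown above, where a "Download aborted. URI=..." line is followed shortly by a line containing "status=NNN". The function name `failed_downloads` and the `SAMPLE_LOG` constant are hypothetical and used for illustration only.

```python
import re

# Patterns based on the log excerpt in this issue; other aria2c versions
# may format these lines differently (assumption).
ABORT_RE = re.compile(r"Download aborted\. URI=(\S+)")
STATUS_RE = re.compile(r"status=(\d{3})")


def failed_downloads(log_text):
    """Return (uri, status) pairs for every aborted download in the log text."""
    results = []
    pending_uri = None  # URI from the most recent "Download aborted" line
    for line in log_text.splitlines():
        abort = ABORT_RE.search(line)
        if abort:
            pending_uri = abort.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and pending_uri is not None:
            results.append((pending_uri, int(status.group(1))))
            pending_uri = None
    return results


# Three lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
2023-03-23 11:37:52.979249 [ERROR] [AbstractCommand.cc:349] CUID#7 - Download aborted. URI=https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz
Exception: [AbstractCommand.cc:351] errorCode=22 URI=https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz
  -> [HttpSkipResponseCommand.cc:239] errorCode=22 The response status is not successful. status=403
"""

for uri, status in failed_downloads(SAMPLE_LOG):
    print(status, uri)
```

To run it against a real log, read the file written by the `-l` option (for example `log1.txt`) and pass its contents to `failed_downloads`.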

So far I've received this error with three URLs:

https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz

https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.80_.90_00.tar.gz

https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.80_.90_01.tar.gz

However, I have not checked any of the tarballs or Foldseek DBs below the pTM 0.7-0.8 bin.
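
The unchecked bins could be probed without downloading each tarball by sending a HEAD request per URL and recording any status other than 200. This is a minimal sketch using only the Python standard library. It assumes the server answers HEAD requests the same way it answers GET, which is not confirmed anywhere in this thread. The helper names `head_status` and `find_inaccessible` are hypothetical.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=30):
    """Return the HTTP status code for a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return err.code


def find_inaccessible(urls, status_of=head_status):
    """Return {url: status} for every URL whose status is not 200.

    status_of is injectable so the logic can be exercised without
    network access.
    """
    failures = {}
    for url in urls:
        status = status_of(url)
        if status != 200:
            failures[url] = status
    return failures


# Example usage (performs real network requests, so it is left commented out):
# urls = [
#     "https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.70_.80_00.tar.gz",
#     "https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.80_.90_00.tar.gz",
#     "https://dl.fbaipublicfiles.com/esmatlas/v2023_02/full/tarballs/tm_.90_1_plddt_.80_.90_01.tar.gz",
# ]
# for url, status in find_inaccessible(urls).items():
#     print(status, url)
```

Connection errors and timeouts are deliberately not caught here, so a network failure is not mistaken for a 403.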

w3ntinglu commented 1 year ago

Hi naailkhan28, thanks for reporting this issue. We have fixed the S3 paths for the impacted tarballs and also validated that all other paths are accessible. Please try again and let us know if you still see the issue.

naailkhan28 commented 1 year ago

Just checked and these look good - thanks for fixing :)