ncbi / sra-tools

SRA Tools

prefetch and fasterq-dump fail on a Google virtual machine #588

Closed eitanyaffe closed 2 years ago

eitanyaffe commented 2 years ago

Hi,

I'm trying to use the SRA Toolkit to download data onto a Google Cloud VM. I created a large VM (n2-standard-32) and installed version 2.11.2 of the toolkit.

I get the following error:

sa_113045741138414598170@makeshift-std32-1:~$ prefetch -v SRR000001

2022-02-10T06:15:54 prefetch.2.11.2: Current preference is set to retrieve SRA Normalized Format files with full base quality scores.
2022-02-10T06:15:54 prefetch.2.11.2: 1) Downloading 'SRR000001'...
2022-02-10T06:15:54 prefetch.2.11.2: SRA Normalized Format file is being retrieved, if this is different from your preference, it may be due to current file availability.
2022-02-10T06:15:54 prefetch.2.11.2:  Downloading via HTTPS...
2022-02-10T06:15:56 prefetch.2.11.2 err: data unexpected while reading file within network system module - Cannot KStblHttpFileRead
2022-02-10T06:15:56 prefetch.2.11.2 err: data unexpected while reading file within network system module - Cannot KFileRead: retrying 'https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbHJraWQxIiwidHlwIjoiSldUIn0.eyJhY2MiOiJTUlIwMDAwMDEiLCJleHAiOjE2NDQ0ODgxNTQsImZpbGVTaXplIjoiMzEyNTI3MDgzIiwiaWF0IjoxNjQ0NDczNzU0LCJpZ25vcmVDZSI6InRydWUiLCJqdGkiOiJjODRjNDZmNC05MzFjLTQyNmQtODk0Yi0wMjAwMDgyMjg2MGMiLCJsaW5rIjoiaHR0cHM6Ly9zdG9yYWdlLmdvb2dsZWFwaXMuY29tL3NyYS1wdWItcnVuLTcvU1JSMDAwMDAxL1NSUjAwMDAwMS40P25jYmlfcGhpZD0zMjJDNTI0NzBCNjYyMEM1MDAwMDQyMDQ2RDE4NUU4MS4xLjEiLCJyZWdpb24iOiJ1cyIsInNlcnZpY2UiOiJncyIsInNpZ25pbmdBY2NvdW50Ijoic3JhX2dzIiwidGltZW91dCI6MTQ0MDB9.j0bReTs1dJtyQyYz0CEqb5Mv-K3qBbVXVSw4k8ZZs25LhFEymeXzCsJSINpmB3WQPM_CpIO9GkJuAvgvtickz0tOwyeHZvnroCU6mYdUP578lpDjDANQUNis653YNflsltbcV9MNsRG-Jjc3gB62V1_cpK99jVRPnJkPmwG0d8s2aJ-QFAiIPFxff_ikuBqc9rBHFfCnyswIa3WtTBmI1HDfed9y-7G9jG32-eBNnCFFvn69HmvaFAMmkT2i5o4Fpod4ZyeG0abf1qxqi9vzikxRxoU5fXXt6xkZhA2mfoHXGYefTznAmp6whncHGVVRLiSecIc8v0JNdr2g1e8qtw'...
2022-02-10T06:15:56 prefetch.2.11.2 err: data unexpected while reading file within network system module - Cannot KStblHttpFileRead
2022-02-10T06:15:56 prefetch.2.11.2 err: data unexpected while reading file within network system module - Cannot KFileRead: retrying 'https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbHJraWQxIiwidHlwIjoiSldUIn0.eyJhY2MiOiJTUlIwMDAwMDEiLCJleHAiOjE2NDQ0ODgxNTQsImZpbGVTaXplIjoiMzEyNTI3MDgzIiwiaWF0IjoxNjQ0NDczNzU0LCJpZ25vcmVDZSI6InRydWUiLCJqdGkiOiJjODRjNDZmNC05MzFjLTQyNmQtODk0Yi0wMjAwMDgyMjg2MGMiLCJsaW5rIjoiaHR0cHM6Ly9zdG9yYWdlLmdvb2dsZWFwaXMuY29tL3NyYS1wdWItcnVuLTcvU1JSMDAwMDAxL1NSUjAwMDAwMS40P25jYmlfcGhpZD0zMjJDNTI0NzBCNjYyMEM1MDAwMDQyMDQ2RDE4NUU4MS4xLjEiLCJyZWdpb24iOiJ1cyIsInNlcnZpY2UiOiJncyIsInNpZ25pbmdBY2NvdW50Ijoic3JhX2dzIiwidGltZW91dCI6MTQ0MDB9.j0bReTs1dJtyQyYz0CEqb5Mv-K3qBbVXVSw4k8ZZs25LhFEymeXzCsJSINpmB3WQPM_CpIO9GkJuAvgvtickz0tOwyeHZvnroCU6mYdUP578lpDjDANQUNis653YNflsltbcV9MNsRG-Jjc3gB62V1_cpK99jVRPnJkPmwG0d8s2aJ-QFAiIPFxff_ikuBqc9rBHFfCnyswIa3WtTBmI1HDfed9y-7G9jG32-eBNnCFFvn69HmvaFAMmkT2i5o4Fpod4ZyeG0abf1qxqi9vzikxRxoU5fXXt6xkZhA2mfoHXGYefTznAmp6whncHGVVRLiSecIc8v0JNdr2g1e8qtw'...
2022-02-10T06:15:56 prefetch.2.11.2 err: file no permission while reading file within file system module - Cannot KFileRead: retrying 'https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbHJraWQxIiwidHlwIjoiSldUIn0.eyJhY2MiOiJTUlIwMDAwMDEiLCJleHAiOjE2NDQ0ODgxNTQsImZpbGVTaXplIjoiMzEyNTI3MDgzIiwiaWF0IjoxNjQ0NDczNzU0LCJpZ25vcmVDZSI6InRydWUiLCJqdGkiOiJjODRjNDZmNC05MzFjLTQyNmQtODk0Yi0wMjAwMDgyMjg2MGMiLCJsaW5rIjoiaHR0cHM6Ly9zdG9yYWdlLmdvb2dsZWFwaXMuY29tL3NyYS1wdWItcnVuLTcvU1JSMDAwMDAxL1NSUjAwMDAwMS40P25jYmlfcGhpZD0zMjJDNTI0NzBCNjYyMEM1MDAwMDQyMDQ2RDE4NUU4MS4xLjEiLCJyZWdpb24iOiJ1cyIsInNlcnZpY2UiOiJncyIsInNpZ25pbmdBY2NvdW50Ijoic3JhX2dzIiwidGltZW91dCI6MTQ0MDB9.j0bReTs1dJtyQyYz0CEqb5Mv-K3qBbVXVSw4k8ZZs25LhFEymeXzCsJSINpmB3WQPM_CpIO9GkJuAvgvtickz0tOwyeHZvnroCU6mYdUP578lpDjDANQUNis653YNflsltbcV9MNsRG-Jjc3gB62V1_cpK99jVRPnJkPmwG0d8s2aJ-QFAiIPFxff_ikuBqc9rBHFfCnyswIa3WtTBmI1HDfed9y-7G9jG32-eBNnCFFvn69HmvaFAMmkT2i5o4Fpod4ZyeG0abf1qxqi9vzikxRxoU5fXXt6xkZhA2mfoHXGYefTznAmp6whncHGVVRLiSecIc8v0JNdr2g1e8qtw'...
2022-02-10T06:15:56 prefetch.2.11.2 err: file no permission while reading file within file system module - Cannot KFileRead: retrying 'https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbHJraWQxIiwidHlwIjoiSldUIn0.eyJhY2MiOiJTUlIwMDAwMDEiLCJleHAiOjE2NDQ0ODgxNTQsImZpbGVTaXplIjoiMzEyNTI3MDgzIiwiaWF0IjoxNjQ0NDczNzU0LCJpZ25vcmVDZSI6InRydWUiLCJqdGkiOiJjODRjNDZmNC05MzFjLTQyNmQtODk0Yi0wMjAwMDgyMjg2MGMiLCJsaW5rIjoiaHR0cHM6Ly9zdG9yYWdlLmdvb2dsZWFwaXMuY29tL3NyYS1wdWItcnVuLTcvU1JSMDAwMDAxL1NSUjAwMDAwMS40P25jYmlfcGhpZD0zMjJDNTI0NzBCNjYyMEM1MDAwMDQyMDQ2RDE4NUU4MS4xLjEiLCJyZWdpb24iOiJ1cyIsInNlcnZpY2UiOiJncyIsInNpZ25pbmdBY2NvdW50Ijoic3JhX2dzIiwidGltZW91dCI6MTQ0MDB9.j0bReTs1dJtyQyYz0CEqb5Mv-K3qBbVXVSw4k8ZZs25LhFEymeXzCsJSINpmB3WQPM_CpIO9GkJuAvgvtickz0tOwyeHZvnroCU6mYdUP578lpDjDANQUNis653YNflsltbcV9MNsRG-Jjc3gB62V1_cpK99jVRPnJkPmwG0d8s2aJ-QFAiIPFxff_ikuBqc9rBHFfCnyswIa3WtTBmI1HDfed9y-7G9jG32-eBNnCFFvn69HmvaFAMmkT2i5o4Fpod4ZyeG0abf1qxqi9vzikxRxoU5fXXt6xkZhA2mfoHXGYefTznAmp6whncHGVVRLiSecIc8v0JNdr2g1e8qtw'...

It keeps retrying, with longer and longer sleeps between attempts (1s, 2s, ...), and finally fails completely.

srapath gives this info:

sa_113045741138414598170@makeshift-std32-1:~$ srapath SRR000001
https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbHJraWQxIiwidHlwIjoiSldUIn0.eyJhY2MiOiJTUlIwMDAwMDEiLCJleHAiOjE2NDQ0ODkwNjAsImZpbGVTaXplIjoiMzEyNTI3MDgzIiwiaWF0IjoxNjQ0NDc0NjYwLCJpZ25vcmVDZSI6InRydWUiLCJqdGkiOiJkNGM0YzM1MC1mMzgxLTRiZWEtOGJjZS0zMDQzNTE4OTNkYjQiLCJsaW5rIjoiaHR0cHM6Ly9zdG9yYWdlLmdvb2dsZWFwaXMuY29tL3NyYS1wdWItcnVuLTcvU1JSMDAwMDAxL1NSUjAwMDAwMS40P25jYmlfcGhpZD05MzlCMUExRDQyMUU2MTE1MDAwMDMwMUUyMEEyREM4OS4xLjEiLCJyZWdpb24iOiJ1cyIsInNlcnZpY2UiOiJncyIsInNpZ25pbmdBY2NvdW50Ijoic3JhX2dzIiwidGltZW91dCI6MTQ0MDB9.ecvK-5zfo4yKqEcarbiHi-GAvc4WwRsBwfZDi6h4i3BZ-XNNvBzYVAAtm_G2Gii5RpUkgCBQEPcljTYq98ZM2J-8qyPO0wXpbrM-eCGyK67DEaNYukja_5Sy187axMuulI0gH4v7hrGUU_GfnukXBOq09BNPlwGUQBBs5Iis_v6vyD1I-OhF1k7CGyV51GJjFacLxMVY1horbR_U29ARpK4Og3BeyYjJx3vofJ3lcVPhqCHAVv7NdQpbeQfTMju9Un8DMac68ZCeBl0OhZbJKzxjwq1UZ1tjEyfLcqILE8bTs2RwWoagt8IWA3fAQgDmFc6YKFEnlB6Mfe8BuFnYIw
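A note for readers trying to see where such a locate URL actually points: the `jwt` query parameter is a standard three-part JSON Web Token, so its middle segment can be base64url-decoded and read as JSON. The sketch below is only a diagnostic aid, not part of the SRA Toolkit; the function name `decode_locate_jwt` is made up for illustration, and the signature is deliberately not verified. For the URLs pasted in this thread, the payload should include fields such as `acc`, `service`, `region`, and a `link` on storage.googleapis.com.

```python
import base64
import json
from urllib.parse import parse_qs, urlsplit


def decode_locate_jwt(url):
    """Return the JSON payload of the `jwt` query parameter (signature NOT verified)."""
    query = parse_qs(urlsplit(url).query)
    token = query["jwt"][0]
    # A JWT is header.payload.signature; the payload is the middle segment.
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding, so restore the padding first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```

Calling `decode_locate_jwt(...)` on the URL printed by srapath and inspecting the `service` and `link` keys shows which storage backend the resolver picked for the request.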

fasterq-dump also failed with a slightly different error message:

sa_113045741138414598170@makeshift-std32-1:~$ fasterq-dump -v SRR000001
Preference setting is: Prefer SRA Normalized Format files with full base quality scores if available.
SRR000001 is an SRA Normalized Format file with full base quality scores.
2022-02-10T06:04:41 fasterq-dump.2.11.2 err: data unexpected while reading file within network system module - Cannot KStblHttpFileTimedReadChunked

Note that this issue occurs in older and newer versions of the toolkit (2.11.1 and 2.11.3). Alternatives I've tried (with similar results) include running prefetch through the NCBI docker image:

sudo docker run --rm -it ncbi/sra-tools:latest prefetch SRR000001

Let me know if there are any details you need to reproduce or solve this bug. Alternatively, please point me to instructions on how to download data within a GCP VM.

SMALL UPDATE ADDED 2/10. To reproduce the problem you can follow the instructions on https://edwards.flinders.edu.au/accessing-sra-in-the-cloud/. I followed the entire page (with both versions 2.11.3 and 2.10.2) and got the same results as the author of that page. But when running fastq-dump I get this error:

sa_113045741138414598170@instance-1:~$ fastq-dump -L 6 SRR000001
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/fastq-dump'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/etc/ncbi'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi/vdb-copy.kfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi/vdb-copy.kfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi/default.kfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi/default.kfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi/certs.kfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/bin/ncbi/certs.kfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/.ncbi/user-settings.mkfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/.ncbi/user-settings.mkfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/.ncbi/user-settings.mkfg'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170/sratoolkit.2.10.2-ubuntu64/schema'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170'
2022-02-10T19:46:27 fastq-dump.2.10.2 debug: KSysDirResolvePath_v1 = '/home/sa_113045741138414598170'
2022-02-10T19:46:28 fastq-dump.2.10.2 info: HTTP read failure: URL="GET /sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd HTTP/1.1  Host: storage.googleapis.com  Cache-Control: no-cache, no-store, max-age=0, no-transform, must-revalidate  Expires: 0  Pragma: no-cache  Range: bytes=0-262143  User-Agent: linux64 ncbi-vdb.2.10.2 (phid=bGc336a513,libc=2.28)  Accept: */*    " status=400; tried 1/6 times for 0 milliseconds total
2022-02-10T19:46:28 fastq-dump.2.10.2 info: HTTP read failure: URL="GET /sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd HTTP/1.1  Host: storage.googleapis.com  Cache-Control: no-cache, no-store, max-age=0, no-transform, must-revalidate  Expires: 0  Pragma: no-cache  Range: bytes=0-262143  User-Agent: linux64 ncbi-vdb.2.10.2 (phid=bGc336a513,libc=2.28)  Accept: */*    " status=400; tried 2/6 times for 5 milliseconds total
2022-02-10T19:46:28 fastq-dump.2.10.2 info: HTTP read failure: URL="GET /sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd HTTP/1.1  Host: storage.googleapis.com  Cache-Control: no-cache, no-store, max-age=0, no-transform, must-revalidate  Expires: 0  Pragma: no-cache  Range: bytes=0-262143  User-Agent: linux64 ncbi-vdb.2.10.2 (phid=bGc336a513,libc=2.28)  Accept: */*    " status=400; tried 3/6 times for 15 milliseconds total
2022-02-10T19:46:28 fastq-dump.2.10.2 info: HTTP read failure: URL="GET /sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd HTTP/1.1  Host: storage.googleapis.com  Cache-Control: no-cache, no-store, max-age=0, no-transform, must-revalidate  Expires: 0  Pragma: no-cache  Range: bytes=0-262143  User-Agent: linux64 ncbi-vdb.2.10.2 (phid=bGc336a513,libc=2.28)  Accept: */*    " status=400; tried 4/6 times for 30 milliseconds total
2022-02-10T19:46:28 fastq-dump.2.10.2 info: HTTP read failure: URL="GET /sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd HTTP/1.1  Host: storage.googleapis.com  Cache-Control: no-cache, no-store, max-age=0, no-transform, must-revalidate  Expires: 0  Pragma: no-cache  Range: bytes=0-262143  User-Agent: linux64 ncbi-vdb.2.10.2 (phid=bGc336a513,libc=2.28)  Accept: */*    " status=400; tried 5/6 times for 60 milliseconds total
2022-02-10T19:46:29 fastq-dump.2.10.2 info: HTTP read failure: URL="GET /sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd HTTP/1.1  Host: storage.googleapis.com  Cache-Control: no-cache, no-store, max-age=0, no-transform, must-revalidate  Expires: 0  Pragma: no-cache  Range: bytes=0-262143  User-Agent: linux64 ncbi-vdb.2.10.2 (phid=bGc336a513,libc=2.28)  Accept: */*    " status=400; tried 6/6 times for 120 milliseconds total
2022-02-10T19:46:29 fastq-dump.2.10.2 info: HTTP read failure: URL="https://storage.googleapis.com/sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd" status=400; tried 1/6 times for 0 milliseconds total
2022-02-10T19:46:29 fastq-dump.2.10.2 info: HTTP read failure: URL="https://storage.googleapis.com/sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd" status=400; tried 2/6 times for 5 milliseconds total
2022-02-10T19:46:29 fastq-dump.2.10.2 info: HTTP read failure: URL="https://storage.googleapis.com/sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd" status=400; tried 3/6 times for 15 milliseconds total
2022-02-10T19:46:29 fastq-dump.2.10.2 info: HTTP read failure: URL="https://storage.googleapis.com/sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd" status=400; tried 4/6 times for 30 milliseconds total
2022-02-10T19:46:30 fastq-dump.2.10.2 info: HTTP read failure: URL="https://storage.googleapis.com/sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd" status=400; tried 5/6 times for 60 milliseconds total
2022-02-10T19:46:30 fastq-dump.2.10.2 info: HTTP read failure: URL="https://storage.googleapis.com/sra-pub-run-7/SRR000001/SRR000001.4?ncbi_phid=322C52470B6620C5000060645DE5D7DA.1.1%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256&X-Goog-Credential=data-access-service%40nih-sra-datastore.iam.gserviceaccount.com%2F20220210%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20220210T194628Z&X-Goog-Expires=360000&X-Goog-SignedHeaders=host&userProject=nih-sra-datastore&X-Goog-Signature=92789ab0e7cc108b859e08864c47d18b5fd7324f0c1924735e47aafd37f0be085c3bd053963dd03fab768fbe0eadd90a140b86a17219aad178769855b0c840598b0a6dd6d5a4e07c02b87c5b12a7e9d2ee79bd6adf5db9ba2ad083bfd9ac8c2033df3b7e87f74c5d4b99e7ed74869a2bd1fc2a4d7379c65392fe549e26b21a253d75fc461df800f47d6f1db7a4db3b646c5dd8766b0adb76718000d0bd35bf6d3abbe4fd7605890f78c6918f63844759b2ee8f828475ff616a42145f5ec059b08d7ed4a530eda440db965df29dd94d86c08c55bacb971a0f5a344b261f7f775552370825c3fc1af1cdd0f8bcf26298c109226603ae62890d38daa04433646dfd" status=400; tried 6/6 times for 120 milliseconds total
2022-02-10T19:48:07 fastq-dump.2.10.2 err: item not found while constructing within virtual database module - the path 'SRR000001' cannot be opened as database or table
fastq-dump (PID 581) quit with error code 3

Best, Eitan Yaffe
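One detail visible in the fastq-dump 2.10.2 log above: the request's query string starts with `?ncbi_phid=...` and the next parameter appears as `%3FX-Goog-Algorithm%3D...`, that is, with its `?` and `=` percent-encoded. Parsed as a query string, `X-Goog-Algorithm` then becomes part of the `ncbi_phid` value rather than a parameter of its own. The thread does not confirm that this is what produces the `status=400`, so the following is only a sketch of how to check a logged URL for that pattern. The helper names are hypothetical, and the list of parameters is the set used by Google Cloud Storage V4 signed URLs.

```python
from urllib.parse import parse_qs, urlsplit

# Query parameters that a Google Cloud Storage V4 signed URL carries.
REQUIRED_V4_PARAMS = (
    "X-Goog-Algorithm",
    "X-Goog-Credential",
    "X-Goog-Date",
    "X-Goog-Expires",
    "X-Goog-SignedHeaders",
    "X-Goog-Signature",
)


def missing_signed_url_params(url):
    """List the V4 signing parameters that are absent as standalone query keys."""
    params = parse_qs(urlsplit(url).query)
    return [name for name in REQUIRED_V4_PARAMS if name not in params]


def params_with_embedded_query(url):
    """Return parameters whose decoded value contains a '?', a sign of a nested query."""
    params = parse_qs(urlsplit(url).query)
    return {key: value for key, values in params.items() for value in values if "?" in value}
```

Run against a URL shaped like the one in the log, the first helper reports `X-Goog-Algorithm` as missing and the second shows it embedded in `ncbi_phid`.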

durbrow commented 2 years ago

Are you still affected by this? I was told it was a temporary condition.

eitanyaffe commented 2 years ago

Thanks for looking into this! I just checked (on version 2.11.3) and the problem persists.

It turns out that if I disable the 'report cloud instance identity' option through vdb-config, I can download files. However, they seem to be downloaded from Amazon S3 rather than from Google Cloud Storage buckets:

sa_113045741138414598170@instance-1:~$ fasterq-dump -v -v SRR000001
Preference setting is: Prefer SRA Normalized Format files with full base quality scores if available.
2022-02-17T20:31:24 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to metadata.google.internal (169.254.169.254) 
2022-02-17T20:31:24 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:31:24 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
SRR000001 is an SRA Normalized Format file with full base quality scores.
2022-02-17T20:31:24 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:24 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (54.231.133.105) 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:25 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:26 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:31:31 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to sra-pub-run-odp.s3.amazonaws.com (52.216.99.19) 

With the 'report cloud instance identity' flag enabled I get this:

sa_113045741138414598170@instance-1:~$ fasterq-dump -v -v SRR000001
Preference setting is: Prefer SRA Normalized Format files with full base quality scores if available.
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to metadata (169.254.169.254) 
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to metadata (169.254.169.254) 
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
SRR000001 is an SRA Normalized Format file with full base quality scores.
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:12 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113) 
2022-02-17T20:34:13 fasterq-dump.2.11.3: KClientHttpOpen - verifying CA cert 
2022-02-17T20:34:13 fasterq-dump.2.11.3 err: transfer incomplete while reading file within network system module - error with https open 'https://locate.ncbi.nlm.nih.gov/sdlr/sdlr.fcgi?jwt=eyJhbGciOiJSUzI1NiIsImtpZCI6InNkbHJraWQxIiwidHlwIjoiSldUIn0.eyJhY2MiOiJTUlIwMDAwMDEiLCJleHAiOjE2NDUxNDQ0NTIsImZpbGVTaXplIjoiMzEyNTI3MDgzIiwiaWF0IjoxNjQ1MTMwMDUyLCJqdGkiOiI1YzA2NjJmOS0yYWUzLTQyMmEtODJmYy0xNWNlNzI4YmNhNTUiLCJsaW5rIjoiaHR0cHM6Ly9zdG9yYWdlLmdvb2dsZWFwaXMuY29tL3NyYS1wdWItcnVuLTcvU1JSMDAwMDAxL1NSUjAwMDAwMS40IiwicmVnaW9uIjoidXMiLCJzZXJ2aWNlIjoiZ3MiLCJzaWduaW5nQWNjb3VudCI6InNyYV9ncyIsInRpbWVvdXQiOjE0NDAwfQ.E7mQlKLM0I_f_Vk2lSg4ghyLFjFgHorBCj9BUYArkfUIwW_hHlYnwUsho7fQAN7McVWgKEz4VqEv6RsECLSJuhEPcYAEglXIpr3yafJEGrvyuKQMIW5saRFhvd0VH-72xzm96QAsMBbFsG1FcR43mFDZ6Ts8uHhmr3L1VwUTA58w6hW0F8Tgv3eADwdNSbqzSQROU6YK3X4bMf_AbZ06D_8qZ0bmV5o6Z2MwMvo2VGuC8Emnz_CQYORWMZitc61iYzZe4x1cbLXT9NLUD1wcdWzxqdndk3K2dnLDm1mk54wfZkPpc8lUu9G7eqJR_sEuPY03ZKlBBi9GjKkfDaxbYQ&ncbi_phid=D0BD0BE048EAE375000051D7BC74186D.1.1'
2022-02-17T20:34:13 fasterq-dump.2.11.3 err: invalid accession 'SRR000001'
fasterq-dump quit with error code 3

As far as I can tell, downloading from GCP buckets is what is failing. This is an important feature: it saves money and time, and it is critical when scaling up workflows.

Let me know if I can assist further in any way.

Eitan
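The two `-v -v` logs in the comment above differ mainly in which hosts the tool connects to: with identity reporting disabled the data connections go to sra-pub-run-odp.s3.amazonaws.com, while with it enabled only the metadata server and locate.ncbi.nlm.nih.gov appear before the failure. For longer logs, a small tally makes this easier to see. This is a minimal sketch that assumes the `KClientHttpOpen - connected from ... to HOST (IP)` line format shown above; `count_hosts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... KClientHttpOpen - connected from '10.128.0.14' to locate.ncbi.nlm.nih.gov (130.14.29.113)
CONNECT_RE = re.compile(r"KClientHttpOpen - connected from '[^']*' to (\S+) \(")


def count_hosts(log_text):
    """Count how many connections were opened to each remote host in a verbose log."""
    return Counter(CONNECT_RE.findall(log_text))
```

Feeding it the saved output of `fasterq-dump -v -v` gives a per-host count, which shows at a glance whether any connection to storage.googleapis.com was attempted.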

klymenko commented 2 years ago

We are investigating this issue. Meanwhile, run vdb-config -i, uncheck "report cloud instance identity", and rerun your command.
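For scripted VM setups where the interactive screen is inconvenient, the same setting can probably be changed non-interactively. The sketch below assumes that the installed vdb-config accepts a `--report-cloud-identity yes|no` option; that option is not mentioned in this thread, so check `vdb-config --help` for your version before relying on it. The function names are hypothetical.

```python
import shutil
import subprocess


def cloud_identity_command(report):
    """Build the vdb-config invocation that turns cloud identity reporting on or off.

    Assumes vdb-config supports `--report-cloud-identity yes|no` (verify with --help).
    """
    return ["vdb-config", "--report-cloud-identity", "yes" if report else "no"]


def set_cloud_identity(report):
    """Run the command if vdb-config is on PATH; return True only if it exited with 0."""
    if shutil.which("vdb-config") is None:
        return False
    result = subprocess.run(cloud_identity_command(report), check=False)
    return result.returncode == 0
```

Calling `set_cloud_identity(False)` before prefetch or fasterq-dump would mirror the manual step of unchecking the box.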

eitanyaffe commented 2 years ago

Thanks Andrew! That works, and is a good solution until this issue is resolved.

wresch commented 2 years ago

I have a possibly similar problem with fasterq-dump. Here are the specifics:

Test:

fasterq-dump -L debug --ngc /prj_phs710EA_test.ngc  -O /data/$USER/temp/test   SRR1219902

Output without debug logging (the failure always occurs after approx. 10 minutes):

2022-03-21T15:58:04 fasterq-dump.3.0.0 err: data unexpected while reading file within network system module - Cannot KStblHttpFileTimedReadChunked
2022-03-21T16:07:51 fasterq-dump.3.0.0 fatal: SIGNAL - 8                                                                                          
fasterq-dump quit with error code 1                                                                                                               

Output with debug logging:

2022-03-21T16:59:31 fasterq-dump.3.0.0 debug: path not found while opening node within configuration module - no image guid                               
2022-03-21T16:59:31 fasterq-dump.3.0.0 debug: path not found while opening node within configuration module - no image guid                               
2022-03-21T16:59:32 fasterq-dump.3.0.0 debug: requesting guard size 1038336, default was 4096                                                             
2022-03-21T16:59:32 fasterq-dump.3.0.0 info: HTTP read failure: URL="" status=403; tried 1/6 times for 0 milliseconds total                               
2022-03-21T16:59:32 fasterq-dump.3.0.0 info: HTTP read failure: URL="" status=403; tried 2/6 times for 5 milliseconds total                               
2022-03-21T16:59:32 fasterq-dump.3.0.0 info: HTTP read failure: URL="" status=403; tried 3/6 times for 15 milliseconds total                              
2022-03-21T16:59:32 fasterq-dump.3.0.0 info: HTTP read failure: URL="" status=403; tried 4/6 times for 30 milliseconds total                              
2022-03-21T16:59:32 fasterq-dump.3.0.0 info: HTTP read failure: URL="" status=403; tried 5/6 times for 60 milliseconds total                              
2022-03-21T16:59:32 fasterq-dump.3.0.0 info: HTTP read failure: URL="" status=403; tried 6/6 times for 120 milliseconds total                             
2022-03-21T16:59:33 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 1/6 times for 0 milliseconds total                                                       
2022-03-21T16:59:33 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 2/6 times for 5 milliseconds total                                                       
2022-03-21T16:59:33 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 3/6 times for 15 milliseconds total                                                      
2022-03-21T16:59:34 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 4/6 times for 30 milliseconds total                                                      
2022-03-21T16:59:34 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 5/6 times for 60 milliseconds total                                                      
2022-03-21T16:59:34 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 6/6 times for 120 milliseconds total                                                     
2022-03-21T16:59:35 fasterq-dump.3.0.0 info: HTTP read failure: URL="https://sra-ca-run-odp.s3.amazonaws.com/sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz
-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T165932Z&X-Amz-Expires=360
000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA84D50000327D410FF938.1.1&project_id=0&x-amz-request-payer=requester&X-Amz-Signature=41857e41bb7989fba5a
34cd48f6c946b6f130fe3ec77101a624dcf84b856b3da" status=403; tried 1/6 times for 0 milliseconds total                                                       
[...much more...]
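The excerpt shows every request being retried six times with status 403 and cumulative waits of 0 to 120 milliseconds. For a long log, a small parser can make that pattern easier to see. This is a rough Python sketch: the function names are hypothetical, and the regular expression is based only on the line format visible in this excerpt.

```python
import re
from collections import Counter

# Matches the "HTTP read failure" lines as they appear in the excerpt above.
_FAILURE = re.compile(
    r'HTTP read failure: URL="(?P<url>[^"]*)" status=(?P<status>\d+); '
    r"tried (?P<attempt>\d+)/(?P<max>\d+) times for (?P<total_ms>\d+) milliseconds total"
)


def parse_failures(log_text: str) -> list:
    """Extract one record per 'HTTP read failure' line."""
    # The terminal wrapped long URLs across lines, so drop newlines before matching.
    joined = log_text.replace("\n", "")
    return [
        {
            "url": m.group("url"),
            "status": int(m.group("status")),
            "attempt": int(m.group("attempt")),
            "max_attempts": int(m.group("max")),
            "total_ms": int(m.group("total_ms")),
        }
        for m in _FAILURE.finditer(joined)
    ]


def status_counts(rows: list) -> Counter:
    """Count how many failures were reported per HTTP status code."""
    return Counter(row["status"] for row in rows)


if __name__ == "__main__":
    import sys

    # Usage: fasterq-dump -L debug ... 2>&1 | python summarize_failures.py
    print(status_counts(parse_failures(sys.stdin.read())))
```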

On the same machine, when I run curl --trace-ascii ascii.trace on one of the URLs above, I get a reasonable-looking interaction:

== Info: About to connect() to proxy dtn01-e0 port 3128 (#0)                                    
== Info:   Trying 10.1.200.237...                                                               
== Info: Connected to dtn01-e0 (10.1.200.237) port 3128 (#0)                                    
== Info: Establish HTTP proxy tunnel to sra-ca-run-odp.s3.amazonaws.com:443                     
=> Send header, 154 bytes (0x9a)                                                                
0000: CONNECT sra-ca-run-odp.s3.amazonaws.com:443 HTTP/1.1                                      
0036: Host: sra-ca-run-odp.s3.amazonaws.com:443                                                 
0061: User-Agent: curl/7.29.0                                                                   
007a: Proxy-Connection: Keep-Alive                                                              
0098:                                                                                           
<= Recv header, 37 bytes (0x25)                                                                 
0000: HTTP/1.1 200 Connection established                                                       
<= Recv header, 2 bytes (0x2)                                                                   
0000:                                                                                           
== Info: Proxy replied OK to CONNECT request                                                    
== Info: Initializing NSS with certpath: sql:/etc/pki/nssdb                                     
== Info:   CAfile: /etc/pki/tls/certs/ca-bundle.crt                                             
  CApath: none                                                                                  
== Info: SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256                             
== Info: Server certificate:                                                                    
== Info:        subject: CN=*.s3.amazonaws.com,O="Amazon.com, Inc.",L=Seattle,ST=Washington,C=US
== Info:        start date: Dec 13 00:00:00 2021 GMT                                            
== Info:        expire date: Dec 13 23:59:59 2022 GMT                                           
== Info:        common name: *.s3.amazonaws.com                                                 
== Info:        issuer: CN=DigiCert Baltimore CA-2 G2,OU=www.digicert.com,O=DigiCert Inc,C=US   
=> Send header, 493 bytes (0x1ed)                                                               
0000: GET /sra/phs000710.c99/SRR1219902/SRR1219902?X-Amz-Algorithm=AWS                          
0040: 4-HMAC-SHA256&X-Amz-Credential=AKIA6PM54Q3MA6LQ4LOY%2F20220321%2                          
0080: Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220321T170539Z&X-Amz                          
00c0: -Expires=360000&X-Amz-SignedHeaders=host&ncbi_phid=939B8C751BDA8                          
0100: 4D500002F8149EE8E92.1.1&project_id=0&x-amz-request-payer=request                          
0140: er&X-Amz-Signature=ce160a28451d31b0a4a6c95bfe6fa9613df581232414e                          
0180: 22b6bd7a32a97981a0c HTTP/1.1                                                              
019e: User-Agent: curl/7.29.0                                                                   
01b7: Host: sra-ca-run-odp.s3.amazonaws.com                                                     
01de: Accept: */*                                                                               
01eb:                                                                                           
<= Recv header, 17 bytes (0x11)                                                                 
0000: HTTP/1.1 200 OK                                                                           
<= Recv header, 90 bytes (0x5a)                                                                 
0000: x-amz-id-2: Q1wGTX31cYvCyTpiQe9vg5lvbK0RYtihMxC2qHMHtmgtE3ZHUX8Q                          
0040: qxu6iWPKL8eccK3iX148x1s=                                                                  
<= Recv header, 36 bytes (0x24)                                                                 
0000: x-amz-request-id: R2PR423HKWJF8MAA                                                        
<= Recv header, 37 bytes (0x25)                                                                 
0000: Date: Mon, 21 Mar 2022 17:53:22 GMT                                                       
<= Recv header, 46 bytes (0x2e)                                                                 
0000: Last-Modified: Thu, 20 Jan 2022 23:19:59 GMT                                              
<= Recv header, 47 bytes (0x2f)                                                                 
0000: ETag: "592a94fd99b555b1180d000b916d913d-1773"                                             
<= Recv header, 24 bytes (0x18)                                                                 
0000: x-amz-tagging-count: 1                                                                    
<= Recv header, 22 bytes (0x16)                                                                 
0000: Accept-Ranges: bytes                                                                      
<= Recv header, 35 bytes (0x23)                                                                 
0000: Content-Type: binary/octet-stream                                                         
<= Recv header, 18 bytes (0x12)                                                                 
0000: Server: AmazonS3                                                                          
<= Recv header, 29 bytes (0x1d)                                                                 
0000: Content-Length: 14868512697                                                               
<= Recv header, 2 bytes (0x2)                                         
0000:                                                                 
<= Recv data, 1645 bytes (0x66d)                                      
0000: NCBI.sra.........'...........'......O.q...lock,.ES....$......md.
0040: .ES....m........"......cur..ES....$.....&......`!........md5..ES
0080: ....$............).........tbl..DS....m.........'....*.....PRIMA
00c0: RY_ALIGNMENT..ES....m....................col..ES....m........p..
0100: ...*.P.x.........&.K...GLOBAL_REF_START..ES....m.............#EX
0140: {....data..ES....$......$......4Z.......idx..ES....$............
[...much more...]

klymenko commented 2 years ago

@wresch, send questions regarding access to protected data to sra-tools@ncbi.nlm.nih.gov

wresch commented 2 years ago

OK - I'll try that. Thanks @klymenko

klymenko commented 2 years ago

@eitanyaffe, the issue is fixed. Please verify.

eitanyaffe commented 2 years ago

@klymenko -- I checked and it works well. Thank you so much for resolving this issue!

swati-mahapatra commented 2 years ago

"We are investigating this issue. Meanwhile run vdb-config -i, uncheck report cloud instance identity and rerun your command."

Thank you! This worked for me! My commands were prefetch -v SRR000001 followed by fastq-dump -v SRR000001