IGS / portal_client

Python-based client for downloading data made available through portals powered by the GDC-based portal system.
MIT License

HTTPS links in manifest don't work, no other links present #30

Open ifirth opened 1 month ago

ifirth commented 1 month ago

Hello,

I am trying to download some HMP transcriptomic data, and the links provided in my manifest are giving me some trouble. I have tried downloading from other manifests, such as the example http_manifest.tsv and a small metagenomic manifest. The http_manifest.tsv works, and the metagenomic manifest works, but only using the S3 link. My problem is that the manifest I am trying to download has only HTTPS links. I have seen in other issues (#27, #28) that this is a problem others have encountered. I have also tried the single-file download solutions mentioned in those issues, with the same result: no download and error 403. I understand that the HTTPS links are retired; I am just unclear on how to obtain other links when none are provided in the manifest file.

I am wondering if there is a way to get the S3 or other link type from the https links that I have, or how I can otherwise access these files. I have provided a subset of my manifest below.

file_id md5 size urls sample_id
b60906da3dbfec7c28e20dcb46029297 26accfb3cce5e772444344642d5d8f94 9862912000 https://downloads.hmpdacc.org/ihmp/t2d/transcriptome/microbiome/raw/HMP2_J17759_R_ST_T0_B0_0120_ZOZOW1T-2001_HC7NYBGXX.raw.fastq.tar d57eb430d669de8329be1769d4d8633f
b60906da3dbfec7c28e20dcb4600cd4f efbb51c9d011b15425084c5ac43c8d60 611819520 https://downloads.hmpdacc.org/ihmp/t2d/transcriptome/microbiome/raw/HMP2_J27246_R_ST_T0_B0_0120_ZOZOW1T-62_AJHDT.raw.fastq.tar 932d8fbc70ae8f856028b3f67c534b57
b60906da3dbfec7c28e20dcb4600ec30 8718fcf687883e3857e59ef9b6141cda 661493760 https://downloads.hmpdacc.org/ihmp/t2d/transcriptome/microbiome/raw/HMP2_J27246_R_ST_T0_B0_0120_ZOZOW1T-62_AGHU5.raw.fastq.tar 932d8fbc70ae8f856028b3f67c534b57
b60906da3dbfec7c28e20dcb46015950 23efb87d8179f9a39ecf47e29ad15fd6 948285440 https://downloads.hmpdacc.org/ihmp/t2d/transcriptome/microbiome/raw/HMP2_J27242_R_ST_T0_B0_0120_ZOZOW1T-58b_AGHU5.raw.fastq.tar 932d8fbc70ae8f856028b3f67c531418
b60906da3dbfec7c28e20dcb4601813c d066a655a4afb4bad5691ade1ddc6c71 1170452480 https://downloads.hmpdacc.org/ihmp/t2d/transcriptome/microbiome/raw/HMP2_J27242_R_ST_T0_B0_0120_ZOZOW1T-58b_AJHDT.raw.fastq.tar 932d8fbc70ae8f856028b3f67c531418

Thanks, Isaac

nsuvarnaiari commented 1 month ago

Hi Isaac,

The HMP and iHMP data are freely available for download from our AWS S3 bucket. Bucket name: hmpdcc

Here are the commands to run for the files you requested (NOTE: the component files of the tars will be downloaded, not the tars themselves):

aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J17759_R_ST_T0_B0_0120_ZOZOW1T-2001_HC7NYBGXX_S22_R1.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J17759_R_ST_T0_B0_0120_ZOZOW1T-2001_HC7NYBGXX_S22_R2.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27246_R_ST_T0_B0_0120_ZOZOW1T-62_AJHDT_L001_R1.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27246_R_ST_T0_B0_0120_ZOZOW1T-62_AJHDT_L001_R2.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27246_R_ST_T0_B0_0120_ZOZOW1T-62_AGHU5_L001_R1.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27246_R_ST_T0_B0_0120_ZOZOW1T-62_AGHU5_L001_R2.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27242_R_ST_T0_B0_0120_ZOZOW1T-58b_AGHU5_L001_R1.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27242_R_ST_T0_B0_0120_ZOZOW1T-58b_AGHU5_L001_R2.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27242_R_ST_T0_B0_0120_ZOZOW1T-58b_AJHDT_L001_R1.fastq.bz2 . --no-sign-request
aws s3 cp s3://hmpdcc/ihmp/t2d/microbiome/metatranscriptome/raw/HMP2_J27242_R_ST_T0_B0_0120_ZOZOW1T-58b_AJHDT_L001_R2.fastq.bz2 . --no-sign-request

Thanks, Suvvi

ifirth commented 1 month ago

Hi Suvvi, Thank you for providing the links for those files. My full manifest has 835 files in it. Is there any way to get all the files I am looking for without running 1670 versions of the command that you wrote above? Or to change the links into s3 links based on a pattern? Just looking for an efficient way to get all of my files in the manifest.

Thanks again, Isaac
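
One possible way to avoid writing each command by hand is sketched below. It is not a confirmed method from the maintainers. It rests on a pattern inferred only from the five manifest rows and ten commands in this thread: the HTTPS directory `ihmp/t2d/transcriptome/microbiome/raw` appears to correspond to `ihmp/t2d/microbiome/metatranscriptome/raw` in the `hmpdcc` bucket, and each component file name appears to start with the tar name minus `.raw.fastq.tar`. Because the suffixes differ between samples (`_S22_R1` versus `_L001_R1`), the sketch does not guess exact file names. It emits one `aws s3 cp --recursive` command per manifest row with an `--include` filter on the shared prefix. The function names `s3_command` and `commands_from_manifest` are hypothetical and are not part of portal_client.

```python
import csv
import io
import posixpath
from urllib.parse import urlparse

BUCKET = "hmpdcc"  # bucket name given in this thread

# ASSUMPTION: this directory rewrite is inferred from the example
# URL/command pairs in this thread and may not hold for other datasets.
OLD_DIR = "/ihmp/t2d/transcriptome/microbiome/raw"
NEW_DIR = "ihmp/t2d/microbiome/metatranscriptome/raw"
TAR_SUFFIX = ".raw.fastq.tar"


def s3_command(https_url):
    """Build one `aws s3 cp` command covering the component files of a tar URL."""
    directory, filename = posixpath.split(urlparse(https_url).path)
    if directory != OLD_DIR or not filename.endswith(TAR_SUFFIX):
        # Refuse to guess for URLs outside the pattern seen in the thread.
        raise ValueError(f"URL does not match the assumed pattern: {https_url}")
    stem = filename[: -len(TAR_SUFFIX)]
    # Exclude everything, then re-include only keys that start with the stem.
    return (
        f"aws s3 cp s3://{BUCKET}/{NEW_DIR}/ . --recursive "
        f'--exclude "*" --include "{stem}*" --no-sign-request'
    )


def commands_from_manifest(manifest_text):
    """Return one command per row of a tab-separated manifest with a `urls` column."""
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    return [s3_command(row["urls"]) for row in reader]
```

Reading the manifest file, passing its text to `commands_from_manifest`, and writing the returned lines to a shell script would give 835 commands instead of 1670. Each recursive copy lists the whole prefix before filtering, so this may be slow, and a stem that is a prefix of another sample's name would also pull in that sample's files. Checking the downloaded file list against the manifest is advisable.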