IGS / portal_client

Python-based client for downloading data made available through portals powered by the GDC-based portal system.
MIT License
Error using manifest file, error message: Skipping file ID <file_id> as none of the URLs ['FASP'] succeeded. #15

Open Yingcui2018 opened 4 years ago

Yingcui2018 commented 4 years ago

Hi, I am running the portal_client tool (with my iHMP username and password) to download files using a manifest file, as follows:

    portal_client -m hmp_cart_e0473170e2.tsv --endpoint-priority FASP --user username

The content of the hmp_cart_e0473170e2.tsv file is:

    file_id md5 size urls sample_id
    7cfd74d6803ea319683df7564459a079 ab27b58083c30b884936ee396b0ee223 29585 fasp://aspera.ihmpdcc.org/ibd/genome/microbiome/wgs/analysis/hmscp/MSM9VZMS_taxonomic_profile.biom 7cfd74d6803ea319683df7564459673c
    ......

However, I get the following error messages:

    Skipping file ID 8ef4204e3058e7d0f909de1f96016f5a as none of the URLs ['FASP'] succeeded.
    Skipping file ID 8ef4204e3058e7d0f909de1f9600e0c7 as none of the URLs ['FASP'] succeeded.
    Skipping file ID 8ef4204e3058e7d0f909de1f9600b107 as none of the URLs ['FASP'] succeeded.
    Skipping file ID 8ef4204e3058e7d0f909de1f9601599d as none of the URLs ['FASP'] succeeded.
    ......
    Not all files (total of 81) were downloaded successfully. Number of failures:
    0 -- no valid URL in the manifest file
    81 -- URL is present in manifest, but not accessible at the location specified
    0 -- MD5 checksum failed for file (file is corrupted or the wrong MD5 is associated)

Then I added the --debug parameter, and the output is:

    2019-12-09 15:18:19,871 - root - DEBUG - Creating ManifestProcessor.
    2019-12-09 15:18:19,895 - boto - DEBUG - Retrieving credentials from metadata server.
    2019-12-09 15:18:20,904 - boto - ERROR - Caught exception reading instance data
    ....
    2019-12-09 15:25:43,569 - aspera - ERROR - Aspera authentication failure.
    2019-12-09 15:25:43,569 - manifest_processor.ManifestProcessor - ERROR - Aspera transfer failed.
    2019-12-09 15:25:43,570 - manifest_processor.ManifestProcessor - DEBUG - Returning error

Can you please check if my command is correct? Thanks, yingc

mstambou commented 4 years ago

Hi @Yingcui2018, did you figure out what the problem is? I ran into the same problem. I can't believe such a big project has come up with such a horrible API for downloading the files, overcomplicating things and making these datasets almost unusable.

clarkjs237 commented 4 years ago

Hi @Yingcui2018 and @mstambou, did either of you figure out the solution to this? I'm experiencing the same issue and it's really frustrating.

mstambou commented 4 years ago

Hi @clarkjs237, not yet, unfortunately. I agree it is very frustrating. There's a known issue with Aspera where you have to change every "aspera2" to "aspera" in the manifest. However, even after changing that I was still getting the same error and nothing was downloaded. I contacted them; apparently they have some problems with the database and are working to fix them. In my case I am trying to download proteomics datasets, but the file size column in the manifest is zero for some reason. I do not know whether this is the case for the other datasets as well (i.e. genomics, transcriptomics, metabolomics, etc.). What type of datasets did you try to download?
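
The manifest edit described in this comment (replacing "aspera2" with "aspera" in the URLs) can be scripted rather than done by hand. Below is a minimal Python sketch. It assumes the manifest is tab-separated with a urls header column, as in the example manifest posted earlier in the thread; the function name replace_aspera2 is hypothetical and not part of portal_client. Note that commenters in this thread report that this substitution alone did not resolve the error.

```python
def replace_aspera2(manifest_text):
    """Return the manifest text with 'aspera2' replaced by 'aspera' in the urls column.

    Assumes a tab-separated manifest whose first line is a header
    containing a 'urls' column (an assumption based on the thread's example).
    """
    lines = manifest_text.splitlines()
    header = lines[0].split("\t")
    url_col = header.index("urls")  # raises ValueError if there is no urls column

    out = [lines[0]]
    for line in lines[1:]:
        fields = line.split("\t")
        # Only touch rows that actually have a urls field; leave others unchanged.
        if len(fields) > url_col:
            fields[url_col] = fields[url_col].replace("aspera2", "aspera")
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"
```

To apply it, read the manifest file into a string, pass it through the function, and write the result to a new file so the original manifest is kept.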

clarkjs237 commented 4 years ago

Hi @mstambou. I tried changing all the aspera2's to aspera in the manifest, but there was no change: same error. I've been trying to download both FASTQ files and Biological Observation Matrix files as FASP files. I've been able to successfully download some other data as HTTP files, but it's not the data I'm looking for. I'm new to working with cloud storage and Aspera, and I have found reports elsewhere of people trying to solve this same problem. I've been debugging and searching for every error thrown, but ultimately I don't really know what to do.

mstambou commented 4 years ago

I see, yes, this really sucks. I don't understand why they overcomplicated things when simple FTP sites work great. They made an API portal which is broken and defeats the whole purpose.

ManarRashad commented 4 years ago

Hi all, has anyone found a solution? I have tried a lot to download the Biological Observation Matrix files, and this error always appears:

    Not all files (total of 2308) were downloaded successfully. Number of failures:
    0 -- no valid URL in the manifest file
    5 -- URL is present in manifest, but not accessible at the location specified
    10 -- MD5 checksum failed for file (file is corrupted or the wrong MD5 is associated)

I have tried many times, and each time not all files are downloaded: only 2207 out of 2308 are downloaded. By the way, the rows in my manifest file look like this:

    9dc112963819ad139611820d22966777 2821344a48240f0fb288e6f0f016d7c5 86000 fasp://aspera.ihmpdcc.org/t2d/genome/microbiome/16s/analysis/hmqcp/HMP2_J79630_1_NS_T0_B0_0120_ZRLZ98T-6031_B9GWL.biom 5a950f27980b5d93e4c16da124908ee3

Can anyone help?
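
To find out which manifest entries are behind the failure counts reported here, the downloaded files can be compared against the manifest's md5 column. Below is a rough Python sketch. It assumes a tab-separated manifest with file_id, md5, and urls header columns (as in the manifest shown at the top of the thread). It also assumes, hypothetically, that each file was saved in a single directory under the last path component of its first URL, and that multiple URLs are comma-separated. portal_client may lay files out differently, so the path logic may need adjusting. The names md5_of and check_downloads are illustrative only.

```python
import csv
import hashlib
import os
from urllib.parse import urlparse


def md5_of(path, chunk_size=1 << 20):
    """Compute the hex MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(manifest_path, download_dir):
    """Return (missing, mismatched) lists of file_ids from the manifest.

    Assumption: each file is stored in download_dir under the basename
    of the first URL listed in the 'urls' column.
    """
    missing, mismatched = [], []
    with open(manifest_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            first_url = row["urls"].split(",")[0]
            name = os.path.basename(urlparse(first_url).path)
            local = os.path.join(download_dir, name)
            if not os.path.isfile(local):
                missing.append(row["file_id"])
            elif md5_of(local) != row["md5"]:
                mismatched.append(row["file_id"])
    return missing, mismatched
```

The two returned lists could then be used to build a smaller manifest containing only the failed entries and retry just those.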

oakeseh commented 2 years ago

I am also experiencing the same issue.

annecarolm commented 4 months ago

Same issue! Anyone got help yet?