Unable to download MetaSUB data using Pangea

Hello,

I’m struggling to download the MetaSUB clean reads. I have tried several approaches, but none of them worked properly.

With Pangea, I’m facing this issue: if I use the “MetaSUB” group name, fastq.gz files (2240 x 2) are downloaded into 2240 directories. Some of them are corrupted, but that is not the main problem.

pangea-api download sample-results  -e 'pierre.peterlongo@inria.fr'  --module-name 'cap2::clean_reads' 'MetaSUB Consortium' 'MetaSUB'
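The corrupted files mentioned above can be located by decompressing every downloaded file to the end. This is a minimal sketch using only the Python standard library. The function names `is_valid_gzip` and `find_corrupt_gzip` are hypothetical, and it assumes the downloads sit somewhere under a single root directory and end in `.fastq.gz`:

```python
import gzip
import zlib
from pathlib import Path


def is_valid_gzip(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end so the trailing CRC and length are checked too.
            while handle.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # BadGzipFile is a subclass of OSError; truncated files raise EOFError.
        return False


def find_corrupt_gzip(root):
    """List every *.fastq.gz under `root` (assumed layout) that fails the check."""
    return sorted(p for p in Path(root).rglob("*.fastq.gz") if not is_valid_gzip(p))
```

Calling `find_corrupt_gzip(".")` from the download directory would return the paths that need to be fetched again.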

However, the main issue arises when trying to download other groups, such as Doha in this example:

pangea-api download sample-results  -e 'pierre.peterlongo@inria.fr'  --module-name "raw::raw_reads" "MetaSUB Consortium" "MetaSUB Doha"

Nothing happens; the program stops after a few seconds.

I also tried the command provided at https://pangeabio.io/sample-groups/15824754-8b3c-4033-8ba3-b153e77fa47f/downloads

pangea-api download sample-results -e 'pierre.peterlongo@inria.fr' --module-name 'cap2::clean_reads' 'MetaSUB Consortium'  '15824754-8b3c-4033-8ba3-b153e77fa47f'

Here the program stops with the following error:

pangea_api.remote_object.RemoteObjectOverwriteError: Loading blob would overwrite field "name":
        current: "15824754-8b3c-4033-8ba3-b153e77fa47f" (type: "<class 'str'>")
        new:     "MetaSUB Doha" (type: "<class 'str'>")

Finally, I tried the solution mentioned here: https://gist.github.com/dcdanko/c70304e5eb9c20fc81111929598edda0 (linked from https://www.pangeabio.io/docs/how-to-download-data), replacing ‘https://pangea.gimmebio.com’ with ‘https://pangeabio.io’, but nothing happens with any group name other than “MetaSUB” (from what I tested).

Do you have any recommendation on how I could download the data, ideally all the data listed in this table: https://github.com/ratschlab/metagraph_paper_resources/blob/master/data_tables/TableS6_MetaSUB.csv.gz ?
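To compare what was actually downloaded against that table, the identifiers could be pulled out of the gzip-compressed CSV with a few lines of Python. This is only a sketch: `read_column` is a hypothetical helper, and the column name holding the sample identifiers is an assumption that has to be checked against the header of the real file.

```python
import csv
import gzip


def read_column(table_path, column):
    """Return the non-empty values of `column` from a gzip-compressed CSV file."""
    with gzip.open(table_path, "rt", newline="") as handle:
        reader = csv.DictReader(handle)
        # `column` is an assumed header name; rows without it are skipped.
        return [row[column] for row in reader if row.get(column)]
```

The resulting list can then be checked against the names of the downloaded directories to see which samples are still missing.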

Thanks a lot for your work and your time,
