MetaSUB / metasub_utils

Unable to download MetaSUB data using Pangea #16

Open pierrepeterlongo opened 1 year ago

Hello,

I’m struggling to download the MetaSUB clean reads. I’ve tried several approaches, but none worked properly.

With Pangea, I’m facing the following issue. If I use the “MetaSUB” group name, fastq.gz files (2240 x 2) are downloaded into 2240 directories. Some of them are corrupted, but that is not the main problem.

pangea-api download sample-results  -e 'pierre.peterlongo@inria.fr'  --module-name 'cap2::clean_reads' 'MetaSUB Consortium' 'MetaSUB'
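For the corrupted files mentioned above, here is a minimal sketch of how they could be identified after the download. It uses only the Python standard library and is not part of pangea-api. The function names `is_gzip_intact` and `find_corrupted`, as well as the `*.fastq.gz` pattern and the directory layout, are assumptions made for illustration.

```python
import gzip
import zlib
from pathlib import Path


def is_gzip_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in chunks so truncation or a CRC mismatch is detected.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError; truncated files raise EOFError.
        return False
    return True


def find_corrupted(root):
    """Return the sorted paths of *.fastq.gz files under root that fail the check."""
    return sorted(p for p in Path(root).rglob("*.fastq.gz") if not is_gzip_intact(p))
```

Calling `find_corrupted(".")` from the download directory would list the files that need to be fetched again. This only checks that each file decompresses fully; it does not validate the FASTQ records themselves.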

However, the main issue arises when trying to download the other groups, such as Doha in this example:

pangea-api download sample-results  -e 'pierre.peterlongo@inria.fr'  --module-name "raw::raw_reads" "MetaSUB Consortium" "MetaSUB Doha"

Nothing happens; the program stops after a few seconds.

I also tried the command provided at https://pangeabio.io/sample-groups/15824754-8b3c-4033-8ba3-b153e77fa47f/downloads:

pangea-api download sample-results -e 'pierre.peterlongo@inria.fr' --module-name 'cap2::clean_reads' 'MetaSUB Consortium'  '15824754-8b3c-4033-8ba3-b153e77fa47f'   

Here the program stops with the following error:

pangea_api.remote_object.RemoteObjectOverwriteError: Loading blob would overwrite field "name":
        current: "15824754-8b3c-4033-8ba3-b153e77fa47f" (type: "<class 'str'>")
        new:     "MetaSUB Doha" (type: "<class 'str'>")

Finally, I’ve tried the solution mentioned here: https://gist.github.com/dcdanko/c70304e5eb9c20fc81111929598edda0 (linked from https://www.pangeabio.io/docs/how-to-download-data), changing ‘https://pangea.gimmebio.com’ to ‘https://pangeabio.io’, but nothing happens with any group name other than “MetaSUB” (from what I tested).

Do you have any recommendations on how I could download the data, ideally all the data listed in this table: https://github.com/ratschlab/metagraph_paper_resources/blob/master/data_tables/TableS6_MetaSUB.csv.gz ?

Thanks a lot for your work and your time,