EBI-Metagenomics / MGnifyR

R package for searching, downloading and analysis of EBI MGnify metagenomics data
https://ebi-metagenomics.github.io/MGnifyR/
Artistic License 2.0

Can we download .fna files of all MAGs from species catalogues through MGnifyR? #4

Closed: yazhinia closed this issue 2 years ago

yazhinia commented 2 years ago

Hi, I would like to know whether any function exists to download the .fna files of all MAGs from the species catalogues. I can't find clear instructions for doing so in MGnifyR. Can you suggest how I could download those files?

Thanks very much in advance.

beadyallen commented 2 years ago

Hi @yazhinia. Unfortunately, the current implementation of MGnifyR doesn't include much functionality for MAG genomes. However, you could still use a combination of mgnify_query, mgnify_get_download_urls, and mgnify_download. For example, the following retrieves the first page of genomes from the API, finds the FASTA URLs, and then loops through them to download each file.

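library(MGnifyR)

# Client object passed to every call below; assumed here to come from mgnify_client()
mg <- mgnify_client()

# Retrieve up to 10 genome records from the API
genomelist <- mgnify_query(mg, qtype="genomes", maxhits=10)
# List the downloadable files for those genomes
all_downloads <- mgnify_get_download_urls(mg, genomelist$accession, accession_type="genomes")

# Keep only the entries labelled as nucleic acid sequences
vec <- grepl("Nucleic Acid Sequence", all_downloads$attributes.description.label)
fasta_files <- all_downloads[vec,"download_url"]

# Download each file in turn
for (f in fasta_files){
   mgnify_download(mg, f)
}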
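
The filtering step can be tried offline on a toy table. The sketch below is only an illustration: the rows, labels other than "Nucleic Acid Sequence", and example.org URLs are invented, and the real table returned by mgnify_get_download_urls has more columns.

```r
# Toy stand-in for the table returned by mgnify_get_download_urls();
# the rows and URLs below are invented for illustration only.
all_downloads <- data.frame(
  attributes.description.label = c(
    "Nucleic Acid Sequence",
    "Protein sequence",
    "Nucleic Acid Sequence"
  ),
  download_url = c(
    "https://example.org/genome_A.fna",
    "https://example.org/genome_A.faa",
    "https://example.org/genome_B.fna"
  ),
  stringsAsFactors = FALSE
)

# fixed = TRUE matches the label as a literal string rather than a regex
is_fasta <- grepl("Nucleic Acid Sequence",
                  all_downloads$attributes.description.label,
                  fixed = TRUE)

# Keep only the URLs of the rows that matched
fasta_files <- all_downloads[is_fasta, "download_url"]
print(fasta_files)
```

It is probably worth printing the distinct values of attributes.description.label on the real table first, to confirm which label marks the files you want.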

Bear in mind that doing this for lots of MAGs will be slow...
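
Because a long run may be interrupted, one option is to make the loop resumable so that files already on disk are skipped and a single failure does not stop the rest. The following is a minimal sketch, not part of MGnifyR: download_missing is a hypothetical helper, it uses base R's download.file instead of mgnify_download (so it bypasses any MGnifyR caching), and it assumes the last path component of each URL is a suitable file name.

```r
# Hypothetical helper (not an MGnifyR function): download each URL into
# dest_dir, skipping files that already exist and recording failures.
# `fetch` is any function(url, destfile); base R's download.file by default.
download_missing <- function(urls, dest_dir, fetch = utils::download.file) {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  status <- character(length(urls))
  for (i in seq_along(urls)) {
    # Assumption: the last path component of the URL is the file name
    dest <- file.path(dest_dir, basename(urls[i]))
    if (file.exists(dest)) {
      status[i] <- "skipped"
      next
    }
    status[i] <- tryCatch({
      fetch(urls[i], dest)
      "downloaded"
    }, error = function(e) {
      # Remove any partial file so a later run retries this URL
      if (file.exists(dest)) file.remove(dest)
      "failed"
    })
  }
  data.frame(url = urls, status = status, stringsAsFactors = FALSE)
}

# Offline demonstration with a fake fetch function and invented URLs
fake_fetch <- function(url, destfile) writeLines(url, destfile)
tmp <- file.path(tempdir(), "mags_demo")
unlink(tmp, recursive = TRUE)
urls <- c("https://example.org/a.fna", "https://example.org/b.fna")
first  <- download_missing(urls, tmp, fetch = fake_fetch)
second <- download_missing(urls, tmp, fetch = fake_fetch)
```

With real data, something like download_missing(fasta_files, "mags") could be re-run after an interruption and would only fetch what is still missing.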

Ben

yazhinia commented 2 years ago

Dear Ben, Thank you for the script.