EBI-Metagenomics / MGnifyR

R package for searching, downloading and analysis of EBI MGnify metagenomics data
Artistic License 2.0

Can we download .fna files of all MAGs from species catalogues through MGnifyR? #4

Closed by yazhinia 2 years ago

yazhinia commented 2 years ago

Hi, I would like to know whether there is a function to download the .fna files of all MAGs from the species catalogues. I could not find clear instructions for this in MGnifyR. Can you suggest a way to download those files?

Thanks very much in advance.

beadyallen commented 2 years ago

Hi @yazhinia. Unfortunately, the current implementation of MGnifyR doesn't include much functionality for MAG genomes. However, you can still combine mgnify_query, mgnify_get_download_urls, and mgnify_download. For example, the following retrieves the first few genomes from the API (up to 10 here), finds the FASTA URLs, and then loops through them to download each file.

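library(MGnifyR)
# Assumption: `mg` is an MGnifyR client object, created here with mgnify_client()
mg <- mgnify_client()

# Query the genomes endpoint, capped at 10 results
genomelist <- mgnify_query(mg, qtype="genomes", maxhits=10)
# List the downloadable files for those genome accessions
all_downloads <- mgnify_get_download_urls(mg, genomelist$accession, accession_type="genomes")

# Keep only the entries labelled as nucleic acid sequences
vec <- grepl("Nucleic Acid Sequence", all_downloads$attributes.description.label)
fasta_files <- all_downloads[vec, "download_url"]

# Download each file in turn
for (f in fasta_files) {
  mgnify_download(mg, f)
}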

Bear in mind that doing this for lots of MAGs will be slow, since the loop fetches the files one at a time.
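Because a large batch takes a long time, it can help to make reruns skip files that are already on disk. The following is only a minimal sketch in base R, not part of MGnifyR: `pending_downloads` and `dest_dir` are hypothetical names, the URLs are made up, and it assumes the last path component of each URL is a usable file name.

```r
# Hypothetical helper (not part of MGnifyR): return only the URLs whose
# target file is not yet present in dest_dir.
pending_downloads <- function(urls, dest_dir) {
  # Assumption: the last path component of each URL is a usable file name.
  targets <- file.path(dest_dir, basename(urls))
  urls[!file.exists(targets)]
}

# Example with made-up URLs and a temporary directory.
dest_dir <- tempfile("mags_")
dir.create(dest_dir)
urls <- c("https://example.org/genomes/A.fna",
          "https://example.org/genomes/B.fna")

# Pretend A.fna was fetched on an earlier run.
invisible(file.create(file.path(dest_dir, "A.fna")))

# Only the URL for B.fna is left to fetch.
todo <- pending_downloads(urls, dest_dir)
```

The URLs left in `todo` could then be fetched in a loop, for instance with base R's `download.file()`, writing each one to `file.path(dest_dir, basename(url))` so that the next run skips it.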


yazhinia commented 2 years ago

Dear Ben, Thank you for the script.