EBI-Metagenomics / MGnifyR

R package for searching, downloading and analysis of EBI MGnify metagenomics data
https://ebi-metagenomics.github.io/MGnifyR/
Artistic License 2.0

Locate fastq data associated with MGnify analyses #8

Open s-meaden opened 2 years ago

s-meaden commented 2 years ago

Adding this as an issue in case anyone has a similar problem.

MGnifyR::mgnify_get_download_urls() doesn't return links to the read data (fastq) used to generate the assemblies/analyses, but these links would be useful in some scenarios.

A workaround is to run MGnifyR::mgnify_get_analyses_metadata() and then use the ENA accessions returned in the sample_biosample column.

These accessions can be used with the enaGroupGet command from the enaBrowserTools scripts, e.g.: enaGroupGet -f fastq SAMN04360062

Note that enaDataGet throws an error and fails to find the accession (for me at least).
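For more than a handful of samples, the workaround can be scripted. The following is a minimal sketch in Python, assuming the sample_biosample values have been exported from R into a plain list and that enaGroupGet is on the PATH. The helper names (unique_accessions, build_commands, download_all) are hypothetical, and only the `-f fastq` form of the command shown above is used.

```python
import shutil
import subprocess


def unique_accessions(values):
    """Drop missing/blank entries and duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        if value is None:
            continue
        accession = str(value).strip()
        # "NA" is how a missing value typically looks after export from R.
        if not accession or accession.upper() == "NA" or accession in seen:
            continue
        seen.add(accession)
        result.append(accession)
    return result


def build_commands(accessions):
    """Build one enaGroupGet argument list per unique accession."""
    return [["enaGroupGet", "-f", "fastq", acc]
            for acc in unique_accessions(accessions)]


def download_all(accessions, dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    commands = build_commands(accessions)
    if dry_run:
        for command in commands:
            print(" ".join(command))
        return commands
    if shutil.which("enaGroupGet") is None:
        raise RuntimeError("enaGroupGet not found; install enaBrowserTools first")
    for command in commands:
        # check=True stops at the first failed download.
        subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run only: prints the command without contacting ENA.
    download_all(["SAMN04360062"])
```

The dry-run default makes it easy to inspect the generated commands before starting any downloads; pass dry_run=False to actually run them.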

Thanks @beadyallen for the package!

TuomasBorman commented 2 months ago

TODO:

  1. Check whether raw sequences can be retrieved from MGnify (@SandyRogers, do you have a direct answer for this?)
  2. If they cannot be found, consider adding a method that gets these files from ENA.
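For item 2, one possible route is the ENA Portal API's filereport endpoint, which lists FASTQ locations for the runs linked to an accession. The sketch below is in Python for illustration only (an MGnifyR method would need an R equivalent). The function names are hypothetical, and whether the endpoint resolves every accession type found in sample_biosample is an assumption that has not been verified here.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a filereport query asking for the FASTQ locations of each run."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_fastq_links(tsv_text):
    """Map run accession -> list of FASTQ URLs from a filereport TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    links = {}
    for row in reader:
        # Paired-end runs list several files separated by ';'.
        paths = [p for p in (row.get("fastq_ftp") or "").split(";") if p]
        # Assumption: paths come back without a scheme, so prepend ftp://.
        links[row["run_accession"]] = [
            p if "://" in p else "ftp://" + p for p in paths
        ]
    return links


def fetch_fastq_links(accession, timeout=30):
    """Query ENA (network access required) and return the parsed links."""
    with urlopen(filereport_url(accession), timeout=timeout) as response:
        return parse_fastq_links(response.read().decode("utf-8"))
```

Keeping URL construction and TSV parsing separate from the network call means both can be checked offline before any request is sent to ENA.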