EBI-Metagenomics / HoloFoodR

R interface for HoloFood resource
https://ebi-metagenomics.github.io/HoloFoodR/
Artistic License 2.0

Add support for ENA #26

Open TuomasBorman opened 6 months ago

TuomasBorman commented 6 months ago

(Meta)transcriptomics data is stored in the European Nucleotide Archive (ENA). However, ENA provides only the sequences, not a counts table (example).

Because there are no counts tables, the data is not directly usable for downstream analysis. It might still be useful to have a function that fetches all the files into a single directory for further processing. Consider adding a getENAFile function.
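A minimal sketch of what such a function could look like, assuming the ENA Portal API file report endpoint and its `fastq_ftp` field. The helper names (`ena_filereport_url`, `parse_ena_ftp_urls`, `get_ena_files_sketch`) and the `dest_dir` argument are hypothetical and not part of HoloFoodR; the `ftp://` prefix assumes the report lists paths without a scheme.

```r
# Hypothetical helpers; none of these are part of HoloFoodR.

# Build the ENA Portal API file report URL for a study or run accession.
ena_filereport_url <- function(accession,
                               fields = c("run_accession", "fastq_ftp")) {
  paste0(
    "https://www.ebi.ac.uk/ena/portal/api/filereport",
    "?accession=", utils::URLencode(accession, reserved = TRUE),
    "&result=read_run",
    "&fields=", paste(fields, collapse = ","),
    "&format=tsv"
  )
}

# Extract download URLs from a file report data.frame.
# Assumption: fastq_ftp holds semicolon-separated paths without a scheme.
parse_ena_ftp_urls <- function(report) {
  paths <- unlist(strsplit(as.character(report$fastq_ftp), ";", fixed = TRUE))
  paths <- paths[!is.na(paths) & nzchar(paths)]
  if (length(paths) == 0) {
    return(character(0))
  }
  paste0("ftp://", paths)
}

# Download every file listed for the accession into one directory.
get_ena_files_sketch <- function(accession, dest_dir = "ena_files") {
  report <- utils::read.delim(ena_filereport_url(accession),
                              stringsAsFactors = FALSE)
  urls <- parse_ena_ftp_urls(report)
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(dest_dir, basename(urls))
  for (i in seq_along(urls)) {
    # Skip files that were already downloaded in a previous call.
    if (!file.exists(dest[i])) {
      utils::download.file(urls[i], dest[i], mode = "wb")
    }
  }
  dest
}

# Usage (requires network access and may download large files):
# files <- get_ena_files_sketch("PRJEB66287", dest_dir = "PRJEB66287_raw")
```

Keeping the URL building and report parsing separate from the download loop lets them be checked without network access.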

SandyRogers commented 6 months ago

@TuomasBorman the code in the data portal that fetches the ENA file report (for display on the website only, not in the API) is in portal_api.py. It might be helpful for implementing such a function in HoloFoodR.

SandyRogers commented 6 months ago

Also, I think the metatranscriptomic raw datasets that are not yet analysed by MGnify (such as your example PRJEB66287) could be analysed and then counts tables retrieved from MGnify.

SandyRogers commented 6 months ago

In fact these are in progress, so once complete there will be MGnify analyses of these metaT samples.

TuomasBorman commented 6 months ago

OK, thanks @SandyRogers! Good to know.

As this package is meant for fetching data for downstream analysis, the best solution is to fetch the counts tables from MGnify rather than the raw datasets.

I will keep this issue open, and once metaT is available: