Open TuomasBorman opened 7 months ago
@TuomasBorman the bit of code in the data portal itself that fetches the ENA file report, for display on the website only (not in the API) is in: portal_api.py. This might be helpful for implementing such a function in HoloFoodR...
Also, I think the metatranscriptomic raw datasets that are not yet analysed by MGnify (such as your example PRJEB66287) could be analysed and then counts tables retrieved from MGnify.
In fact these are in progress, so once complete there will be MGnify analyses of these metaT samples.
OK, thanks @SandyRogers! Good to know.
As this package is for fetching data for downstream analysis, the best solution is to fetch the counts table from MGnify (rather than the raw datasets).
I will keep this issue open, and once metaT is available:
HoloFoodR: update examples and this warning https://github.com/EBI-Metagenomics/HoloFoodR/blob/396153064f5eabe6497ced36a5dfeb011f33775b/R/getResult.R#L116
The HoloFood data portal currently points to ENA; should it then also point to MGnify? --> check that HoloFoodR is updated accordingly
MGnifyR should work already without any changes.
(Meta)transcriptomics data is stored in the European Nucleotide Archive (ENA). However, ENA holds only the sequences, not a counts table (example).
As there are no counts tables, the data is not directly usable for downstream analysis. However, it might still be useful to have a function that fetches all the files into a single directory for further processing. --> Consider adding a getENAFile function.
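To make the getENAFile idea concrete, here is a rough sketch in Python (the language of the portal's portal_api.py), not a proposal for the actual HoloFoodR implementation. It assumes the ENA Portal API `filereport` endpoint with `result=read_run` and the `fastq_ftp` field, whose value is taken to be a semicolon-separated list of paths without a URL scheme. The function names `filereport_url`, `parse_file_urls` and `get_ena_files` are hypothetical.

```python
import csv
import io
import os
import urllib.parse
import urllib.request

# Assumed endpoint of the ENA Portal API file report.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "fastq_ftp")):
    """Build the file report URL for a study/run accession (TSV output)."""
    query = urllib.parse.urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_file_urls(tsv_text, column="fastq_ftp", scheme="ftp"):
    """Extract file URLs from a file report.

    Assumption: each cell holds zero or more paths separated by ';',
    without a scheme, so one is prepended here.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    urls = []
    for row in reader:
        for path in (row.get(column) or "").split(";"):
            path = path.strip()
            if path:
                urls.append(path if "://" in path else f"{scheme}://{path}")
    return urls


def get_ena_files(accession, dest_dir, scheme="ftp"):
    """Download every file listed in the report into a single directory."""
    os.makedirs(dest_dir, exist_ok=True)
    with urllib.request.urlopen(filereport_url(accession)) as resp:
        tsv_text = resp.read().decode("utf-8")
    saved = []
    for url in parse_file_urls(tsv_text, scheme=scheme):
        target = os.path.join(dest_dir, os.path.basename(url))
        urllib.request.urlretrieve(url, target)
        saved.append(target)
    return saved
```

Something like `get_ena_files("PRJEB66287", "ena_raw")` would then put all raw files for the example project in one folder. Splitting out the parsing step keeps the part that depends on the report layout testable without network access.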