durrantmm closed this 2 months ago
I wrote some code to retrieve this sort of data from ENA using their API. I just made it available here: https://github.com/apcamargo/retrieve-ena-metadata.
The metadata you'll get includes the tax_id field. The repository contains a notebook with an example at the end, showing how to parse a taxonomy ID to get the full lineage. Retrieving everything did take a few days to finish, though. The fastest solution is probably NCBI BigQuery.
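For readers who want a feel for the lineage step, here is a minimal sketch of the idea. It is not the code from the notebook. It assumes you already have a child-to-parent mapping of taxonomy IDs (for example, one built from an NCBI taxdump) and walks upward from a tax_id to the root. The function name `lineage` and the `PARENTS` and `NAMES` tables are hypothetical. The entries are a small hand-copied excerpt for illustration, so check them against a real taxonomy dump before relying on them.

```python
from typing import Dict, List

# Hypothetical toy excerpt of a child -> parent taxonomy table (illustration only).
PARENTS: Dict[int, int] = {
    1: 1,          # the root points to itself
    131567: 1,
    2: 131567,
    1224: 2,
    1236: 1224,
    91347: 1236,
    543: 91347,
    561: 543,
    562: 561,
}

# Hypothetical toy excerpt of tax_id -> scientific name (illustration only).
NAMES: Dict[int, str] = {
    1: "root",
    131567: "cellular organisms",
    2: "Bacteria",
    1224: "Pseudomonadota",
    1236: "Gammaproteobacteria",
    91347: "Enterobacterales",
    543: "Enterobacteriaceae",
    561: "Escherichia",
    562: "Escherichia coli",
}


def lineage(tax_id: int, parents: Dict[int, int], names: Dict[int, str]) -> List[str]:
    """Return the names from the root down to tax_id, or [] if tax_id is unknown."""
    path: List[str] = []
    seen = set()
    current = tax_id
    # Walk upward until we reach the root, an unknown ID, or a cycle.
    while current in parents and current not in seen:
        seen.add(current)
        # Fall back to the numeric ID when no name is available.
        path.append(names.get(current, str(current)))
        parent = parents[current]
        if parent == current:  # the root is its own parent
            break
        current = parent
    path.reverse()
    return path


if __name__ == "__main__":
    print(" > ".join(lineage(562, PARENTS, NAMES)))
```

Because the walk is a plain dictionary lookup per step, it can be applied to every row of a metadata table once the mapping is loaded, without any further network requests.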
Thanks! I decided to just go with Entrez and got all the information I needed.
Hello, I was hoping to map all Logan files to their species of origin so I could do filtered downloads. Do you have these data on hand already by chance? If not, is there a faster way than Entrez to access this data for all ~26 million files?