ncbi / datasets

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
https://www.ncbi.nlm.nih.gov/datasets

Accessing annotation from nuccore sequences #348

Closed: manulera closed this issue 7 months ago

manulera commented 7 months ago

Hello,

I was wondering whether there are plans to support accessing annotations on non-assembly sequences. I am thinking of the particular example of pombe, where the mitochondrial chromosome is currently not part of the assembly (more on this issue).

I guess this could be problematic, since the API probably relies on the features of annotated sequences having unique locus_tags, and nuccore sequences don't have these (e.g. MK618072.1, even though it is a mitochondrial chromosome).

olearyna commented 7 months ago

Hi manulera,

We are currently developing a new organelle data package, which will feature organelles that are not part of an assembled genome. Initially, our focus will be on RefSeq-annotated organelles, with plans to integrate GenBank organelles (such as MK618072.1) at a later stage.

Thanks, Nuala