draeger-lab / ModelPolisher

ModelPolisher accesses the BiGG Models knowledgebase to annotate SBML models.
MIT License

Handling ncbigi annotations #86

Closed mephenor closed 2 years ago

mephenor commented 4 years ago

ncbigi annotations are no longer supported by identifiers.org. As far as I understand, the direct URL https://www.ncbi.nlm.nih.gov/protein/ can still be used to resolve them. Should this be implemented?

draeger commented 4 years ago

Is there a clear reason why the identifiers.org team no longer supports this kind of ID? I'd suggest contacting them and asking. As a general remark, annotation in SBML is not restricted to identifiers.org references. Any valid URL can be attached to a qualifier within a controlled-vocabulary term. The only advantage of identifiers.org is that its IDs follow a common structure and that the maintainers guarantee stable "resolvability."

mephenor commented 4 years ago

I haven't asked the identifiers team yet, as the apparent reason is described in one of the posts at https://ncbiinsights.ncbi.nlm.nih.gov/tag/gi/, where it is mentioned that

[...] more and more new sequence records will not be assigned a GI number, and so will never be retrievable using GI methods. But records that currently have a GI will always have that GI.

and accession.version should be used instead. They also link to the original announcement regarding this change https://www.ncbi.nlm.nih.gov/books/NBK431010/#news_03-02-2016-phase-out-of-GI-numbers .

mephenor commented 4 years ago

For now I've kept the ncbigi URLs as they are, since they can still be resolved, even though they are no longer supported and the pattern cannot be validated.

draeger commented 4 years ago

Or would it be possible to obtain the corresponding accession.version entry for a given GI? If so, we could use that...
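One possible route for such a lookup is the NCBI E-utilities `efetch` endpoint, which can return an accession.version when called with `rettype=acc`. The sketch below is only an illustration under assumptions: the function names are hypothetical, it is written in Python rather than ModelPolisher's Java, and whether NCBI keeps answering GI-based queries for a given record would need to be checked.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def accession_lookup_url(gi, db="protein"):
    """Build the efetch URL that asks for the accession.version of a GI.

    Hypothetical helper; 'protein' is assumed because the thread refers
    to https://www.ncbi.nlm.nih.gov/protein/.
    """
    query = urlencode(
        {"db": db, "id": str(gi), "rettype": "acc", "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_accession(gi, db="protein", timeout=10):
    """Query NCBI and return the response body, stripped of whitespace.

    Requires network access. Error handling and NCBI rate limits are
    deliberately left out of this sketch.
    """
    with urlopen(accession_lookup_url(gi, db), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Only `accession_lookup_url` can be checked offline; `fetch_accession` depends on the live service and on the GI still being retrievable.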

matthiaskoenig commented 4 years ago

The important thing for identifiers.org links is that they match the regular expression patterns defined in the MIRIAM registry. As long as this is the case, they are valid. If this is not possible, it is from my perspective perfectly fine to use direct URLs (they have the caveat of possibly not being resolvable at some point, but that is much better than putting an invalid identifiers.org identifier for a resource).
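The rule described above could be sketched roughly as follows: emit an identifiers.org URL only when the registry provides a pattern and the identifier matches it, otherwise fall back to the direct URL. This is an illustration in Python, not ModelPolisher's actual code. The function name, the `example` namespace, its pattern, and the `https://identifiers.org/<namespace>/<id>` URL layout are all assumptions.

```python
import re

# Direct NCBI protein URL prefix mentioned earlier in this thread.
NCBI_PROTEIN_PREFIX = "https://www.ncbi.nlm.nih.gov/protein/"


def annotation_url(namespace, identifier, pattern, direct_prefix):
    """Return an identifiers.org URL if the identifier is valid, else a direct URL.

    `pattern` is the registry's regular expression for the namespace, or
    None when the namespace is no longer registered (as with ncbigi).
    The identifiers.org URL layout used here is an assumption.
    """
    if pattern is not None and re.fullmatch(pattern, identifier):
        return f"https://identifiers.org/{namespace}/{identifier}"
    # No usable pattern, or no match: avoid emitting an invalid
    # identifiers.org reference and use the direct URL instead.
    return direct_prefix + identifier


# ncbigi has no registry pattern any more, so the direct URL is used.
print(annotation_url("ncbigi", "16130152", None, NCBI_PROTEIN_PREFIX))
```

The fallback deliberately prefers a possibly short-lived direct URL over an identifiers.org URL that would fail pattern validation.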