Closed cbizon closed 2 years ago
I know of a couple of variant standards, one of which is HGVS. Another is SPDI: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7523648/, but maybe you are asking for a service that resolves any HGVS nomenclature onto a genome browser? I don't know of a service like that.
For the ID prefixes question: do you have HGVS names (ids) from TAIR? I found this service: https://uswest.ensembl.org/info/docs/tools/vep/recoder/index.html that converts between HGVS and SPDI, etc. And the resources that generate identifiers for variants: https://uswest.ensembl.org/info/genome/variation/species/sources_documentation.html
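For the simplest case, the HGVS-to-SPDI conversion can be sketched by hand. This is only a rough illustration, not a replacement for the Variant Recoder: it handles single-base genomic substitutions (`g.` with `A>C`-style changes) and nothing else, and the function name `hgvs_substitution_to_spdi` is made up for this example. The one real subtlety is that HGVS positions are 1-based while SPDI positions are 0-based.

```python
import re

# Matches only simple genomic substitutions such as NC_000070.7:g.101672390A>C.
# Indels, duplications, and c./p. coordinates are deliberately out of scope.
_HGVS_SUBSTITUTION = re.compile(
    r"^(?P<seq>[^:]+):g\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def hgvs_substitution_to_spdi(hgvs: str) -> str:
    """Convert a simple HGVS g. substitution into an SPDI string (hypothetical helper)."""
    match = _HGVS_SUBSTITUTION.match(hgvs)
    if match is None:
        raise ValueError(f"not a simple g. substitution: {hgvs!r}")
    # HGVS counts bases from 1; SPDI uses 0-based positions.
    position = int(match["pos"]) - 1
    return f"{match['seq']}:{position}:{match['ref']}:{match['alt']}"


print(hgvs_substitution_to_spdi("NC_000070.7:g.101672390A>C"))
```

Anything more complex than a substitution needs normalization against the reference sequence, which is what the services above are for.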
Chris M pointed out this resource: https://mutalyzer.nl/ for verifying syntax of variant nomenclature.
The Alliance of Genome Resources and marrvel.org both allow searching by HGVS, but unfortunately don't support plant species. https://www.alliancegenome.org/search?q=NC_000070.7%3Ag.101672390A%3EC
If TAIR or another plant database does curate variants, and has their own prefix for their resource, we can certainly add that prefix to the sequence variant class.
As far as I can tell (?) most of the plant dbs use identifiers that are probably pretty similar or transformable to SPDI. So I think we could make that work. Any thoughts on how we turn a SPDI into an identifier? Something like "SPDI:NG_012345.1:4:G:T"?
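A minimal sketch of what that could look like, assuming an unregistered `SPDI` prefix as floated above. The `Spdi` type and both function names are hypothetical. The point it illustrates is that the local part of the identifier itself contains colons, so the prefix has to be split off at the first colon only.

```python
from typing import NamedTuple


class Spdi(NamedTuple):
    """The four SPDI components: sequence, position, deletion, insertion."""
    sequence: str
    position: int
    deletion: str
    insertion: str


def spdi_to_curie(spdi: Spdi, prefix: str = "SPDI") -> str:
    # The prefix is an assumption; nothing is registered under this name yet.
    return f"{prefix}:{spdi.sequence}:{spdi.position}:{spdi.deletion}:{spdi.insertion}"


def curie_to_spdi(curie: str, prefix: str = "SPDI") -> Spdi:
    # Split on the first colon only, because the local ID contains colons too.
    head, sep, local_id = curie.partition(":")
    if not sep or head != prefix:
        raise ValueError(f"expected prefix {prefix!r} in {curie!r}")
    parts = local_id.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 SPDI fields, got {len(parts)} in {curie!r}")
    sequence, position, deletion, insertion = parts
    return Spdi(sequence, int(position), deletion, insertion)


curie = spdi_to_curie(Spdi("NG_012345.1", 4, "G", "T"))
print(curie)
print(curie_to_spdi(curie))
```

Tools that split CURIEs on every colon rather than the first would mangle these, which is worth checking before settling on the format.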
I'm not sure. Since I don't know of a service that will resolve an arbitrary identifier in SPDI format, or in HGVS format, probably we can't make a curie for it? If NCBI resolved them, then I could see something like ncbi.spdi:NG_012345.1:4:G:T, or if alliancegenome.org resolved them, then agrkb:NG_012345.1:4:G:T?
Shall we float an issue on the bioregistry tracker? It may be deemed out of scope, but it would be good to get feedback from others there.
It's preferable that the identifier be resolvable, but is it strictly necessary?
No strong opinion but I do think it’s important that the prefix be registered somehow
OK, the issue is made at bioregistry (see link above); some discussion will likely happen there. I could add spdi as a prefix with its URL pointing to the SPDI API here: https://api.ncbi.nlm.nih.gov/variation/v0/ to biolink-model directly for now. Will that fix your use case, @cbizon? The IDs won't really resolve, except to give a more structured JSON response to a query by SPDI ID.
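To make the "won't really resolve" caveat concrete, here is a rough sketch of plain prefix expansion with that mapping. The `PREFIX_MAP` dictionary and `expand_curie` function are hypothetical, not biolink-model or bioregistry code, and the expanded URL is just the base plus the local ID; whether the API answers at that exact path is an assumption this sketch does not test.

```python
from urllib.parse import quote

# Hypothetical prefix map; the base URL is the one proposed in this thread.
PREFIX_MAP = {"spdi": "https://api.ncbi.nlm.nih.gov/variation/v0/"}


def expand_curie(curie: str, prefix_map: dict = PREFIX_MAP) -> str:
    """Expand prefix:local_id by simple concatenation (assumed resolution scheme)."""
    # Split on the first colon only; SPDI local IDs contain colons themselves.
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    # Keep ':' unescaped so the SPDI stays readable in the URL.
    return prefix_map[prefix] + quote(local_id, safe=":")


print(expand_curie("spdi:NG_012345.1:4:G:T"))
```

If the API expects an extra path segment between the base and the SPDI, the URL prefix in the map would need to include it.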
Yes, I think that will meet our current use case, thanks!
Is your feature request related to a problem? Please describe. We want to ingest sequence variants for various plants like Arabidopsis, but we don't know how to represent them. None of the id_prefixes for sequence variant are appropriate.
Describe the solution you'd like I'm not certain... Is there a standard way to write HGVS-like identifiers?
What working group (or team) did this request originate from? ROBOKOP
Tag relevant members for discussion @shalsh23