Closed cthoyt closed 2 years ago
@tgbugs could you help us sort this issue out?
Could you add a half-sentence to each of these to explain what they are? Are they all necessary?
NIFEXT: http://uri.neuinfo.org/nif/nifstd/nifext_ .
NIFSTD: http://uri.neuinfo.org/nif/nifstd/ .
NLX: http://uri.neuinfo.org/nif/nifstd/nlx_ .
NLXANAT: http://uri.neuinfo.org/nif/nifstd/nlx_anat_ .
NLXBR: http://uri.neuinfo.org/nif/nifstd/nlx_br_ .
NLXCELL: http://uri.neuinfo.org/nif/nifstd/nlx_cell_ .
NLXCHEM: http://uri.neuinfo.org/nif/nifstd/nlx_chem_ .
NLXDYS: http://uri.neuinfo.org/nif/nifstd/nlx_dys_ .
NLXFUNC: http://uri.neuinfo.org/nif/nifstd/nlx_func_ .
NLXINV: http://uri.neuinfo.org/nif/nifstd/nlx_inv_ .
NLXMOL: http://uri.neuinfo.org/nif/nifstd/nlx_mol_ .
NLXOEN: http://uri.neuinfo.org/nif/nifstd/oen_ .
NLXORG: http://uri.neuinfo.org/nif/nifstd/nlx_organ_ .
NLXQUAL: http://uri.neuinfo.org/nif/nifstd/nlx_qual_ .
NLXRES: http://uri.neuinfo.org/nif/nifstd/nlx_res_ .
NLXSUB: http://uri.neuinfo.org/nif/nifstd/nlx_subcell_ .
This is necessary to organise our cross references on Uberon and standardise them across the community :)
Thank you!
Is @tgbugs the contact person for NIF? What's the difference between NIFEXT and NIFSTD? Do we need one for NLX or are all terms covered by the sub-terminologies?
@tgbugs (Tom) is contributing to a lot of different ontology projects and has been on the Uberon tracker often as well; I think he is the person to talk to about the NIF ontologies.
Yes, I'm the best point of contact for the NIF-Ontology. @smtifahim is also back with us and was around for the early days of the NIF-Ontology and neurolex.
NIFSTD is the top level, kind of like OBO is for obo foundry ontologies. NIFEXT covers a subset of IRIs that were "external" identifiers that were brought into the ontology at some point in time. This was done before most of the current standard ontology and identifier management practices had been developed.
With regard to the other prefixes: in the early days of the NIF-Ontology there were separate files for major entity categories, and their identifier prefixes were usually matched to the file. This carried over to the early days of neurolex, where individual entities were given type-specific identifiers; this covers all the NLX* style identifiers. At a certain point neurolex switched to using a single identifier sequence which did not differentiate between types; that is NLX.
These are all needed but we are not minting new identifiers in these namespaces. There usually aren't that many identifiers for any given NLX* prefix, but their curied form may have been referenced by someone without retaining the expansion rule, so having them for the record is important.
NIFEXT: http://uri.neuinfo.org/nif/nifstd/nifext_ . -> external
NIFSTD: http://uri.neuinfo.org/nif/nifstd/ . -> base (like obo:)
NLX: http://uri.neuinfo.org/nif/nifstd/nlx_ . -> generic neurolex, covers all types
NLXANAT: http://uri.neuinfo.org/nif/nifstd/nlx_anat_ . -> anatomy terms
NLXBR: http://uri.neuinfo.org/nif/nifstd/nlx_br_ . -> brain regions
NLXCELL: http://uri.neuinfo.org/nif/nifstd/nlx_cell_ . -> cell types
NLXCHEM: http://uri.neuinfo.org/nif/nifstd/nlx_chem_ . -> chemicals
NLXDYS: http://uri.neuinfo.org/nif/nifstd/nlx_dys_ . -> dysfunction
NLXFUNC: http://uri.neuinfo.org/nif/nifstd/nlx_func_ . -> cognitive function
NLXINV: http://uri.neuinfo.org/nif/nifstd/nlx_inv_ . -> investigations
NLXMOL: http://uri.neuinfo.org/nif/nifstd/nlx_mol_ . -> molecules
NLXOEN: http://uri.neuinfo.org/nif/nifstd/oen_ . -> the version of oen terms in neurolex
NLXORG: http://uri.neuinfo.org/nif/nifstd/nlx_organ_ . -> organ terms
NLXQUAL: http://uri.neuinfo.org/nif/nifstd/nlx_qual_ . -> qualities
NLXRES: http://uri.neuinfo.org/nif/nifstd/nlx_res_ . -> digital resources
NLXSUB: http://uri.neuinfo.org/nif/nifstd/nlx_subcell_ . -> subcellular entities e.g. GOCC
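To illustrate why keeping the expansion rules on record matters, here is a minimal sketch of expanding a curied identifier back to its IRI. It uses only the prefix-to-IRI pairs from the list above; the helper name `expand` is an assumption made for illustration and is not part of Bioregistry or the NIF-Ontology tooling.

```python
# Prefix map copied from the list above.
PREFIX_MAP = {
    "NIFEXT": "http://uri.neuinfo.org/nif/nifstd/nifext_",
    "NIFSTD": "http://uri.neuinfo.org/nif/nifstd/",
    "NLX": "http://uri.neuinfo.org/nif/nifstd/nlx_",
    "NLXANAT": "http://uri.neuinfo.org/nif/nifstd/nlx_anat_",
    "NLXBR": "http://uri.neuinfo.org/nif/nifstd/nlx_br_",
    "NLXCELL": "http://uri.neuinfo.org/nif/nifstd/nlx_cell_",
    "NLXCHEM": "http://uri.neuinfo.org/nif/nifstd/nlx_chem_",
    "NLXDYS": "http://uri.neuinfo.org/nif/nifstd/nlx_dys_",
    "NLXFUNC": "http://uri.neuinfo.org/nif/nifstd/nlx_func_",
    "NLXINV": "http://uri.neuinfo.org/nif/nifstd/nlx_inv_",
    "NLXMOL": "http://uri.neuinfo.org/nif/nifstd/nlx_mol_",
    "NLXOEN": "http://uri.neuinfo.org/nif/nifstd/oen_",
    "NLXORG": "http://uri.neuinfo.org/nif/nifstd/nlx_organ_",
    "NLXQUAL": "http://uri.neuinfo.org/nif/nifstd/nlx_qual_",
    "NLXRES": "http://uri.neuinfo.org/nif/nifstd/nlx_res_",
    "NLXSUB": "http://uri.neuinfo.org/nif/nifstd/nlx_subcell_",
}


def expand(curie: str) -> str:
    """Expand a CURIE such as 'NIFSTD:FMA_83604' into a full IRI.

    Hypothetical helper: raises ValueError when the CURIE has no colon
    or uses a prefix that is missing from PREFIX_MAP.
    """
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in PREFIX_MAP:
        raise ValueError(f"cannot expand {curie!r}: unknown or missing prefix")
    return PREFIX_MAP[prefix] + local_id


print(expand("NIFSTD:FMA_83604"))
# http://uri.neuinfo.org/nif/nifstd/FMA_83604
```

Without the map, a string like `NLXOEN:...` gives no hint that its IRI fragment starts with `oen_` rather than `nlx_oen_`, which is the kind of detail that gets lost when only the curied form is kept.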
@cthoyt what's the best course of action? Given we also have an entry for obo, shouldn't there be one for nifstd as well? https://bioregistry.io/registry/obo
In any case how do we add all these prefixes, do you have a form that takes a table for bulk submission?
No, the obo prefix in bioregistry is a mistake and I keep forgetting to remove it.
Bulk contribution guidelines: https://github.com/biopragmatics/bioregistry/blob/main/docs/CONTRIBUTING.md#bulk-contribution
I started preparing this, it's all I can do now:
https://docs.google.com/spreadsheets/d/10MPt-H6My33mOa1V_VkLh4YG8609N7B_Dey0CBnfTL4/edit?usp=sharing
@tgbugs I attributed it all to you, would you mind filling in the red cells?
@matentzn @tgbugs in the meantime I implemented the code necessary to suck this google sheet up in #407. Would appreciate an update on this - I want to get these in ASAP
@matentzn @tgbugs I just browsed through https://raw.githubusercontent.com/SciCrunch/NIF-Ontology/master/ttl/generated/NIFSTD-ILX-mapping.ttl to fill out all of the example identifiers and guessed what the patterns should be, but I can't be sure that this file is a complete picture of these vocabularies so input would be appreciated.
The final 3 action items:
NIFSTD, which I guess will have a lot more variety)
With regard to 2, there are indeed quite a few other namespaces that live under NIFSTD, to give only one example BIRNLEX. I'm fairly certain that this is the full list but there might be a lurker or two.
If it's just a namespace that contains other namespaces, we can skip it for now.
Unfortunately there are indeed some lurkers, e.g. NIFSTD:FMA_83604 and NIFSTD:OBI_0000470, and it does appear in other resources as well, e.g. uberon xrefs. Those would be terms that are not otherwise differentiated but that are managed inside of the NIF Ontology; it is sort of a fall through.
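The fall-through behaviour can be sketched as a longest-prefix match when compressing an IRI back into a CURIE: every namespace shares the NIFSTD base, so the most specific IRI prefix has to win and NIFSTD only catches what is left. This is a rough illustration using a subset of the prefixes listed earlier; the function name `compress` and the local identifier `0000` in the usage lines are assumptions, not real tooling or real terms.

```python
from typing import Optional

# Subset of the prefix map from this thread, enough to show the overlap.
PREFIX_MAP = {
    "NIFSTD": "http://uri.neuinfo.org/nif/nifstd/",
    "NIFEXT": "http://uri.neuinfo.org/nif/nifstd/nifext_",
    "NLX": "http://uri.neuinfo.org/nif/nifstd/nlx_",
    "NLXANAT": "http://uri.neuinfo.org/nif/nifstd/nlx_anat_",
}


def compress(iri: str) -> Optional[str]:
    """Turn an IRI into a CURIE using the longest matching IRI prefix.

    Hypothetical helper: returns None when no prefix in PREFIX_MAP matches.
    """
    best_prefix = None
    for prefix, base in PREFIX_MAP.items():
        if not iri.startswith(base):
            continue
        # Prefer the most specific (longest) base, so nlx_anat_ beats nlx_,
        # and both beat the bare NIFSTD base.
        if best_prefix is None or len(base) > len(PREFIX_MAP[best_prefix]):
            best_prefix = prefix
    if best_prefix is None:
        return None
    return f"{best_prefix}:{iri[len(PREFIX_MAP[best_prefix]):]}"


# The FMA-style lurker matches nothing more specific and falls through to NIFSTD.
print(compress("http://uri.neuinfo.org/nif/nifstd/FMA_83604"))
# NIFSTD:FMA_83604

# A made-up nlx_anat_ identifier resolves to NLXANAT rather than NLX or NIFSTD.
print(compress("http://uri.neuinfo.org/nif/nifstd/nlx_anat_0000"))
# NLXANAT:0000
```

Matching on the shortest or first prefix instead would turn every identifier into a NIFSTD CURIE, which is why the specific namespaces still need their own entries.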
With regard to 1, most of the namespaces preceded their community ontology counterparts; for example, SAO was the original source for many of the GOCC terms. In other cases it is as you describe, where an external id was pulled into neurolex and its fragment was retained as is.
@tgbugs thanks a ton for weighing in on the prefix metadata! Looks great. Can I ask one additional favour from you: would you be able to rewrite the description as a full English sentence like: "The X namespace covers entities of type X and Y, and is used for Z."
I think this would really help users to understand more quickly what's up with entities in these domains! Would you be willing to do that?
@matentzn updated. Let me know if they look ok.
Thank you @tgbugs!
NIF has tons of prefixes listed in here: https://raw.githubusercontent.com/SciCrunch/NIF-Ontology/master/ttl/generated/NIFSTD-ILX-mapping.ttl. Let's try and get them all into Bioregistry so people can better understand this chaos (see also https://www.youtube.com/watch?v=3tM0Sow-2r8)
Originally posted by @matentzn in https://github.com/biopragmatics/bioregistry/issues/402#issuecomment-1141117750