MaastrichtU-IDS / semanticscience

The Semanticscience Integrated Ontology (SIO) provides a simple, integrated ontology of types and relations for rich description of objects, processes and their attributes.
https://sio.semanticscience.org

Please use standard URI namespace prefixes #16

Closed jpmccu closed 8 years ago

jpmccu commented 8 years ago

OWL is set to use the "ns1" prefix, for instance.

ElisaKendall commented 8 years ago

Currently, the set of namespace prefixes defined in the header of the ontology includes:

xmlns:ns0="http://www.w3.org/2002/07/owl#"
xmlns:ns1="http://purl.org/dc/elements/1.1/"
xmlns:ns2="http://semanticscience.org/resource/"
xmlns:ns3="http://purl.org/vocab/vann/"
xmlns:ns4="http://purl.org/spar/cito/"
xmlns:ns5="http://protege.stanford.edu/plugins/owl/protege#"
xmlns:ns6="http://purl.org/dc/terms/"
xmlns:ns7="http://www.w3.org/2000/01/rdf-schema#"
xmlns:ns8="http://www.w3.org/2001/XMLSchema#"

ns0 should be "owl"
ns1 should be "dc"
ns2 should be "sio" (or whatever you want everyone to use)
ns3 should be "vann" (as stated on the web site for the ontology)
ns4 should be "cito"
ns5 should be removed -- the only thing it is used for is to state the default language of the ontology, for which you can use Dublin Core language, and eliminate the dependency on Protege
ns6 should be "dct" or "terms", but preferably "dct"
ns7 should be "rdfs"
ns8 should be "xsd"

Most of these are the standard prefixes for these vocabularies, and it's very confusing for new users to see something else.

micheldumontier commented 8 years ago

why does this matter? tools could render standard prefixes, wherever those are defined.

ElisaKendall commented 8 years ago

Not all tools understand this, and in cases where you have many ontologies in an imports chain, use of consistent prefixes is really the only way to ensure that everyone is talking about the same thing. Some tools don't allow for redefinition of prefixes -- it's fine if you are working in a triple store, not great if you're writing rules or working with OWL reasoners, etc., and really bad if you're working in UML (which I do on occasion).

ElisaKendall commented 8 years ago

In other words, the prefix needs to be the same for everything in an imports chain, and if sio is at the top and redefines all of the well-known prefixes, everyone else who imports it has to do the same, using your numeric prefixes rather than those that they know - at least in some tools.

ansell commented 8 years ago

Prefixes are not formally defined in all syntaxes though, so if anywhere in the chain uses a syntax where the prefix disappears (N-Triples/N-Quads for instance) you are doomed anyway.

Better to fix your toolchain to source prefixes from prefix.cc (or similar) and override any that are in the documents that you find.
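As an illustration of the override approach suggested here, the following is a minimal Python sketch, not anything from the SIO repository. The `PREFERRED` dictionary is a hypothetical local stand-in for a lookup against prefix.cc or a similar registry, and `prefix_renames` is an illustrative helper name. It only computes which declared prefixes would be renamed; rewriting the qualified names in the rest of a document is left out.

```python
import re

# Hypothetical local registry (namespace URI -> preferred prefix).
# A real toolchain might fetch this from prefix.cc or a similar service.
PREFERRED = {
    "http://www.w3.org/2002/07/owl#": "owl",
    "http://purl.org/dc/elements/1.1/": "dc",
    "http://semanticscience.org/resource/": "sio",
    "http://purl.org/vocab/vann/": "vann",
    "http://purl.org/spar/cito/": "cito",
    "http://purl.org/dc/terms/": "dct",
    "http://www.w3.org/2000/01/rdf-schema#": "rdfs",
    "http://www.w3.org/2001/XMLSchema#": "xsd",
}

# Matches declarations such as xmlns:ns0="http://www.w3.org/2002/07/owl#"
XMLNS = re.compile(r'xmlns:(\w+)="([^"]+)"')


def prefix_renames(header: str) -> dict:
    """Map each prefix declared in the header to its preferred replacement."""
    renames = {}
    for old, uri in XMLNS.findall(header):
        # Namespaces missing from the registry keep their declared prefix.
        renames[old] = PREFERRED.get(uri, old)
    return renames


header = (
    'xmlns:ns0="http://www.w3.org/2002/07/owl#" '
    'xmlns:ns5="http://protege.stanford.edu/plugins/owl/protege#" '
    'xmlns:ns7="http://www.w3.org/2000/01/rdf-schema#"'
)
renames = prefix_renames(header)
print(renames)  # {'ns0': 'owl', 'ns5': 'ns5', 'ns7': 'rdfs'}
```

Keying the registry on the namespace URI rather than on the prefix means the lookup works no matter which prefix a given document happened to declare.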

micheldumontier commented 8 years ago

I don't agree. the prefix is simply syntactic sugar for URI expansion. any tool worth its salt should be able to merge documents regardless of the prefix used. there's no specification for this anyway. sorry!
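The "syntactic sugar" point can be shown with a small Python sketch. The `expand` helper and the two prefix maps are illustrative assumptions, not part of any RDF library: two documents that bind different prefixes to the same namespace yield the same full URI once the prefixed names are expanded.

```python
def expand(curie: str, prefixes: dict) -> str:
    """Expand 'prefix:local' into a full URI using the given prefix map."""
    prefix, local = curie.split(":", 1)
    return prefixes[prefix] + local


OWL = "http://www.w3.org/2002/07/owl#"

doc_a = {"ns0": OWL}  # numeric prefix, as in the header quoted in this thread
doc_b = {"owl": OWL}  # conventional prefix

a = expand("ns0:Class", doc_a)
b = expand("owl:Class", doc_b)
print(a == b)  # True
```

A tool that compares expanded URIs treats both names as the same term. The concern raised in the thread applies to tools and readers that work with the prefixed form directly.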

micheldumontier commented 8 years ago

in any case, i think that Jim is adding prefix support to the export script ;)

ElisaKendall commented 8 years ago

We would appreciate it -- many of the people we train are not only not geeky, but are subject matter experts without the kind of programming ability that would permit them to do some of these things. And I understand that it's "just a serialization" feature, but you would be surprised at how often we need to really dig into the text files that contain the ontologies to figure out who did what and where something came from.

If you always and only work in a graph database or triple store, then fine, but that's rarely the community that we work with - physicians, bankers, other kinds of analysts who are very good in their respective fields but not at understanding this stuff. And the focus for us is on the ontology development specifically, with other folks downstream loading them in various tools. There are far fewer ontology tools that support the kind of provenance and other annotations that we need to incorporate, and the folks we're talking about are often metadata wonks. When your metadata, including prefixes, is not consistent, they get really annoyed, even if it's strictly a "serialization" problem. So, our audience is entirely different from what you may be used to.

dlmcguinness commented 8 years ago

I think it is important not to count on all tools using prefix.cc (or any prefix registry), or on being able to fix the tool chain in a timely manner, if ever. Some tools, like Protege, are fairly heavily used and I expect will not be using prefix.cc. Depending on a registry is also unsafe for other reasons: no registry is likely to be always up to date, and none is likely to contain information about ontologies behind firewalls, even when those ontologies are very well used within some group, such as a government group.

We are looking at these issues because we want to reuse sio in a broader community that uses a wide range of tools, many of which are not under the community's control. Standard prefixes would help a lot in making sio more reusable by people who are not CS trained, are not hackers, or have no time or ability to request tool chain fixes. We are also looking at this because we teach best practices in ontology engineering, and providing ontologies that are easy for a wide range of users to read is one of the things we teach.

micheldumontier commented 8 years ago

committed. 08cadb0
in next release