ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal
BSD 2-Clause "Simplified" License

versionIRI not present for many ontologies #168

Open cmungall opened 4 years ago

cmungall commented 4 years ago

For #167 we need versionIRI to be tracked by BioPortal

What is the procedure for generating the value for the 'version' column on an ontology's front page? I am inferring:

This is not great for OBO. We regard the canonical version of all ontologies as being the main owl purl. Also we do not require versionInfo, but we do require versionIRI.

Perhaps we should use versionInfo more widely but in the interim it would be good if BP used versionIRI

See http://obofoundry.org/principles/fp-003-uris.html

balhoff commented 4 years ago

Perhaps we should use versionInfo more widely but in the interim it would be good if BP used versionIRI

Either way I think BioPortal should use the version IRI as the version for the ontology. This is standard for OWL 2 and also is (hopefully) dereferenceable to directly obtain that version. owl:versionInfo is from OWL 1 and is described in the OWL 2 spec as an "annotation property [that] can be used to provide an IRI with a string that describes the IRI's version."

jonquet commented 4 years ago

BioPortal's current metadata model has no field for the versionIRI. It has a field for version information (omvversion) but nothing for versionIRI.

jvendetti commented 4 years ago

Here is information about how BioPortal determines what is shown in the "Version" column for any given ontology:

1). For OBO ontologies, we check the "data-version" document header tag:

https://github.com/ncbo/owlapi_wrapper/blob/1d2ab90a9a6db0540d6756d1ebae2ca7a304a4f2/src/main/java/org/stanford/ncbo/oapiwrapper/OntologyParser.java#L74

2). For OWL ontologies, we check owl:versionInfo:

https://github.com/ncbo/owlapi_wrapper/blob/1d2ab90a9a6db0540d6756d1ebae2ca7a304a4f2/src/main/java/org/stanford/ncbo/oapiwrapper/OntologyParser.java#L165

If neither of those exists, the version is considered unknown.

The code for determining versions was likely developed during the OWL 1 timeframe.
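
The lookup described above can be sketched roughly as follows. This is a minimal Python illustration, not the owlapi_wrapper code (which is Java and built on the OWL API). The function names and the two sample documents are hypothetical, and the sketch assumes the OWL file is serialized as RDF/XML with owl:versionInfo sitting directly under owl:Ontology.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

OWL_NS = "http://www.w3.org/2002/07/owl#"

# Hypothetical sample inputs, for illustration only.
OBO_SAMPLE = (
    "format-version: 1.2\n"
    "data-version: releases/2020-05-08\n"
    "ontology: pato\n"
    "\n"
    "[Term]\n"
    "id: PATO:0000001\n"
)

OWL_SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about="http://www.example.com/my">
    <owl:versionInfo>2.0</owl:versionInfo>
  </owl:Ontology>
</rdf:RDF>"""


def obo_data_version(obo_text: str) -> Optional[str]:
    """Return the value of the 'data-version' header tag, or None."""
    for line in obo_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # The first stanza marks the end of the document header.
            break
        match = re.match(r"data-version:\s*(.+)$", line)
        if match:
            return match.group(1).strip()
    return None


def owl_version_info(rdfxml_text: str) -> Optional[str]:
    """Return owl:versionInfo from the owl:Ontology element, or None."""
    root = ET.fromstring(rdfxml_text)
    ontology = root.find(f"{{{OWL_NS}}}Ontology")
    if ontology is None:
        return None
    info = ontology.find(f"{{{OWL_NS}}}versionInfo")
    if info is None or not (info.text or "").strip():
        return None
    return info.text.strip()


def resolve_version(text: str, fmt: str) -> Optional[str]:
    """fmt is 'obo' or 'owl'. None means the version is unknown."""
    if fmt == "obo":
        return obo_data_version(text)
    if fmt == "owl":
        return owl_version_info(text)
    return None
```

Note that owl:versionIRI is never consulted in this flow, which is why an OWL ontology that only declares a version IRI ends up with an unknown version.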

cmungall commented 4 years ago

Thanks, that is what I assumed.

How hard would it be to adapt the BP datamodel to the OWL2 standard? Or perhaps it can just be treated as an ontology annotation as @balhoff suggests?

jvendetti commented 4 years ago

I think modifying BioPortal is doable.

I'm unclear though on what the display of a version number should look like if we move to using the Version IRI. Would the end user want to see the entire IRI in the Version column in our Submissions table?

[Screenshot: Screen Shot 2020-05-12 at 9 50 44 AM]

Or, is the expectation that they would see some particular substring from the IRI? I looked at the OWL 2 spec, which gives an example like:

ontology IRI: http://www.example.com/my
version IRI:  http://www.example.com/my/2.0

... where the version is the last bit of the IRI. This is similar to what Protege auto suggests when you create a new ontology, i.e., a version number at the end of the Version IRI:

ontology IRI: http://www.semanticweb.org/vendetti/ontologies/2020/4/untitled-ontology-3
version IRI: http://www.semanticweb.org/vendetti/ontologies/2020/4/untitled-ontology-3/1.0.0

However, when I looked at a few of the OBO Foundry ontologies in BioPortal, the Version IRI is constructed differently where the name of the ontology source file makes up the last bit, e.g.:

ontology IRI: http://purl.obolibrary.org/obo/pato.owl
version IRI: http://purl.obolibrary.org/obo/pato/releases/2020-05-08/pato.owl

graybeal commented 4 years ago

And other ontologies potentially do it other ways. Whatever solution we come up with will have to not make any assumptions about how the versionIRI is constructed—if they want a version label, perhaps 'version' is the right annotation for that.

Note also that OWL 1 ontologies will still exist for quite some time, so we can't apply the OWL 2 rules about what versionInfo means to OWL 1 ontologies; that would be improper, in my humble opinion.

paolaroncaglia commented 2 years ago

Following up on this thread. Are there any updates please on modifying BioPortal so it displays versionIRI? The current situation is not ideal, see e.g. attached screenshot taken from https://bioportal.bioontology.org/ontologies/UBERON:

[Screenshot: Screen Shot 2021-09-24 at 17 11 10]

Thanks, Paola

jvendetti commented 2 years ago

Hi @paolaroncaglia. I will check with my manager @graybeal about whether this could be prioritized in the near future.

In the meantime, I had posed a question above about version IRI functionality that no one has really answered for me. Section 3.2 of the OWL 2 specification gives an example for version IRIs where the actual version number is at the end of the IRI:

The ontology document containing the current version of an ontology series might be accessible via the IRI http://www.example.com/my, as well as via the version-specific IRI http://www.example.com/my/2.0. When a new version is created, the ontology document of the previous version should remain accessible via http://www.example.com/my/2.0; the ontology document of the new version, called, say, http://www.example.com/my/3.0, should be made accessible via both http://www.example.com/my and http://www.example.com/my/3.0.

It seems the OBO community chose not to mimic the example given by the W3C, instead putting the file name at the end of the version IRI, e.g., from UBERON:

http://purl.obolibrary.org/obo/uberon/releases/2021-07-27/ext.owl

I had originally envisioned updating BioPortal's source code to simply extract the last part of the version IRI string for display in the Version column. However, it doesn't seem like we can make the assumption that ontology developers are using this convention. Which in turn means that we'd probably have no choice but to display the full string value of the version IRI annotation property. Speaking from an aesthetic viewpoint, I've never liked this idea - we'd end up displaying a long IRI string in the Version column, which is really meant to contain shorter numeric version numbers that make it quick and easy for the end user to see what versions are available.
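
To make the concern concrete, here is a minimal Python sketch of the "take the last part" idea applied to the four version IRIs quoted in this thread. It is not BioPortal code; `last_path_segment` is a hypothetical helper written only for this illustration.

```python
from urllib.parse import urlsplit


def last_path_segment(iri: str) -> str:
    """Return the final path segment of an IRI (hypothetical helper)."""
    path = urlsplit(iri).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


# Version IRIs taken from the examples quoted in this thread.
VERSION_IRIS = [
    "http://www.example.com/my/2.0",
    "http://www.semanticweb.org/vendetti/ontologies/2020/4/untitled-ontology-3/1.0.0",
    "http://purl.obolibrary.org/obo/pato/releases/2020-05-08/pato.owl",
    "http://purl.obolibrary.org/obo/uberon/releases/2021-07-27/ext.owl",
]

for iri in VERSION_IRIS:
    print(f"{last_path_segment(iri):10} <- {iri}")
```

The W3C-style and Protege-style IRIs yield `2.0` and `1.0.0`, but the two OBO IRIs yield `pato.owl` and `ext.owl`, which are file names rather than versions.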

At any rate, comments/ideas from others are welcome.

graybeal commented 2 years ago

There is a parallel discussion happening in OBO at https://github.com/information-artifact-ontology/ontology-metadata/issues/71. That was driven by versioning of axioms but it is all tied together.

We definitely do have to display the full string. (As a practical matter, the order expressed in the W3C notion suffered from a few fundamental flaws, the most noticeable being that >99% of the time the identifier is more important to the user than the version. So it wasn't going to be adopted much.)

But as you can see in the other thread, there are two different pieces of version information: owl:versionInfo and owl:versionIRI. The point made there is that since versionIRI is clearly an IRI, versionInfo seems like the place where a date or other short string could be used for the version. (Ideally they would have added owl:versionLabel, because versionInfo seems like it could contain descriptive information of any length, and that would be icky.) Just to quote from that thread: "An owl:versionInfo statement generally has as its object a string giving information about this version, for example RCS/CVS keywords."

balhoff commented 2 years ago

@jvendetti @graybeal we've been making a push to add owl:versionInfo across OBO, although it's up to each ontology project to make the addition. The versionInfo will be the varying part of the version IRI.

graybeal commented 2 years ago

To @jvendetti 's point, we're going to be doing a lot of prioritizing of BioPortal improvements soon, and it's my hope that supporting both versionInfo and versionIRI as visible metadata will be a high priority. Which doesn't give us dates but at least hints at the possible direction.

jonquet commented 2 years ago

I confirm that 2 changes are necessary (I believe) as discussed in this thread:

We should not build a system that assumes or expects a constraint or rule relating owl:versionInfo and owl:versionIRI => the fact that the version OFTEN appears as a substring inside the URI of the version is a coincidence that a system cannot rely on.

graybeal commented 2 years ago

Agree with @jonquet , but not with depending on versionInfo containing strictly a version number. Unless I'm mistaken other forms may be useful (e.g., 1.0.1 is not a number).
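
One way to read the points above is that both values should be stored as independent, opaque strings, with nothing derived from one to fill in the other. The Python sketch below illustrates that reading only; `SubmissionVersion` and `display_label` are hypothetical names, not part of BioPortal's metadata model, and the fallback order is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubmissionVersion:
    """Hypothetical record holding both version fields as opaque strings."""

    version_info: Optional[str] = None  # owl:versionInfo, free-form (e.g. "1.0.1")
    version_iri: Optional[str] = None   # owl:versionIRI, always kept in full

    def display_label(self) -> str:
        # Assumed fallback order: short label first, then the full IRI.
        # The IRI is never parsed to guess a version string.
        if self.version_info:
            return self.version_info
        if self.version_iri:
            return self.version_iri
        return "unknown"
```

For example, `SubmissionVersion(version_info="1.0.1").display_label()` gives `1.0.1`, while a submission that only declares a version IRI is shown with the whole IRI.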