OntoZoo / ontobee

Ontobee is a linked data server for ontologies. See: http://www.ontobee.org.

When are new versions of ontologies pulled into ontobee? #64

Open alanruttenberg opened 8 years ago

alanruttenberg commented 8 years ago

We have a new release of NCRO, as usual at http://purl.obolibrary.org/obo/ncro.owl. However, that is a merged version, and I'd rather Ontobee use http://purl.obolibrary.org/obo/ncro/prebuild/ncro.owl

How does Ontobee know to reload an ontology? Does it check for updates regularly, or should an issue be submitted each time we want a refresh?


e4ong1031 commented 8 years ago

Ontobee will begin weekly OBO updates on Fridays, starting Dec 18.

Regarding which OWL download link to use: which version do you want loaded into Ontobee, the merged version or only the terms created in NCRO? Currently Ontobee automatically merges an ontology before uploading it to the database.

cmungall commented 8 years ago

I think Alan's point is that this shouldn't be something arranged via email on a per-ontology basis. It should be transparent.

You are welcome to suggest something that could be added to the ontologies.yml file. This has the advantage that groups could directly edit this when they edit their ontology .md file on obofoundry.org


cmungall commented 8 years ago

See also https://github.com/OBOFoundry/OBOFoundry.github.io/issues/16

linikujp commented 8 years ago

I think Edison mentioned that the automated update will start on Friday the 18th.


linikujp commented 8 years ago

Sorry, Edison didn't state it clearly in his email. I confirmed with Oliver that this functionality is meant to implement the automated update.


e4ong1031 commented 8 years ago

@linikujp Thanks for pointing out the confusion.

@cmungall The update process is automated and reads its information from the OBO registry (we use the JSON version instead of the YAML). I agree with you that an ontology-specific uploading policy can be defined in the metadata. We can definitely set up the format.

@alanruttenberg Maybe you could define something like - ontobee: - {download: "http://purl.obolibrary.org/obo/ncro/prebuild/ncro.owl"} in the OBOFoundry YAML metadata? Note, though, that Ontobee will merge the ontology before loading it into the database, so it is probably fine to use http://purl.obolibrary.org/obo/ncro.owl.
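For illustration only, here is a minimal Python sketch of how an update script might pick the download URL from registry metadata. The `ontobee.download` key, the shape of the sample JSON, and the `download_url` helper are all assumptions made for this example; the thread does not settle a format, and this is not Ontobee's actual loader.

```python
import json

# Hypothetical registry excerpt. The "ontobee" / "download" keys are an
# assumed format, not an agreed OBO registry schema.
SAMPLE_REGISTRY = """
{"ontologies": [
  {"id": "ncro",
   "ontobee": {"download": "http://purl.obolibrary.org/obo/ncro/prebuild/ncro.owl"}},
  {"id": "obi"}
]}
"""


def download_url(entry: dict) -> str:
    """Return the per-ontology override if present, else the default PURL."""
    override = entry.get("ontobee", {}).get("download")
    if override:
        return override
    # Default: the usual http://purl.obolibrary.org/obo/<id>.owl location.
    return f"http://purl.obolibrary.org/obo/{entry['id']}.owl"


registry = json.loads(SAMPLE_REGISTRY)
urls = {entry["id"]: download_url(entry) for entry in registry["ontologies"]}
```

With this layout, a group that wants a non-default file only adds one key to its own metadata entry, and every other ontology keeps the default behaviour.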

alanruttenberg commented 8 years ago

I guess the question is what the merge means. I suppose I don't need to keep the imports distinct. For background, a couple of issues.

1) When I merge, I currently remove the ontology declaration and ontology annotations from the imported files. I was going to check that you do the same, but http://www.ontobee.org/ontology/OBI is returning a server error right now.

2) I also have in mind future improvements to Ontobee's report of reuse of ontologies. Right now if we do MIREOT, each term gets an 'imported from' annotation. However if we import, e.g. BFO as a whole, those terms are not marked as 'imported from'. They probably should be. I suppose I/we could do that when we merge. Maybe it can be added as part of the automatic pipeline.

You can't use the prefix as a reliable source of information about the source ontology because terms can move to be managed by a different ontology group without changing ID in order to avoid churn.

Once this is sorted out, the statistics would report, e.g. how many terms managed by the source ontology (whatever their prefix is), which ontologies were imported (in whole or in part), as well as counts of the imported terms.
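As a rough sketch of the kind of report described above, the following Python counts terms by their 'imported from' annotation rather than by ID prefix. The term IDs, file names, and the `reuse_report` helper are hypothetical, and the input mapping is an assumed simplification of what would really be read from the ontology.

```python
from collections import Counter
from typing import Dict, Optional


def reuse_report(terms: Dict[str, Optional[str]]) -> dict:
    """Summarise native versus imported terms.

    `terms` maps a term ID to the ontology named in its 'imported from'
    annotation, or to None when the ontology itself manages the term.
    """
    imported = Counter(src for src in terms.values() if src is not None)
    return {
        "native_terms": sum(1 for src in terms.values() if src is None),
        "imported_terms_by_source": dict(imported),
    }


# Hypothetical data for an ontology with prefix AAA.
terms = {
    "AAA_0000001": None,
    "AAA_0000002": None,
    "BBB_0000001": "bbb.owl",
    "BBB_0000002": "bbb.owl",
    "AAA_0000003": "ccc.owl",  # keeps its AAA ID but is now managed by CCC
}
report = reuse_report(terms)
```

Counting by prefix would have reported three native AAA terms here; counting by annotation reports two, which is the distinction the prefix caveat is about.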

FWIW, it's a bit more complicated, but you might consider a more frequent update policy. For sites that include an etag or last-modified header you can skip any that have the same etag or haven't changed since the previous time. For those that don't have those headers you could keep an MD5 of the ontology files, and on daily download check if the MD5 has changed, aborting processing if it hasn't.
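A minimal Python sketch of that check, under stated assumptions: `CachedState`, `conditional_headers`, and `has_changed` are hypothetical names, the response headers are assumed to arrive as a plain mapping with canonical capitalisation, and the actual HTTP request is left out so the decision logic can be read on its own.

```python
import hashlib
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CachedState:
    """What was recorded about an ontology file after the last load."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None
    md5: Optional[str] = None


def conditional_headers(prev: CachedState) -> dict:
    """Request headers that let the server reply 304 Not Modified."""
    headers = {}
    if prev.etag:
        headers["If-None-Match"] = prev.etag
    if prev.last_modified:
        headers["If-Modified-Since"] = prev.last_modified
    return headers


def has_changed(status: int, headers: Mapping[str, str],
                body: bytes, prev: CachedState) -> bool:
    """Decide whether a fetched ontology needs to be reprocessed."""
    if status == 304:
        return False  # server confirmed nothing changed
    etag = headers.get("ETag")
    if etag is not None and prev.etag is not None:
        return etag != prev.etag
    last_modified = headers.get("Last-Modified")
    if last_modified is not None and prev.last_modified is not None:
        return last_modified != prev.last_modified
    # No usable validators: fall back to comparing content hashes.
    return hashlib.md5(body).hexdigest() != prev.md5
```

MD5 is used here only as a change detector, as suggested in the comment above, not for any security purpose.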

alanruttenberg commented 8 years ago

Another thought is that it may make sense to have more than one product/version available on ontobee. These could be chosen from the ontology front page. An example would be having the live dev version available in ontobee, as well as the last released version. Preferences could also be expressed in the YAML.

linikujp commented 8 years ago

@alanruttenberg About issue 2): I think there is a design issue in OntoFox that does not comply with the (recently updated) OBO Foundry principle of referencing the original term. When OntoFox imports a term from OBI that OBI itself imported, for example an IAO term in OBI, OntoFox creates another 'imported from' statement, so in the final product the term has two 'imported from' annotations. We should avoid this.

Anyway, I think this is a complicated issue. It deserves some discussion and thought.

I suggest Edison think about it when he does a second release of OntoBee or OntoFox.

linikujp commented 8 years ago

In my assessment, making multiple versions available is out of scope for OntoBee. I think OntoBee is supposed to visualize the most up-to-date version.

Maybe Oliver needs to make a decision on this.

alanruttenberg commented 8 years ago

@linikujp are you sure you see two 'imported from' annotations? The RDF spec doesn't allow for more than one triple that is the same. You can sometimes get more than one triple from a SPARQL query if you are querying from several graphs and the statement is in more than one graph. A select DISTINCT can usually get rid of that.
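To illustrate the point with a sketch that uses plain Python sets in place of a real triple store: each named graph holds a given triple at most once, a query across several graphs can still return it once per graph, and de-duplicating the results plays the role of SELECT DISTINCT. The graph names, the compact identifiers, and the helper functions are assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# The same 'imported from' statement, as it might appear in two graphs.
STATEMENT: Triple = (
    "obo:BFO_0000040",
    "obo:IAO_0000412",  # 'imported from'
    "http://purl.obolibrary.org/obo/bfo.owl",
)

# Each named graph is a set, so adding the triple twice stores it once.
graphs: Dict[str, Set[Triple]] = {"OBI": set(), "OGG": set()}
graphs["OBI"].add(STATEMENT)
graphs["OBI"].add(STATEMENT)
graphs["OGG"].add(STATEMENT)


def select_objects(graphs: Dict[str, Set[Triple]],
                   subject: str, predicate: str) -> List[str]:
    """One result row per (graph, matching triple), like a plain SELECT."""
    return [o
            for name in sorted(graphs)
            for (s, p, o) in sorted(graphs[name])
            if s == subject and p == predicate]


def select_distinct_objects(graphs: Dict[str, Set[Triple]],
                            subject: str, predicate: str) -> List[str]:
    """Same query with duplicate rows removed, like SELECT DISTINCT."""
    return sorted(set(select_objects(graphs, subject, predicate)))


rows = select_objects(graphs, "obo:BFO_0000040", "obo:IAO_0000412")
distinct_rows = select_distinct_objects(graphs, "obo:BFO_0000040", "obo:IAO_0000412")
```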

A more serious issue is that users browsing more than one ontology may see different definitions for the same term, because different importing ontologies might have used different versions. So practically speaking Ontobee is handling versions, but in a non-transparent way.

There are a few things that can be done. I'm in favor of all terms shown in Ontobee being their latest version, unless one chooses otherwise. If the current functionality is to be retained in some form, check whether the older version has a different axiom set than the current version and, if so, offer a button to see the version that the importing ontology used at the time.

linikujp commented 8 years ago

@alanruttenberg Please see the example here: image

Correction: my example shows what happens when a term is imported via two different importing files. In the case above, the BFO term was imported into both OBI and OGG.

alanruttenberg commented 8 years ago

@linikujp ok. Well, I've explained how this can happen and how to avoid it in my previous comment. @e4ong1031 let me know if you have any questions about that.

linikujp commented 8 years ago

@alanruttenberg Do you mean it happens because: "You can't use the prefix as a reliable source of information about the source ontology because terms can move to be managed by a different ontology group without changing ID in order to avoid churn." ?

linikujp commented 8 years ago

Another example:

image

alanruttenberg commented 8 years ago

I believed you the first time ;-) Same solution.

alanruttenberg commented 8 years ago

Ayee - just realized that it would be better to have OntoFox generate the 'imported from' annotations pointing to the dated PURLs, if the model is that it is presenting a specific version of the imported terms. I suppose I'll add this to the OntoFox issue list.