graybeal / ont

org.mmisw.ont

Please set up periodic "harvest" for updated version of GCOOS ontology #178

Closed graybeal closed 9 years ago

graybeal commented 9 years ago

_From steph_wa...@consolidated.net on September 14, 2009 19:27:24_

What capability do you want added or improved?
Felimon at GCOOS requests a periodic "harvest" of the updated version of GCOOS ontology for the MMI repository. Harvest from: http://gcoos.rsmas.miami.edu/dp/srv_gcoos_generateOWL.php

Where do you want this capability to be accessible?
to be automatic

What sort of input/command mechanism do you want?

What is the desired output (content, format, location)?

Other details of your desired capability?

What version of the product are you using?

Please provide any additional information below (particular ontology/ies, text contents of vocabulary (voc2rdf), operating system, browser/version (Firefox, Safari, Chrome, IE, etc.), screenshot, etc.)

Original issue: http://code.google.com/p/mmisw/issues/detail?id=178

graybeal commented 9 years ago

From caru...@gmail.com on September 14, 2009 19:55:14

Thanks Stephanie for entering this request.

My initial comment is that automatic harvesting hasn't been addressed yet, and I think it will take significant time to implement (especially given the many other features that still need work for a stable system). But it is something being considered (I'm thinking about the CF and GCMD vocabs in particular), so perhaps it is not too far in the future either.

Here are some questions/comments:

2) I see the following xml:base in the ontology obtained from the given URL: http://gcoos.rsmas.miami.edu/dp/data/Parameters.owl Question: should the corresponding ontology in the MMI ORR keep this original xml:base (in other words, should the MMI ORR "re-host" the ontology), or can the registered ontology be given an 'http://mmisw.org/ont/gcoos'-based URI?

(I'm including Felimon and John in the Cc of this issue, in case they can add their comments.)

Thanks. --carlos

Owner: carueda
Cc: fgayan...@rsmas.miami.edu grayb...@marinemetadata.org
Labels: content

graybeal commented 9 years ago

From felimon....@gmail.com on September 15, 2009 04:19:56

Thanks for keeping me in the loop.

It is true that automatic harvesting will introduce other issues (e.g. mapping of new inputs, dealing with broken links, etc.), but this needs to be addressed if we want the registry to be "current". There are several ways to deal with this, and they can be categorized as either: (1) Passive - data contributors will actively push their data onto the repository and update what needs to be updated as edits are made to their collection, or (2) Active - the project provides a utility to establish a service (event-driven) to push the data automatically as a change is made, and reminders are emailed for other manual operations (if necessary).

Option (1) will very likely not work given the prevailing working environment and available resources; i.e. very few people have time to even read, much less edit, the collection. This answers the first query: How often is the vocabulary changing? The answer is seldom to date, and I do not foresee a set frequency. If a user or data provider chances upon the collection and discovers a problem (no definition, wrong definition, insufficient), the norm is to email whoever manages the collection. In some cases, they are circulated among regional members and comments are received and consolidated -- then an edit is made to the collection.

The ideal scenario is to submit the collection to the group for an annual review, with amendments made once a year. Simple, but most regional associations (my guess) have yet to get to this rhythm, as the need to keep their vocabulary current and 'sufficient' seems not well understood and/or appreciated.

I suspect that once an application of this ontology/repository works, this 'need' will surface -- the utility of this ontology has not reached a level where groups will give importance to their collections.

Simply, I think, providing data providers with ready tools to do most of the job for them (i.e. automated harvesting) will be most attractive and can be sustainable in the long term.

With regards to the query on the xml:base, I suggest that MMI ORR keep the base as provided by the data provider.

graybeal commented 9 years ago

From grayb...@marinemetadata.org on September 15, 2009 22:18:41

I concur with Felimon that we need to provide the facility -- the need will become obvious as people start really using the system. I think we'll want two modes for option 2: event-based and polled by schedule. Event-based will kick off based on some explicit notification from the target system, perhaps a URL-formed command that says 'update this ontology' (with the one at the base location which has been previously defined).
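As a rough illustration of the event-based mode described above, here is a minimal sketch of how a URL-formed "update this ontology" command might be resolved to a previously defined source location. The query parameter names (`action`, `ontology`), the `example.org` notification URL, and the `REGISTERED_SOURCES` table are all assumptions made for this example; nothing here reflects an actual ORR interface.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical table of ontologies and their previously defined source locations.
REGISTERED_SOURCES = {
    "gcoos": "http://gcoos.rsmas.miami.edu/dp/srv_gcoos_generateOWL.php",
}


def resolve_update_request(request_url, sources=REGISTERED_SOURCES):
    """Return the source URL to re-harvest for an update command.

    Returns None when the request is not an update command or names
    an ontology that has no registered source location.
    """
    query = parse_qs(urlparse(request_url).query)
    action = query.get("action", [""])[0]
    key = query.get("ontology", [""])[0]
    if action != "update":
        return None
    return sources.get(key)
```

For example, `resolve_update_request("http://example.org/notify?action=update&ontology=gcoos")` would return the GCOOS harvest URL, which a harvester could then fetch and register as a new version.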

Polled will be needed for people who can't, or aren't inclined to, send the repository a message. In this mode content at the source will be compared to content internally, in some form TBD. Changes drive a new update. Polling cycles will need to be set separately for different ontologies; someday the frequency can be modified automatically to reflect recent update rates, but probably should never go below about daily.
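The polled mode could be sketched roughly as follows, assuming the comparison "in some form TBD" is a plain byte-level digest of the harvested content. The function names and the injectable `fetch` and `register` callables are hypothetical; how a new version is actually registered is left to the caller.

```python
import hashlib


def content_digest(content: bytes) -> str:
    """Return a SHA-256 hex digest used to detect changes at the source."""
    return hashlib.sha256(content).hexdigest()


def poll_once(fetch, stored_digest, register):
    """Run one polling cycle for a single ontology.

    fetch:         callable returning the current source content as bytes
    stored_digest: digest recorded at the previous cycle (None on first run)
    register:      callable invoked with the new content when it has changed

    Returns the digest of the content seen in this cycle, to be stored
    for the next one.
    """
    content = fetch()
    digest = content_digest(content)
    if digest != stored_digest:
        # Source differs from what is held internally: drive a new update.
        register(content)
    return digest
```

In practice `fetch` might wrap `urllib.request.urlopen(source_url).read()`, with each ontology polled on its own schedule. Note that a byte-level comparison will also flag changes that are only serialization differences (for instance, a generated timestamp in the OWL output), so a content-aware comparison may be preferable.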

Regarding xml:base, I think it depends on what the provider is trying to achieve. (Some just inherit a base but don't really care about keeping it.) But we certainly need to be prepared to keep the original base, as an option. (That is essentially equivalent to indexing an ontology.)

The obvious disadvantage to this "keep the base" approach is that it won't help the ontology term URIs be resolved, because the base isn't in MMI's domain namespace. Our preference would therefore be for people to use our base, except when there is a clear reason not to. (Any clear reason will do, we don't want to pass judgment or anything.)

graybeal commented 9 years ago

From caru...@gmail.com on April 04, 2010 18:01:16

Direct registration of ontologies (and their new versions) was implemented. (This work was done especially in the context of the OOI Semantic Prototype).

Please see:
http://ci.oceanobservatories.org/spaces/display/CIDev/Direct+registration+of+RDF+contents
https://code.google.com/p/mmisw/source/browse/#svn/trunk/mmiorr-client-demo
This client demonstrates how to programmatically perform registration (and retrieval) operations against the ORR. See the README file there.

Although this does not provide, per se, any automated mechanism for updated versions, it certainly makes it feasible and relatively straightforward.

Your comments are very welcome (again) now. I think I'd like to close this issue (as fixed) given the direct registration capability mentioned above (and with the corresponding clarification); but, since this issue is essentially about allowing "periodic, automated version updates", I'd like to have your input. Perhaps we could still close this particular entry and create others for the more concrete features mentioned in this thread. Also, please keep in mind that we are currently prioritizing entries for a first beta release ("milestone-Beta1"). Please also suggest whether this entry (if not closed), or which of the derived entries, should be marked with that label.

Thanks and regards. --carlos

Status: Started
Labels: ooici

graybeal commented 9 years ago

From felimon....@gmail.com on April 05, 2010 06:58:07

I think you can consider this issue fixed. I agree that automating it can be handled (if needed) via other means. Thanks for addressing the issue. --- Nonong

Status: Fixed

graybeal commented 9 years ago

From caru...@gmail.com on April 05, 2010 09:57:34

Thanks.

Labels: Milestone-Beta1