Thanks Stephanie for entering this request.
My initial comment is that automatic harvesting hasn't been addressed yet, and I
think it will take significant time to implement (especially given the many
other features that still need work for a stable system). But it is something
being considered (I'm thinking about the CF and GCMD vocabs in particular), so
perhaps it is not too far in the future either.
Here are some questions/comments:
1) How often is the GCOOS vocabulary changing? An initial approach would be to
implement some convenient mechanism to create a new version of the GCOOS vocab
at ORR by just giving the source URL (like the one above). Then Felimon can just
notify us that a new version has been posted so one of us can run that mechanism
(hopefully a task that takes just a couple of minutes).
2) I see the following xml:base in the ontology obtained from the given URL:
http://gcoos.rsmas.miami.edu/dp/data/Parameters.owl
Question: should the corresponding ontology in the MMI ORR keep this original
xml:base (in other words, should the MMI ORR "re-host" the ontology), or can the
registered ontology be given an 'http://mmisw.org/ont/gcoos'-based URI?
(I'm including Felimon and John in the Cc of this issue, in case they can add
their comments.)
Thanks. --carlos
Original comment by caru...@gmail.com
on 15 Sep 2009 at 2:55
Thanks for keeping me in the loop.
True, automatic harvesting will introduce other issues (e.g. mapping of new
inputs, dealing with broken links, etc.), but this needs to be addressed if we
want the registry to be "current". There are several ways to deal with this, and
they can be categorized as either: (1) Passive - data contributors actively push
their data onto the repository and update what needs to be updated as edits are
made to their collection, or (2) Active - the project provides a utility to
establish an (event-driven) service that pushes the data automatically as a
change is made, with reminders emailed for other manual operations (if
necessary).
Option (1) will very likely not work given the prevailing working environment
and available resources; i.e. very few people have time to even read, much less
edit, the collection. This answers the first query: How often is the vocabulary
changing? The answer is seldom to date, and I cannot predict a frequency. If a
user or data provider chances upon the collection and discovers a problem (no
definition, wrong definition, insufficient definition), the norm is to email
whoever manages the collection. In some cases, the problems are circulated among
regional members and comments are received and consolidated -- then an edit is
made to the collection.
The ideal scenario is to submit the collection to the group for an annual
review, with amendments made once a year. Simple, but most regional associations
(my guess) have yet to get to this rhythm, as the need to keep their vocabulary
current and 'sufficient' seems not well understood and/or appreciated.
I suspect that once an application of this ontology/repository works, this
'need' will surface -- the utility of this ontology has not reached a level at
which groups will give importance to their collections.
Simply put, I think providing data providers with ready tools to do most of the
job for them (i.e. automated harvesting) will be most attractive and can be
sustainable in the long term.
With regard to the query on the xml:base, I suggest that MMI ORR keep the base
as provided by the data provider.
Original comment by felimon....@gmail.com
on 15 Sep 2009 at 11:19
I concur with Felimon that we need to provide the facility -- the need will
become obvious as people start really using the system. I think we'll want two
modes for option 2: event-based and polled by schedule.
Event-based will kick off based on some explicit notification from the target
system, perhaps a URL-formed command that says 'update this ontology' (with the
one at the base location which has been previously defined).
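To make the "URL-formed command" idea concrete, here is a minimal sketch of how the repository side might interpret such a notification. Everything specific in it is an assumption for illustration only: the `/update` path, the `source` query parameter, the `example.org` host, and the use of the Parameters.owl URL as the previously defined base location. Nothing here is an existing ORR endpoint.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical: base locations that were defined when each ontology was first registered.
REGISTERED_SOURCES = {"http://gcoos.rsmas.miami.edu/dp/data/Parameters.owl"}


def parse_update_command(url: str, registered=REGISTERED_SOURCES) -> str:
    """Return the source URL named by an 'update this ontology' command.

    Raises ValueError if the command is malformed or names a source that
    was not previously registered.
    """
    parts = urlsplit(url)
    if parts.path.rstrip("/").rsplit("/", 1)[-1] != "update":
        raise ValueError("not an update command: %r" % url)
    values = parse_qs(parts.query).get("source", [])
    if len(values) != 1:
        raise ValueError("expected exactly one 'source' parameter")
    source = values[0]
    if source not in registered:
        raise ValueError("unknown source: %r" % source)
    return source


# Example command a provider's system could send after publishing a new version.
EXAMPLE_COMMAND = "http://example.org/orr/update?" + urlencode(
    {"source": "http://gcoos.rsmas.miami.edu/dp/data/Parameters.owl"}
)
```

Restricting the command to previously defined base locations means the notification only says "something changed", and the repository still decides where to fetch from.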
Polled will be needed for people who can't, or aren't inclined to, send the
repository a message. In this mode content at the source will be compared to
content internally, in some form TBD. Changes drive a new update. Polling cycles
will need to be set separately for different ontologies; someday the frequency
can be modified automatically to reflect recent update rates, but probably
should never go below about daily.
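As one possible shape for the polled mode, the sketch below keeps a per-ontology polling interval and compares a SHA-256 digest of the fetched content with the digest of the last registered version. The comparison form is still TBD in this thread, so the digest is just one assumption. The names `PolledOntology`, `poll_once`, `fetch`, and `register` are hypothetical; `fetch` and `register` stand in for whatever download and registration mechanism ends up being used.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional

ONE_DAY = 24 * 60 * 60  # seconds


@dataclass
class PolledOntology:
    source_url: str
    interval_s: int = ONE_DAY           # polling cycle, set per ontology
    last_digest: Optional[str] = None   # digest of the last registered content
    last_polled: Optional[float] = None


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def is_due(entry: PolledOntology, now: float) -> bool:
    return entry.last_polled is None or now - entry.last_polled >= entry.interval_s


def poll_once(
    entry: PolledOntology,
    fetch: Callable[[str], bytes],
    register: Callable[[str, bytes], None],
    now: float,
) -> bool:
    """Fetch the source if its cycle has elapsed; register a new version on change.

    Returns True only when a new version was registered.
    """
    if not is_due(entry, now):
        return False
    content = fetch(entry.source_url)
    entry.last_polled = now
    new_digest = digest(content)
    if new_digest == entry.last_digest:
        return False  # nothing changed at the source
    register(entry.source_url, content)
    entry.last_digest = new_digest
    return True
```

A byte-level digest also reports a change when the source is merely re-serialized (whitespace, attribute order), so a real comparison might need to work on the parsed triples instead.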
Regarding xml:base, I think it depends on what the provider is trying to
achieve. (Some just inherit a base but don't really care about keeping it.) But
we certainly need to be prepared to keep the original base, as an option. (That
is essentially equivalent to indexing an ontology.)
The obvious disadvantage to this "keep the base" approach is that it won't help
the ontology term URIs be resolved, because the base isn't in MMI's domain
namespace. Our preference would therefore be for people to use our base, except
when there is a clear reason not to. (Any clear reason will do, we don't want to
pass judgment or anything.)
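The difference between the two options can be illustrated with a small sketch that reads the xml:base from an RDF/XML document and shows what "using our base" would do to a term URI. The sample document, the term name `salinity`, and the new base `http://mmisw.org/ont/gcoos/Parameters` are made up for illustration; only the original base comes from this thread.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# xml:base lives in the reserved XML namespace.
XML_BASE_ATTR = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical, stripped-down stand-in for the GCOOS ontology file.
SAMPLE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xml:base="http://gcoos.rsmas.miami.edu/dp/data/Parameters.owl"/>'
)


def read_xml_base(rdf_xml: str) -> Optional[str]:
    """Return the xml:base declared on the root element, if any."""
    return ET.fromstring(rdf_xml).get(XML_BASE_ATTR)


def rebase_uri(uri: str, original_base: str, new_base: str) -> str:
    """Move a term URI from the provider's base to a new base.

    URIs that do not start with the original base are returned unchanged.
    """
    if not uri.startswith(original_base):
        return uri
    return new_base + uri[len(original_base):]


original = read_xml_base(SAMPLE)
term = original + "#salinity"  # hypothetical term
rehosted = rebase_uri(term, original, "http://mmisw.org/ont/gcoos/Parameters")
```

With the original base kept, `term` only resolves if the provider's server answers for it; the rebased form is in the mmisw.org namespace, so the registry itself can resolve it.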
Original comment by grayb...@marinemetadata.org
on 16 Sep 2009 at 5:18
Direct registration of ontologies (and their new versions) was implemented.
(This work was done especially in the context of the OOI Semantic Prototype.)
Please see:
http://ci.oceanobservatories.org/spaces/display/CIDev/Direct+registration+of+RDF+contents
http://code.google.com/p/mmisw/source/browse/#svn/trunk/mmiorr-client-demo
This client demonstrates how to programmatically perform registration (and
retrieval) operations against the ORR. See the README file there.
Although this does not provide, per se, any automated mechanism for registering
updated versions, it certainly makes one feasible and relatively
straightforward.
Your comments are very welcome (again) now. I think I'd like to close this
issue (as fixed) given the direct registration capability mentioned above (and
with the corresponding clarification); but, since this issue is essentially
about allowing "periodic, automated version updates", I'd like to have your
input. Perhaps we could still close this particular entry and create others for
the more concrete features mentioned in this thread. Also, please keep in mind
that we are currently prioritizing entries for a first beta release
("milestone-Beta1"). Please also suggest whether this entry (if not closed), or
which of the derived entries, should be marked with that label.
Thanks and regards. --carlos
Original comment by caru...@gmail.com
on 5 Apr 2010 at 1:01
I think you can consider this issue fixed. I agree that automating it can be
handled (if needed) via other means. Thanks for addressing the issue. --- Nonong
Original comment by felimon....@gmail.com
on 5 Apr 2010 at 1:58
Thanks.
Original comment by caru...@gmail.com
on 5 Apr 2010 at 4:57
Original issue reported on code.google.com by
steph_wa...@consolidated.net
on 15 Sep 2009 at 2:27