Closed: mjgiarlo closed this issue 9 years ago
@mjgiarlo Thinking about whether and how contributed vocabs are "validated" for accuracy, completeness, etc.
@dchandekstark Any chance that a contributed vocab could be validated against an OWL representation? Also, should vocab properties contain type:, subClassOf:, domain:, range:, etc., attributes where available?
@acoburn sounds like you two should be on the Hydra RDF Working Group! ;) Keep an eye out for details about our next call, post-OR2014.
@acoburn I'm sure others know more about validation issues specific to this domain than I. As a user of the library I would just like to have some assurance that the code is a valid representation of the vocabulary. Ideally these validations would be built into a test suite in some fashion.
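As a rough illustration of the kind of test-suite check described above, here is a minimal sketch that compares the term names a vocabulary class exposes against the term names declared in its source ontology. The helper name `vocab_diff` and the sample term lists are hypothetical; extracting the terms from a real OWL file or an `RDF::Vocabulary` class is assumed to happen elsewhere.

```ruby
require 'set'

# Hypothetical helper (not part of rdf.rb or rdf-vocab): reports terms that
# the ontology declares but the vocabulary lacks, and terms the vocabulary
# defines that the ontology does not declare.
def vocab_diff(ontology_terms, vocab_terms)
  expected = ontology_terms.map(&:to_s).to_set
  actual   = vocab_terms.map(&:to_s).to_set
  {
    missing: (expected - actual).sort,
    extra:   (actual - expected).sort
  }
end

# Sample term names for illustration only; not a complete or real term list.
ontology_terms = %w[Title Creator subject]
vocab_terms    = [:Title, :subject, :madeUpTerm]

DIFF = vocab_diff(ontology_terms, vocab_terms)
```

A spec could then assert that both `DIFF[:missing]` and `DIFF[:extra]` are empty for every contributed vocabulary.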
The better approach, I think, is the one employed by rdf.rb: rather than validating, simply generate the vocabs from OWL. It would be a good idea to get more property/class attributes into the vocabularies. I think the Vocabulary class should be extensible, but I would have to check.
I agree, but using vocab-fetch from rdf.rb only ever gives me empty class definitions with MADS, MODS and the other LoC ontologies that I have tried.
@no-reply I echo @acoburn on that. Maybe you could say more?
I take that back about MADS and MODS. Using the vocab-fetch script from rdf.rb, I was able to generate correct vocab classes for both MADS and MODS (once I got the namespace values sorted out).
$ ruby vocab-fetch --uri http://www.loc.gov/mads/rdf/v1# --source http://www.loc.gov/standards/mads/rdf/v1.rdf --class-name MADS
and
$ ruby vocab-fetch --uri http://www.loc.gov/mods/rdf/v1# --source http://www.loc.gov/mods/modsrdf/v1/modsrdf.owl --class-name MODS
A related design question: would rdf-vocab be a collection of actual Ruby RDF::Vocabulary classes (as it is now) or, instead, a collection of OWL ontologies that generate vocabularies when the user installs the package?
Or, to put this another way, if it is so easy to generate vocabularies from source OWL files, couldn't this process become part of an AF/Hydra generator? Or simply part of some good documentation?
@acoburn I'm not sure that every vocab we might want is currently available as an OWL ontology, but I suppose for those that are, the vocab classes wouldn't have to be pre-generated.
The more I think about it, the more I like the idea of not pre-generating vocabs for which there are OWL docs. Instead we could have a rake task or tasks and config file of URIs and sources, etc. If y'all are on board with this general direction, maybe we call the validation issue resolved for the time being and move on to fleshing out the functional details of vocab generation?
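A minimal sketch of what such a config and task body might look like, assuming the config is a plain Ruby hash and generation shells out to the `vocab-fetch` script quoted earlier in this thread. The names `VOCAB_SOURCES` and `vocab_fetch_args` are hypothetical, and only the flags that already appear in this thread are used.

```ruby
# Hypothetical config: one entry per vocabulary, using the URIs and sources
# from the vocab-fetch commands quoted earlier in this thread.
VOCAB_SOURCES = {
  'MADS' => {
    uri:    'http://www.loc.gov/mads/rdf/v1#',
    source: 'http://www.loc.gov/standards/mads/rdf/v1.rdf'
  },
  'MODS' => {
    uri:    'http://www.loc.gov/mods/rdf/v1#',
    source: 'http://www.loc.gov/mods/modsrdf/v1/modsrdf.owl'
  }
}.freeze

# Builds the argument list for one vocab-fetch run. fetch raises KeyError
# if a config entry is missing :uri or :source.
def vocab_fetch_args(class_name, config)
  ['ruby', 'vocab-fetch',
   '--uri', config.fetch(:uri),
   '--source', config.fetch(:source),
   '--class-name', class_name]
end

# The commands are only built here, not executed. A rake task could loop
# over them and run each one, for example with system(*args).
COMMANDS = VOCAB_SOURCES.map { |name, config| vocab_fetch_args(name, config) }
```

Keeping the URIs and sources in one data structure means adding a vocabulary is a config change rather than a code change.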
+.5. I think it might make sense to cache the ontologies/RDFS as fixtures and have the rake task build the classes from those--this way we're not dependent on every site that hosts a vocab being up all the time.
Not sure if this addresses your concern, Jon, but I was imagining that the generated vocabs would be stored somewhere, either in the installed gem or the target app. So once the initial generation happens, there would be no dependency on remote host uptime. OTOH I'm certainly not invested in this approach, more interested in a general consensus at this point.
Jon, if we did that caching, how about also checking the remote sites, if available, for comparison?
I tend to think we should cache the source vocab when we can (when could we not? Situations where it's too big to be practical?), but it probably does make sense to have the class generated when the gem is installed...this way the output of the class creation can be in line with the version of rdf.rb that the application is using (right? So long as the API for fetching vocabs doesn't change, I guess).
-Js
You could store the OWL file directly, as you would an asset. If generating vocabs is trivial enough, why not do it "on the fly"? Querying the vocab would load the OWL file each time, and creating new vocabularies would be a simple matter of loading a new OWL doc.
I'm more in favor of generating the Ruby code. As OWLs change, you can re-generate the class, commit the Ruby code, and release a new version of the gem. If you're concerned about staying in step with the current OWL, you could keep the file's checksum in the class and use it to check it against the online version. A superclass with checksum and version methods would help keep track of that.
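To make the checksum idea concrete, here is a minimal sketch of such a superclass. The class and method names (`ChecksummedVocab`, `generated_from`, `current?`) are hypothetical and not part of rdf.rb or rdf-vocab, SHA-256 is an assumed choice of digest, and fetching the online copy of the source document is left out.

```ruby
require 'digest/sha2'

# Hypothetical base class: stores the checksum and version of the source
# document that a generated vocabulary class was built from.
class ChecksummedVocab
  class << self
    attr_reader :source_checksum, :source_version

    # Called once in each generated subclass to record its provenance.
    def generated_from(checksum:, version:)
      @source_checksum = checksum
      @source_version  = version
    end

    # True when the given document body still matches the recorded checksum.
    def current?(document_body)
      Digest::SHA256.hexdigest(document_body) == source_checksum
    end
  end
end

# Placeholder ontology body for illustration; a real check would pass in
# the contents of the cached or downloaded OWL file.
SAMPLE_OWL = '<rdf:RDF><!-- placeholder ontology body --></rdf:RDF>'

class ExampleVocab < ChecksummedVocab
  generated_from checksum: Digest::SHA256.hexdigest(SAMPLE_OWL), version: '1.0'
end
```

Calling `ExampleVocab.current?(body)` with the body of the online document would then indicate whether the generated class is out of step with its source.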
I do see the advantage of caching the OWLs, but why not take the additional step of storing them such that they're available to the user?
There are a number of good ideas in this thread. We have wandered away from the validation question strictly speaking, but that's OK. I think we need a live conversation to pull it all together into a plan. Not sure if the next Hydra RDF WG call is appropriate. If not, maybe a Google hangout or something.
IMO, the next Hydra RDF WG call is appropriate. Pinging @no-reply .
I think we've basically solved the "validation" question by relying on ruby-rdf's vocab-loader and authoritative source documents. I'm going to close this issue and open another one about storing source files.
@dchandekstark should clarify this question.