ruby-rdf / rdf

RDF.rb is a pure-Ruby library for working with Resource Description Framework (RDF) data.
http://rubygems.org/gems/rdf
The Unlicense

documentation: how to create new vocabs, and language-tagged strings in vocabs #432

Closed wu-lee closed 2 years ago

wu-lee commented 2 years ago

The documentation doesn't really explain how to create an RDF::Vocab. I infer from the source code there's a sort of DSL, and you can do it something like this:

require 'date'
require 'rdf'
require 'rdf/vocab'

base_uri = 'http://example.com/foo#'

vocab = Class.new(RDF::Vocabulary(base_uri)) do
  ontology(
    base_uri,
    type: 'skos:ConceptScheme',
    title: 'blah',
    'dc:creator': "J Bloggs",
    'dc:modified': Date.today,
    'dc:description': 'blah blah',
    'dc:publisher': 'http://example.com/',
  )

  term :whatever, inScheme: 'some:pname'

  # ...
end

However, there doesn't seem to be a very easy way to language-tag strings. Do I need to create an RDF::Literal for each tag? This is what I was hoping for, but it doesn't seem to work:

    'dc:title': {de: "Länder",
                   en: "Countries",
                   es: "Países",
                   fr: "Des Pays",
                   ko: "국가",
                   pt: "",
                   zh: "国别"},

gkellogg commented 2 years ago

You're right that the DSL doesn't really handle language-tagged literals. The point of the DSL, though, is to create subclasses of RDF::Vocabulary (and the DSL is essentially just using the class methods from RDF::Vocabulary and RDF::Vocabulary::Term). A vocabulary is most easily created from a graph using RDF::Vocabulary::Writer, which is available through the rdf CLI. For example, from the rdf-vocab Rakefile:

cmd = "bundle exec rdf"
if v[:patch]
  File.open("lib/rdf/vocab/#{id}.rb_p", "w") {|f| f.write v[:patch]}
  cmd += " patch --patch-file lib/rdf/vocab/#{id}.rb_p"
end
cmd += " serialize --uri '#{v[:uri]}' --output-format vocabulary --ordered"
cmd += " --module-name #{v.fetch(:module_name, "RDF::Vocab")}"
cmd += " --class-name #{v[:class_name] ? v[:class_name] : id.to_s.upcase}"
cmd += " --strict" if v.fetch(:strict, true)
cmd += " --noDoc"
cmd += " --extra #{URI.encode_www_form_component v[:extra].to_json}" if v[:extra]
cmd += " -o lib/rdf/vocab/#{id}.rb_t"
cmd += " '" + v.fetch(:source, v[:uri]) + "'"
puts "  #{cmd}"
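
To see what the excerpt expands to for a single vocabulary, it can be wrapped in a small method. This is only a sketch: the method name `build_vocab_command` and the example entry are made up for illustration, the step that writes the patch file is left out, and the flags are just the ones that appear in the excerpt above.

```ruby
require 'json'
require 'uri'

# Builds the `rdf` command line from the Rakefile excerpt for one vocabulary.
# `id` and `v` mirror the Rakefile's variables; the method name is hypothetical.
def build_vocab_command(id, v)
  cmd = "bundle exec rdf"
  # The Rakefile also writes v[:patch] to this file first; omitted in this sketch.
  cmd += " patch --patch-file lib/rdf/vocab/#{id}.rb_p" if v[:patch]
  cmd += " serialize --uri '#{v[:uri]}' --output-format vocabulary --ordered"
  cmd += " --module-name #{v.fetch(:module_name, 'RDF::Vocab')}"
  cmd += " --class-name #{v[:class_name] || id.to_s.upcase}"
  cmd += " --strict" if v.fetch(:strict, true)
  cmd += " --noDoc"
  cmd += " --extra #{URI.encode_www_form_component(v[:extra].to_json)}" if v[:extra]
  cmd += " -o lib/rdf/vocab/#{id}.rb_t"
  cmd += " '#{v.fetch(:source, v[:uri])}'"
  cmd
end

# Hypothetical vocabulary entry, for illustration only.
example = { uri: 'http://example.com/foo#', strict: false }
puts build_vocab_command(:foo, example)
```

With no `:source` given, the vocabulary URI doubles as the location to read from, and the class name defaults to the upper-cased id.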

The DSL for, say, .property instantiates a new RDF::Vocabulary::Term, where some heuristics turn option entries into IRI keys and values. String or Symbol values that look like IRIs are used to create an RDF::URI; otherwise, values become a Date, DateTime, Integer, Decimal, Double, Boolean, or plain Literal. There isn't anything specific for creating a language-tagged literal, but of course the property option values could be instances of RDF::Literal instead.
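
As a rough illustration of that kind of heuristic, here is a simplified stand-in written against plain Ruby classes. It is not the RDF.rb implementation: the method name `classify_option_value` and the `IRI_LIKE` pattern are assumptions made for this sketch, and it only reports which kind of value an option entry would become.

```ruby
require 'date'
require 'bigdecimal'

# Assumption for this sketch: "looks like an IRI" means "scheme:rest" or
# "prefix:local" with no whitespace. The real check in RDF.rb may differ.
IRI_LIKE = /\A[A-Za-z][A-Za-z0-9+.-]*:\S*\z/

# Returns a symbol naming the kind of RDF value an option entry would become.
def classify_option_value(value)
  case value
  when String, Symbol
    value.to_s.match?(IRI_LIKE) ? :iri : :literal
  when DateTime    then :date_time # checked before Date: DateTime subclasses Date
  when Date        then :date
  when Integer     then :integer
  when BigDecimal  then :decimal
  when Float       then :double
  when true, false then :boolean
  else :literal
  end
end

p classify_option_value('skos:ConceptScheme') # => :iri
p classify_option_value('blah blah')          # => :literal
p classify_option_value(Date.today)           # => :date
```

Nothing in this dispatch produces a language tag, which is why a string such as "Länder" ends up as a plain literal unless the caller passes a ready-made literal, for example `RDF::Literal.new("Länder", language: :de)`.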

Given the way RDF::Vocabulary::Term#attribute_value is implemented, we could interpret a Hash value as something like a language map. But that would complicate many other places that use vocabularies and expect simple literal values from accessors such as #comment and #label.
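
A minimal sketch of what such a language-map interpretation might look like, using plain Ruby only. `TaggedString`, `expand_language_map`, and `pick_label` are hypothetical names invented here; RDF.rb itself would use RDF::Literal, and the behaviour that was eventually implemented may differ.

```ruby
# Hypothetical stand-in for a language-tagged literal.
TaggedString = Struct.new(:value, :language)

# Expands a Hash such as {en: "Countries", de: "Länder"} into tagged strings.
# Any non-Hash value is passed through as a single untagged string.
def expand_language_map(value)
  return [TaggedString.new(value.to_s, nil)] unless value.is_a?(Hash)

  value.map { |lang, text| TaggedString.new(text, lang.to_sym) }
end

# An accessor such as #label used to return one simple value. With a language
# map the caller has to choose a language, falling back to the first entry.
def pick_label(tagged, preferred = :en)
  (tagged.find { |t| t.language == preferred } || tagged.first)&.value
end

titles = expand_language_map({ de: "Länder", en: "Countries", fr: "Des Pays" })
puts pick_label(titles)      # prints "Countries"
puts pick_label(titles, :fr) # prints "Des Pays"
```

The second method is where the complication shows up: code that previously read a single label now needs a rule for choosing among several.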

In summary, something like what you suggest is probably feasible, though ambitious, and there may be some ripple effects on things that use vocabularies.

gkellogg commented 2 years ago

@wu-lee I added support for this to the develop branch, and will also update the rdf-vocab representation. As this is potentially disruptive (but shouldn't be), I'll give the public-rdf-ruby@w3.org mailing list an opportunity to object before releasing to RubyGems. If anyone does object, it will end up falling back to an option, at least until the next dot release. It does affect many vocabularies in rdf-vocab.

wu-lee commented 2 years ago

Thank you for this. I'll investigate when I get a moment.