Closed escowles closed 9 years ago
I wasn't sure what to do with the udfrs:GenreFacetType instances -- these look like OWL named individuals, and I don't think those map directly to RDFS. Is it OK to leave them as is?
@escowles thanks for updating the stylesheet too!
Looks like udfrs:GenreFacetType is an owl class: http://udfr.org/onto/onto.rdf
<owl:Class rdf:ID="GenreFacetType">
<rdfs:subClassOf rdf:resource="#ControlledVocabulary"/>
<rdfs:isDefinedBy rdf:resource="#"/>
<rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Genre facet type</rdfs:label>
<dc:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
The genre facet type defines the main classes found in the GDFR classification system. It is intended to indicate broadly the type of content associated with a format.
</dc:description>
</owl:Class>
...and http://udfr.org/docs/onto/
@ruebot: Right, udfrs:GenreFacetType is a class -- and the terms here are all instances of it. So I think that makes them NamedIndividuals in OWL. This wasn't explicit before, but they could have been:
<owl:NamedIndividual rdf:about="http://pcdm.org/file-format-types#Archive">
<rdf:type rdf:resource="http://www.udfr.org/onto#GenreFacetType"/>
...
</owl:NamedIndividual>
So, should we leave them as individuals? Or should they be classes that subclass udfrs:GenreFacetType? Maybe @azaroth42 or @acoburn have opinions?
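For comparison, the two options might look like this in Turtle. This is only a sketch: the pcdmfmt prefix name is an assumption, and the namespaces are taken from the RDF/XML snippet above. The two blocks are alternatives, not something to publish together.

```turtle
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix udfrs:   <http://www.udfr.org/onto#> .
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .   # prefix name assumed

# Option 1: the term is an individual, i.e. an instance of udfrs:GenreFacetType.
pcdmfmt:Archive a owl:NamedIndividual, udfrs:GenreFacetType .

# Option 2 (alternative): the term is itself a class that specializes
# udfrs:GenreFacetType.
pcdmfmt:Archive a rdfs:Class ;
    rdfs:subClassOf udfrs:GenreFacetType .
```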
@escowles oof. I totally misinterpreted that :grimacing:
As currently defined, these are definitely individuals. For instance, see the OWL guide. @escowles is correct in his example above.
However, when I look at other similar vocabularies (e.g. DCMIType), these sorts of entities are defined as classes. So I'm somewhat inclined to follow that pattern (though there may be another pattern suggesting otherwise).
In terms of using this vocabulary (as it currently stands), am I correct that one might express this:
<my-resource> a pcdm:File, pcdmuse:OriginalFile ;
dc:type pcdmformat:Dataset .
as opposed to this:
<my-resource> a pcdm:File, pcdmuse:OriginalFile, pcdmformat:Dataset .
My understanding of the Class vs. Individual distinction is that an Individual is one particular thing. For example, lit:JohnMilton or planet:Jupiter, as opposed to a "generic type": lit:Author or planet:Jovian. And so it would follow that e.g. a Dataset is a type of thing (i.e. a rdfs:Class) rather than a particular thing (owl:NamedIndividual). However, one could also argue that a Dataset is a particular GenreFormat (and hence an owl:NamedIndividual rather than an rdfs:Class).
That is to say, I could go either way but am inclined toward defining them as classes because it seems other similar vocabs do that. Do @azaroth42 or @barmintor have an opinion?
I just checked the other vocabs, and DCMIType, MARC Resources, Nepomuk and Pronom define their terms as classes, and AAT and UDFRS define them as individuals.
I agree with Aaron: I could go either way, but these terms do seem more like categories, so maybe converting them to rdfs:Classes makes more sense.
I've updated this PR to make the entities RDFS Classes instead of named individuals.
:+1:
:+1: (non-binding)
I'm not sure about this. If you get into the notion/category debate, I think you're getting into an overly expansive ontology of class. I'd ask, for example, whether you expect these Things to be the object of rdf:type, or of (for example) dc:format. EDIT: I see @acoburn is ahead of me!
That said, I'm very interested to see what Rob's opinion is.
@barmintor I was definitely thinking these would be used with dc:format. Does that argue against defining them as classes? The DCMIType terms are defined as classes, which seem like the canonical terms to use with dc:format: http://dublincore.org/2012/06/14/dctype.rdf
FWIW, I was also planning to use these with dc:format.
I'm not digging in my heels, I only want to make sure we're not conflating semantic contexts here. If we're going to follow the DC practice here, we should probably remove the <rdfs:subClassOf rdf:resource="http://www.udfr.org/onto#GenreFacetType" /> statements.
pings @azaroth42
:-1: to both making them classes and using dc:format.
If the pattern is:
_:x a pcdm-ext:Archive ;
Then I'm okay with a class. But having classes as the object of dc:format seems very strange. What would the instances of the class be?
@azaroth42 I am reading this as "-1 to making them classes while using dc:format", and not "-1 to classes; -1 to dc:format". Is that correct?
Yes...
:-1: to ?x dc:format ?y . ?y a rdfs:Class .
But I'm fine with either ?x a ?y .
or ?x dc[terms]:format ?y .
Happy to hear arguments as to why it should be a class though?
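To make the three patterns concrete, here is a rough Turtle sketch. The ex: resources and the pcdmfmt prefix name are hypothetical, and the choice of terms is only illustrative.

```turtle
@prefix dc:      <http://purl.org/dc/elements/1.1/> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .   # prefix name assumed
@prefix ex:      <http://example.org/> .                  # hypothetical resources

# Fine: ?x a ?y, where the term is a class used with rdf:type.
pcdmfmt:Archive a rdfs:Class .
ex:file1 a pcdmfmt:Archive .

# Fine: ?x dc:format ?y, where the term is not declared as a class.
ex:file2 dc:format pcdmfmt:Video .

# The combination being objected to (left commented out):
# ex:file3 dc:format pcdmfmt:Archive .   # object of dc:format is an rdfs:Class
```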
@azaroth42: I think instances of the classes would be fully-specified file formats (e.g. TIFF 6.0 would be an instance of #RasterImage). I think the vocabs we're referencing here are split between whether their terms are classes or individuals, though DC/DCMIType definitely envisions using dc:format with DCMIType classes.
Some worked examples might help?
@azaroth42: I would expect the typical use to be something like:
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ebucore: <http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .
@prefix pcdm: <http://pcdm.org/models#> .
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .
@prefix pcdmuse: <http://pcdm.org/use#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
</object1/files/file1> a pcdm:File, ldp:NonRDFSource, pcdmuse:ServiceFile;
dc:format pcdmfmt:Video;
ebucore:fileSize "12345678"^^xsd:long;
ebucore:filename "movie.mp4" .
</object1/files/file2> a pcdm:File, ldp:NonRDFSource, pcdmuse:ThumbnailImage;
dc:format pcdmfmt:RasterImage;
ebucore:fileSize "5678"^^xsd:long;
ebucore:filename "thumbnail.jpg" .
</object1/files/file3> a pcdm:File, ldp:NonRDFSource, pcdmuse:ExtractedText;
dc:format pcdmfmt:UnstructuredText;
ebucore:fileSize "1234"^^xsd:long;
ebucore:filename "fulltext.txt" .
</object1/files/file4> a pcdm:File, ldp:NonRDFSource, pcdmuse:Transcript;
dc:format pcdmfmt:HTML;
ebucore:fileSize "1234"^^xsd:long;
ebucore:filename "transcript.html" .
But you could also define individuals if you wanted to record a specific format for some reason:
</object1/files/file5> a pcdm:File, ldp:NonRDFSource, pcdmuse:OriginalFile;
dc:format pcdmfmt:Video, </formats/VideoFormat1>;
ebucore:fileSize "123456789"^^xsd:long;
ebucore:filename "movie.vid" .
</formats/VideoFormat1> a pcdmfmt:Video;
dc:title "Video Format #1" .
> But having classes as the object of dc:format seems very strange.
Yes.
If I'm understanding the concern correctly, I think it comes down to: "This thing is a this format" vs. "This thing is of this format", which is a pretty subtle distinction.
To me, an instance of a postcard is not an instance of a format. It might be a resource that has characteristics in common with other things that are also of this dc:format, but, from a practical perspective, you can only use that fact to contextualize it among other resources (based on their dc:formats) or maybe trigger certain behaviors in an application. You can't use the object of dct:format to constrain, e.g., the rdfs:range or rdfs:domain of a resource, so what does making it a class get us? If anything, by not making the object of dct:format a class, the distinction between rdf:type and our intentions for dct:format becomes clearer.
Thanks for the example @escowles! I'm still :-1: to using both classes and instances of those classes as the object of dc:format. The video File doesn't have a format which is the class of video formats ... it has a particular format. The video File (OTOH) is a Video. Having a class for use (which is context specific) but a mix of class and instance for format (which is not context specific) doesn't fill me with happiness.
I agree with @jpstroop: The video file is-a Video. It is-in-a/has-a format, which is-a Format.
I think I understand the reasoning for using individuals instead of classes here: dc:format should point to a concrete instance not a class, and using classes will lead to confusion with the File Use Vocab.
I'm happy to revert to using udfrs:GenreFacetType instead of rdfs:Class. But the existing rdfs:subClassOf statements should probably be changed to something else: skos:broader makes the most sense to me, given the skos:exactMatch/skos:closeMatch we're already using.
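As a rough sketch of what that change could look like for one term: the prefix name pcdmfmt and the parent term pcdmfmt:StructuredText are assumptions made for illustration, not taken from the vocabulary itself.

```turtle
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix udfrs:   <http://www.udfr.org/onto#> .
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .   # prefix name assumed

# The term goes back to being an instance of udfrs:GenreFacetType, and the
# hierarchy link is expressed with skos:broader instead of rdfs:subClassOf.
pcdmfmt:HTML a udfrs:GenreFacetType ;
    skos:broader pcdmfmt:StructuredText .   # hypothetical parent term
```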
@escowles :+1: -- I like the use of skos:broader
I'm gonna tag some new committers to see if we can get some movement on this:
@daniel-dgi @no-reply @kestlund
@escowles @ruebot Catching up on this discussion... :+1: to 'skos:broader' ; I had been indifferent to 'udfrs:GenreFacetType' but if it resolves the arguments, then it certainly is worth keeping.
Are there any other outstanding issues or just looking for additional consensus?
:+1: for skos semantic relations, but I also think it would be correct to state that every instance is also of rdf:type skos:Concept, since skos:broader, skos:exactMatch, etc. work on skos:Concept individuals.
Lastly, just a functional idea (wish), it could be useful to add an owl:imports for udfrs. It's a practical need when using pcdm ontologies in applications like protege. No need to import skos because udfrs already does this.
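A minimal sketch of such an import declaration, assuming the vocabulary's ontology IRI is http://pcdm.org/file-format-types and that the UDFR ontology can be imported from the onto.rdf location linked earlier in this thread (both are assumptions):

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Ontology header; the subject IRI is an assumption about how the
# vocabulary identifies itself.
<http://pcdm.org/file-format-types> a owl:Ontology ;
    owl:imports <http://udfr.org/onto/onto.rdf> .   # import location assumed
```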
@kestlund I think we're just making sure we've got consensus here.
@DiegoPino: I agree it would be good to define the terms as skos:Concepts, since we're using the SKOS predicates to link them. I'm not sure about importing UDFRS -- is that just for the udfrs:GenreFacetType definition?
@escowles the idea of importing is just functional. We are creating individuals from classes defined in an external ontology, so I thought it might be a good idea to import them. But don't worry, it's just a wish based on one of my personal use cases and may be out of scope (so I don't intend to add this topic to this particular conversation):
Personal use case:
I have been trying to understand and deal with the strange/modal (strange for me; I'm sure there is a need, but I'm not clearly aware of it) mix in PCDM of the rdfs and owl worlds, and doing some local research using Protege to see how well all those different ontologies + ldp + PCDM play together. I have seen some comments here in the issues about owl being a complicated beast to handle, but I still see some parts of owl being used (that's the modal part). My own experience is the opposite (like the beautiful idea of having ObjectProperty and DatatypeProperty as different properties), and since I also don't fully understand how jumping from rdf to owl affects this, I usually pass PCDM ontologies through Protege. That said, without imports, testing becomes very complicated.
I've added rdf:type statements to the terms to make them skos:Concepts, and rebased to squash and resolve conflicts with the updated rdfs2html stylesheet.
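For anyone skimming, each term should now carry both types, roughly like the following sketch (the prefix name is assumed, and other properties on the term are omitted):

```turtle
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix udfrs:   <http://www.udfr.org/onto#> .
@prefix pcdmfmt: <http://pcdm.org/file-format-types#> .   # prefix name assumed

# An individual of udfrs:GenreFacetType that is also a skos:Concept, so the
# SKOS linking predicates apply to it.
pcdmfmt:Video a udfrs:GenreFacetType, skos:Concept .
```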
@DiegoPino: I haven't included an owl:imports declaration, since that seems like a separate issue to me. Can you create another ticket for that? It seems like there is a broader discussion of OWL/RDFS, compatibility with tools, etc. that we should have.
@escowles: thanks, don't worry about the owl:imports; it's just good practice when creating new individuals from externally defined classes. But I will create a new ticket for that, because I'm having some issues with this strange (strange for me… long discussion) mix of owl and rdfs when trying to validate and interoperate with PCDM + LDP in Protege.
@duraspace/pdcm-committers shall we review/vote on this again, since we have new commits from @escowles?
@duraspace/pdcm-committers bump :sweat:
+1
:ok_hand: ... This isn't how I would do it, but as I'm not doing it and it's not core, I have no technical objections.
FWIW, the approach that I have seen taken most often is to use classes and rdf:type, such as:
:+1:
Fixes #26