ResearchObject / ro-crate

Research Object Crate
https://w3id.org/ro/crate/
Apache License 2.0

Need to clarify what should be in profile crate for additional terms #258

Closed ptsefton closed 1 month ago

ptsefton commented 1 year ago

Describe the bug

The profile crate does not work the way I was expecting. This may not be a bug, but I want to clarify it.

In the crate-focus section, the additional schema.org vocab followed the Schema.org convention of using rdfs:Class (rather than schema:Class).

    {
      "@id": "txc:Annotation",
      "@type": "rdfs:Class",
      "name": "Annotation",
      "sameAs": "http://www.language-archives.org/REC/type-20020628.html#annotation",
      "rdfs:comment": "The resource includes information which annotates some other linguistic record.",
      "rdfs:label": "Annotation",
      "rdfs:subClassOf": {
        "@id": "schema:CreativeWork"
      }
    }

In the profile, @stain, you have done a few things I was not expecting:

URL Profile RO-Crate Metadata Document

Suggested fix

I would suggest that:


stain commented 1 year ago

The reason for adding DefinedTerm to the @type array was to unify a bit with the ad-hoc classes as well as how terms are imported (which may also be used beyond Class/Property, e.g. for roles). The idea being that any term imported is a DefinedTerm but could also be something else, in which case it should (at least) have the name from the Schema.org type.

The move to then use schema.org's types Class and Property, which mirror rdfs:Class and rdf:Property, is perhaps more controversial, as these are not commonly used for defining ontologies. As you point out, they are not even used by Schema.org's own definitions, which are RDFS only. I also don't know of any tooling that supports them. Perhaps they are sensible when only "quoting" a class or property that is defined elsewhere, but not when defining classes/properties directly within the profile.

In FAIR-IMPACT we have proposed to evolve the RO-Crate Profile to also work for any Semantic Artefact gathering - in which case it would IMHO stay in soft schema.org land because the artefacts would be fully defined in various ontology languages (e.g. SKOS, OWL2) and imposing rdfs:* would then be too strong. @dgarijo may have different views.

So are you, @ptsefton, proposing to change or remove the new text in https://www.researchobject.org/ro-crate/1.2-DRAFT/appendix/jsonld.html#add-local-definitions-of-ad-hoc-terms (which suggests rdfs:* as an optional add-on) from #232, reverting to requiring rdfs:* as in RO-Crate 1.1?

ptsefton commented 1 year ago

@stain -- I'm not suggesting anything, still trying to get my head around this issue. At the moment we recommend that additional Classes and Properties use the /rdfs?:/ prefix, and this is coded into the JavaScript library and used for looking up term definitions when they are present in the crate. We are also maintaining a schema (vocabulary) for a major project using the Schema.org Style Schema method (SOSS), where the terms are rdfs:Class, rdf:Property, (schema:)DefinedTerm and (schema:)DefinedTermSet. If we change to using (schema:)Class in profiles then I'd want to go back and change all that (and still support the use of /rdfs?:/).
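To make the lookup described above concrete, here is a minimal Python sketch (not the actual JavaScript library code) of finding term definitions in a crate's @graph by matching types against the /rdfs?:/ prefix. The helper names and the `txc:annotates` property are hypothetical, introduced only for illustration.

```python
import re

# Assumption: term definitions are recognised by an "rdf:" or "rdfs:" prefixed @type.
RDF_PREFIX = re.compile(r"^rdfs?:")


def as_list(value):
    """Normalise a JSON-LD value that may be missing, a scalar or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_term_definitions(graph):
    """Return {@id: entity} for every entity with an rdf:/rdfs: prefixed type."""
    return {
        entity["@id"]: entity
        for entity in graph
        if any(RDF_PREFIX.match(t) for t in as_list(entity.get("@type")))
    }


# Illustrative graph; "txc:annotates" is a made-up property.
graph = [
    {"@id": "txc:Annotation", "@type": "rdfs:Class", "rdfs:label": "Annotation"},
    {"@id": "txc:annotates", "@type": ["rdf:Property", "DefinedTerm"], "rdfs:label": "annotates"},
    {"@id": "./", "@type": "Dataset", "name": "Example crate"},
]

definitions = find_term_definitions(graph)
print(sorted(definitions))
```

An entity typed only as (schema:)Class would not be picked up by this pattern, which is why a change of convention would mean revisiting such code.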

I understand the idea of DefinedTerm but when we know something is a class or a property, isn't it better to bring it into the schema.org world by defining it in a way that tools CAN operate on it as part of a class hierarchy -- we are starting to build more of these in our work.

stain commented 1 year ago

Then I think some middle ground is to again encourage use of rdfs:Class and rdf:Property when defining terms there-and-then in a crate (allowing general RDFS tooling to work, class hierarchies etc).

Then we can keep using schema.org's informal DefinedTerm when quoting something already defined elsewhere (without representing its hierarchy etc) - in which case we should say where, with url and inDefinedTermSet etc! See for instance https://trefx.uk/trusted-wfrun-crate/0.3/ro-crate-preview.html#https%3A//w3id.org/shp%23CheckValue that quotes https://w3id.org/shp#CheckValue without re-declaring all its superclasses etc.
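As a rough illustration of what inline rdfs:Class definitions make possible, the following Python sketch walks rdfs:subClassOf links inside a crate's @graph. It is an assumption-laden example rather than any existing tool: the function names and the `txc:Gloss` class are hypothetical, and only classes declared in the graph itself are followed.

```python
def as_list(value):
    """Normalise a JSON-LD value that may be missing, a scalar or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def superclasses(graph, class_id):
    """Collect ancestors of class_id by following rdfs:subClassOf within the graph."""
    by_id = {entity["@id"]: entity for entity in graph}
    seen = []
    stack = [class_id]
    while stack:
        current = stack.pop()
        # Classes defined outside the crate have no entity here, so the walk stops.
        entity = by_id.get(current, {})
        for ref in as_list(entity.get("rdfs:subClassOf")):
            parent = ref["@id"]
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


# "txc:Gloss" is a made-up subclass used only for this example.
graph = [
    {
        "@id": "txc:Annotation",
        "@type": "rdfs:Class",
        "rdfs:subClassOf": {"@id": "schema:CreativeWork"},
    },
    {
        "@id": "txc:Gloss",
        "@type": "rdfs:Class",
        "rdfs:subClassOf": {"@id": "txc:Annotation"},
    },
]

print(superclasses(graph, "txc:Gloss"))
```

A term quoted only as a DefinedTerm carries no rdfs:subClassOf, so a walk like this would simply return an empty list for it.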

(I think I'll add to our profile docs that you use the inverse inDefinedTermSet when quoting only SOME of the terms from a defined-elsewhere termset, or the hierarchical hasDefinedTerm when they are all defined there in the profile.)
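A small Python sketch of the two directions mentioned above, building the JSON-LD entities as plain dicts. The helper names are hypothetical, and using https://w3id.org/shp as the term set identifier is an assumption made for the example, not something the SHP vocabulary confirms.

```python
import json


def quote_term(term_url, name, termset_url):
    """Quote a term defined elsewhere, pointing back at its term set (inverse link)."""
    return {
        "@id": term_url,
        "@type": "DefinedTerm",
        "name": name,
        "url": term_url,
        "inDefinedTermSet": {"@id": termset_url},
    }


def define_termset(termset_id, name, terms):
    """Declare a term set whose terms are all defined in this profile (hierarchical link)."""
    return {
        "@id": termset_id,
        "@type": "DefinedTermSet",
        "name": name,
        "hasDefinedTerm": [{"@id": term["@id"]} for term in terms],
    }


# Quoting only SOME terms of an external term set: each term carries inDefinedTermSet.
check_value = quote_term(
    "https://w3id.org/shp#CheckValue",
    "CheckValue",
    "https://w3id.org/shp",  # assumed term set identifier
)

# Defining ALL terms locally: the term set lists them with hasDefinedTerm.
# "#local-terms" and "#my-term" are made-up identifiers.
local_term = {"@id": "#my-term", "@type": "DefinedTerm", "name": "my-term"}
local_set = define_termset("#local-terms", "Local terms", [local_term])

print(json.dumps([check_value, local_set], indent=2))
```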

This would however not permit your tooling to know how to apply these new terms, as it would not know if they are classes, properties or something else -- not without parsing the defining ontology, which we know from OWL imports can be a massive can of worms, and which can use a myriad of different ontology standards and formats (defining those explicitly is what I would want to add in the FAIR-IMPACT Semantic Artefact Crate).

This was my reasoning for adding the lightweight Class and Property more as indicators: they have no hierarchy, but formally they are what schema.org's domainIncludes and rangeIncludes expect, rather than the rdfs variants. So schema.org's internal definition is inconsistent unless you consider these to be equivalent to their rdfs counterparts.

With the loosening in #260, a known schema.org type is no longer required, so we can have just rdfs:Class and rdf:Property when defining inline.

BTW, the rdfs namespace is sadly also without a human-readable variant, so http://www.w3.org/2000/01/rdf-schema#Class just gives a Turtle file -- we should therefore define these in the Crate profile, in which case they would have to be declared as schema.org Class-es by my logic above! ;-D

stain commented 4 months ago

Generally follow Crate-O's mode file for doing Schema.org-style schemas:

Avoid too many types.

stain commented 1 month ago

This is now implemented in #262 etc. as part of the revamp of Profiles.