psychoinformatics-de / datalad-concepts

Flip ontology and schema class naming? #67

Closed mih closed 6 months ago

mih commented 6 months ago

The purpose of these classes is that schema classes add specification on data structure to concept classes -- for a particular schema.

This means that any type information ending up on data recorded (for example for type designator slots) will reference schema element classes (so something like DatasetSE).

This is not nice, because it indicates that this is a somehow special dataset.
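To make the concern concrete, here is a minimal sketch of a data record, assuming the type designator slot is called `meta_type` (the name used later in this thread). The `id` and `title` slots and their values are hypothetical and only there for illustration:

```yaml
# Hypothetical data record; only `meta_type` and `DatasetSE` come from this thread.
id: https://example.org/datasets/demo   # illustrative identifier
title: Demo dataset                     # illustrative slot
meta_type: DatasetSE                    # the recorded type carries the schema-element suffix
```

Anyone reading the record sees `DatasetSE` rather than a plain `Dataset`, which is what suggests a special kind of dataset.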

We should consider having a schema use simple class names for schema elements, and giving the ontology classes a suffix instead.

Those ontology classes are already mixins, so that could be a sensible suffix. However, in the schema this is somewhat redundant:

classes:
  Demo:
    mixins:
      - DemoMixin

Alternative ideas:

classes:
  Demo:
    mixins:
      - DemoConcept
      - DemoOC # (Ontology Class/Concept)
      - DemoS # (Semantics; might actually read nice, like a plural S)
      - DemoRD # (borrowed from (RD)F: resource description. Nice analogy to RDF-S also)
mslw commented 6 months ago

Initial thoughts below.

I actually have nothing against having a DemoMixin mixin. It clearly communicates its purpose (within LinkML).

Along similar lines, I think I like the spelled-out version the most (DemoConcept). I ran the names of a few existing classes in my mind, and it sounds ok to me.

Of the abbreviations, RD (Resource Description) probably wins? For alternatives, I searched around for "Concept" abbreviations and found CUI, used elsewhere as "Concept Unique Identifier", which kind-of sort-of fits (also, Italian / Latin - whom / to whom...). But in general, I like all abbreviated forms less, because without having spent time with them they make me stop and think.

jsheunis commented 6 months ago

I like DemoConcept the most, since it's the most intuitive for me and also connected to the repo name, i.e. concepts.

mih commented 6 months ago

My issue with Concept is the already overloaded meaning. It is also used to distinguish between general and versioned instances. For example a concept-DOI (or all-versions DOI).

We have DataladDataset as a class for the all-versions Concept of a dataset...

mih commented 6 months ago

Another idea would be to tackle it without inverting the naming scheme. Rather than DatasetSE we could say DatasetObj.

This term would only appear as a value in a meta_type slot of a data document. It kinda makes sense to declare what kind of object something is here.

With that, any schema could declare a DatasetObj variant, which would solve the original motivation (having an encoding and a decoding schema that use the same class names, but can be used to convert to a different data structure -- while preserving all semantics).
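A rough sketch of that idea in LinkML, not taken from the repository: two schemas declare a class with the same name and the same ontology mixin, but structure the data differently. The schema ids and names, the slots `id` and `parts`, and the mixin name `Dataset` are all assumptions made for illustration.

```yaml
# Sketch only. Schema ids/names and the slots `id` and `parts` are hypothetical;
# `Dataset` stands in for an ontology mixin class from an imported schema
# (imports and prefixes omitted for brevity).
---
id: https://example.org/schemas/nested
name: nested
slots:
  id:
    identifier: true
  meta_type:
    designates_type: true
  parts:
    range: DatasetObj
    multivalued: true
    inlined_as_list: true   # sub-datasets are embedded in the parent record
classes:
  DatasetObj:
    mixins:
      - Dataset
    slots: [id, meta_type, parts]
---
id: https://example.org/schemas/flat
name: flat
slots:
  id:
    identifier: true
  meta_type:
    designates_type: true
  parts:
    range: DatasetObj
    multivalued: true
    inlined: false          # sub-datasets are referenced by identifier only
classes:
  DatasetObj:
    mixins:
      - Dataset
    slots: [id, meta_type, parts]
```

In both cases a record would carry `meta_type: DatasetObj`, so the type information stays the same while the layout differs between the two schemas.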

mih commented 6 months ago

Maybe DemoRecord?

The point of these classes is to define structure based on a definition of content semantics. So they define a type of record. This would fit

meta_type: DatasetVersionRecord
mih commented 6 months ago

Or

DemoDocument

slightly longer than Record. But maybe

DemoDoc

as an abbreviation.

meta_type: DatasetVersionDoc
mih commented 6 months ago

Difficult decision. Object is most unlikely to collide with any concept.

One should consider that we have two more base classes in a schema (a container and an item in a container).

So if we go for Object, there would be a base class in schema_utils and it would be MetadataObject, and in a schema there would be

Analogous naming for other candidate terms.

mslw commented 6 months ago

Maybe also consider DemoBase, as in base class?

mih commented 6 months ago

But DemoBase won't be a base class. It will be a derived class, and possibly never be derived from.

mslw commented 6 months ago

Ah, sorry, DemoBase would be suitable for the ontology class (I got stuck with the flipped naming... now your three previous comments make more sense too :laughing: )