sfb1451 / crc-schema-draft

https://sfb1451.github.io/crc-schema-draft/

Slot reuse - consequences for documentation #1

Closed mslw closed 9 months ago

mslw commented 9 months ago

The slots available for a class can be defined in two ways: either directly inside the class definition (as "attributes"), or independently as slots (which are then added to a class, or potentially several classes, under "slots"). See the LinkML tutorial and FAQ.

When a slot is used in a class, it is possible to refine its meaning with slot_usage (see: slot usage), including giving it a class-specific description.

However, the documentation pages are generated for slots. The class page shows a table of slots, but their descriptions are truncated: either at the first sentence (actually at the first dot, so also at the first "e.g." or "i.e."), or after a certain number of characters, with the rest replaced by an ellipsis. This happens in the Markdown generation step, not in the HTML template!
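To make the truncation concrete, here is a minimal Python sketch of the behavior described above. It is not the LinkML implementation: the function name `shorten_description`, the 80-character limit, and the rule "cut at the first period followed by whitespace" are all assumptions chosen for illustration.

```python
import re


def shorten_description(text, max_chars=80):
    """Approximate the table-cell truncation described in this issue.

    Hypothetical helper, NOT the LinkML code: the limit and the exact
    cut rule are assumptions made for illustration only.
    """
    if not text:
        return ""
    # Folded YAML scalars may still carry newlines; work on one line.
    text = " ".join(text.split())
    # Cut at the first period followed by whitespace or end of string.
    # This also fires on abbreviations such as "e.g." and "i.e.".
    match = re.search(r"\.(\s|$)", text)
    if match and match.start() < max_chars:
        return text[: match.start() + 1]
    # No early period: cut after max_chars and mark the cut.
    if len(text) > max_chars:
        return text[:max_chars].rstrip() + "..."
    return text


if __name__ == "__main__":
    print(shorten_description(
        "General description of the dataset. It may summarize its purpose."
    ))
    print(shorten_description("Short site code, e.g. ABC or XYZ."))
```

The second call shows why abbreviations are a nuisance: the cell ends right after "e.g." and the actual example is lost.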

And here comes a potential inconvenience.

Defining slots as attributes

If we have a class that defines its slots as attributes, including a lengthy description (for a slot that is itself called description, sorry about the ambiguity):

classes:
  Dataset:
    attributes:
      description:
        slot_uri: schema:description
        required: true
        description: >-
          General description of the dataset. It may summarize its
          purpose, scope, content, and potential applications. If a
          long description needs to be split into paragraphs, each
          paragraph can be put into a dedicated column in this
          row. Language must be English.

we see the truncated description in the class page:

image

But the slot page (seen after clicking the slot name) lists the description in full (it is cut off here only because of how I took the screenshot).

image

So far so good.

Defining slots on their own

However, for a slot that is defined on its own and added to the class with a slot_usage override (e.g. because we want to use schema:description on several classes, with a more precise description for each one):

classes:
  Dataset:
    slots:
      - description
    slot_usage:
      description:
        required: true
        description: >-
          General description of the dataset. It may summarize its
          purpose, scope, content, and potential applications. If a
          long description needs to be split into paragraphs, each
          paragraph can be put into a dedicated column in this
          row. Language must be English.
slots:
  description:
    slot_uri: schema:description
    description: A description of the item.

... the class page would still show only the truncated description in the table (the full description can be found by expanding the "LinkML source" section, but that is hardly optimal):

image

... and link to a generic slot definition, which only shows "class modifies slot: yes".

image

You can see and play with the second example in the slot-reuse branch of this repo.
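For readers less familiar with slot_usage, the override in the second example can be sketched in plain Python. This is only an illustration with dictionaries mirroring the YAML above; the helper name `effective_slot` is hypothetical and the merge is a simplification of what LinkML does when it resolves a slot in the context of a class.

```python
# Plain-dict mirror of the second schema example (description shortened).
schema = {
    "slots": {
        "description": {
            "slot_uri": "schema:description",
            "description": "A description of the item.",
        }
    },
    "classes": {
        "Dataset": {
            "slots": ["description"],
            "slot_usage": {
                "description": {
                    "required": True,
                    "description": "General description of the dataset.",
                }
            },
        }
    },
}


def effective_slot(schema, class_name, slot_name):
    """Return the slot as seen from a class (hypothetical helper).

    Starts from the generic slot definition and lets the class's
    slot_usage entries override or extend it.
    """
    cls = schema["classes"][class_name]
    if slot_name not in cls.get("slots", []):
        raise KeyError(f"{class_name} does not use slot {slot_name}")
    merged = dict(schema["slots"][slot_name])  # copy the generic definition
    merged.update(cls.get("slot_usage", {}).get(slot_name, {}))
    return merged


if __name__ == "__main__":
    print(effective_slot(schema, "Dataset", "description"))
    print(schema["slots"]["description"])  # generic definition is unchanged
```

The class-specific view carries the refined description and `required: true`, while the generic slot keeps its short text. That generic definition is what the slot page documents, which is why the refined description does not show up there.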

At this moment I am not sure whether this behavior is modifiable (I would appreciate the ability to show the non-truncated description on the class page), and whether it is a problem for our imagined use case. Time will tell.

mslw commented 9 months ago

It seems it is modifiable.

The internals aren't hard to find: there's a Jinja template responsible for generating the Markdown, and it passes the table column content through enshorten, defined in the doc generator (docgen): https://github.com/search?q=repo%3Alinkml%2Flinkml%20enshorten&type=code

There is also a FAQ entry, "can I customize the markdown generation for my schema site": the answer is yes, custom Jinja templates can be placed in a templates directory.
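A custom template that skips the truncation still has to keep each description on a single table row. The Python sketch below shows what such a non-truncating cell amounts to. It is not taken from the LinkML templates, and the helper names `table_cell` and `slot_row` are hypothetical.

```python
def table_cell(text):
    """Prepare a full description for a Markdown table cell (hypothetical helper)."""
    if not text:
        return ""
    # A table row must stay on one line, so collapse newlines and runs of spaces.
    one_line = " ".join(text.split())
    # Escape pipes so they are not read as column separators.
    return one_line.replace("|", "\\|")


def slot_row(name, description):
    """Render one row of a two-column slot table without truncating."""
    return f"| {name} | {table_cell(description)} |"


if __name__ == "__main__":
    print("| Name | Description |")
    print("| --- | --- |")
    print(slot_row(
        "description",
        "General description of the dataset.\nLanguage must be English.",
    ))
```

In a custom Jinja template the equivalent change would presumably be to stop passing the description through enshorten, or to pass it through a gentler filter along these lines.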