what do you think of LinkML-OWL?

VladimirAlexiev commented 4 months ago

I quite relish YAML (I started the YAML-LD WG, which is part of the JSON-LD CG).

I like the idea of being able to write

-
  id: x:a
  label: foo
  definition: a foo is a foo
  definition_source:
  - Me

And getting that translated to

Ontology( <https://w3id.org/linkml/owl/tests>
    AnnotationAssertion( rdfs:label x:a "foo" )
    AnnotationAssertion(
        Annotation( dcterms:source "Me" )
        <http://purl.obolibrary.org/obo/IAO_0000115> x:a "a foo is a foo"
    )
)
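
To make the intended translation concrete, here is a minimal Python sketch of how such a record could be turned into OWL functional-syntax annotation assertions. It is not how linkml-owl actually works: the `SLOT_TO_PROPERTY` table, the `record_to_functional` function and the `<slot>_source` naming rule are assumptions made for illustration only.

```python
# Hypothetical slot-to-property table; a real LinkML schema would carry this mapping.
SLOT_TO_PROPERTY = {
    "label": "rdfs:label",
    "definition": "<http://purl.obolibrary.org/obo/IAO_0000115>",
}


def quote(text: str) -> str:
    """Wrap a string as a quoted literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def record_to_functional(record: dict) -> list[str]:
    """Emit one AnnotationAssertion per known slot present in the record."""
    subject = record["id"]
    axioms = []
    for slot, prop in SLOT_TO_PROPERTY.items():
        if slot not in record:
            continue
        # Assumption: "<slot>_source" values become dcterms:source axiom annotations.
        sources = record.get(f"{slot}_source", [])
        nested = "".join(f"Annotation( dcterms:source {quote(s)} ) " for s in sources)
        axioms.append(
            f"AnnotationAssertion( {nested}{prop} {subject} {quote(record[slot])} )"
        )
    return axioms


record = {
    "id": "x:a",
    "label": "foo",
    "definition": "a foo is a foo",
    "definition_source": ["Me"],
}
axioms = record_to_functional(record)
```

The resulting `axioms` list holds the two assertions from the example above, each on a single line; wrapping them in `Ontology( ... )` is left out.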

johanwk commented 3 months ago

This is completely new to me! It looks interesting and nicely compact.

Do you think this is a format that would fit well with ELOT? It would certainly be possible to output LinkML-OWL instead of OMN. However, I'm thinking that you would perhaps prefer to allow for YAML to be added directly to org sections.

(I should mention that, for a templating mechanism to build ontologies, I have myself a strong preference for OTTR, at http://ottr.xyz. That language and tool have the great advantage of allowing for complex modelling patterns to be composed of simple patterns, in a way that can be nicely checked. Update: I now saw https://linkml.io/linkml-owl/comparison/)

Just to throw out one possible idea: In an ELOT file, one would write something like the following to match your example above.

*** foo (x:a)
 - obo:IAO_0000115 :: a foo is a foo

So, the ontology resource (maybe a class) is entered as an org-mode section -- and for convenience, you can write the rdfs:label first, then put the puri in parentheses. (BTW, I noticed that LinkML talks about CURIEs, which is perhaps a better term than "puri".)
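
The heading convention (label first, then the CURIE in parentheses) can be sketched with a small parser. ELOT itself is written in Elisp, so the Python below is only an illustration; the `HEADING` pattern and the `parse_heading` function are hypothetical names, not part of ELOT.

```python
import re

# One or more stars, the label, then the CURIE in trailing parentheses.
HEADING = re.compile(r"^\*+\s+(?P<label>.+?)\s+\((?P<curie>[^\s()]+)\)\s*$")


def parse_heading(line: str):
    """Return (label, curie) for a heading like '*** foo (x:a)', else None."""
    match = HEADING.match(line)
    if match is None:
        return None
    return match.group("label"), match.group("curie")


parsed = parse_heading("*** foo (x:a)")
```

A heading without a parenthesised identifier simply returns `None`, so ordinary documentation sections would be left alone under this assumption.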

There's no support in ELOT currently for using more readable strings in the description list of annotations, so obo:IAO_0000115 has to be used, where it would be more convenient to be able to define a synonym and write

 - definition :: a foo is a foo

I think a way to allow such synonyms could be added without much trouble.
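
As a rough sketch of what such synonym support might look like (in Python, purely for illustration; the `TERM_SYNONYMS` table and `expand_entry` function are assumptions, not existing ELOT features):

```python
# Hypothetical synonym table: readable term -> annotation property CURIE.
TERM_SYNONYMS = {
    "definition": "obo:IAO_0000115",
}


def expand_entry(line: str):
    """Split '- term :: value' and replace a known synonym with its CURIE."""
    term, separator, value = line.strip().lstrip("-").partition("::")
    if not separator:
        raise ValueError(f"not a description list entry: {line!r}")
    term = term.strip()
    # Unknown terms (e.g. a CURIE written out in full) pass through unchanged.
    return TERM_SYNONYMS.get(term, term), value.strip()


entry = expand_entry(" - definition :: a foo is a foo")
```

Entries that already use the full CURIE pass through untouched, so both spellings could coexist in one file.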

johanwk commented 3 months ago

More importantly, with ELOT in its current state, any axiom you wish to add to a resource has to be added in an explicit OMN source block. This is good in the sense that it's very explicit, but it definitely has a downside, in that you have to write everything out in full, including making sure, manually, that you keep the org section header in sync with the OMN block.

This LinkML example is very compact (link).

-
  id: x:a
  part_of:
  - x:b

In ELOT, the part_of statement would need an OMN block to be added into the org section:

*** foo (x:a)
 - obo:IAO_0000115 :: a foo is a foo
#+begin_src omn
Class: x:a
  SubClassOf: obo:BFO_0000050 some x:b
#+end_src

Maybe we could make use of LinkML and instead do like the following, which is way more attractive.

*** foo (x:a)
 - definition :: a foo is a foo
 - part_of :: x:b

Is it clear what I mean here? I.e., the translation mechanism from the org description list into OWL output would directly use a LinkML schema (please tell me if I'm using "schema" incorrectly).

Adding LinkML as an option would not necessarily mean OMN blocks can't still be used, for anything that doesn't have a LinkML template. However, it looks to me like the linkml-data2owl program currently only outputs Turtle, so there might be something to think about there.

The Elisp functions I made for ELOT (long ago..) allow for using the org-mode editing facilities in a nice way. Org is also great for tracking progress, including with org-ql.

Use the org outliner for the hierarchies of classes and properties. This is really convenient! Use TODOs and more! (The downside is that if a section needs more than one superclass/superproperty, you need to add OMN explicitly, but in practice this is not needed very often.)
Use org description lists efficiently. You can have markers like [X] or [ ] to keep track of whether an entry is complete, for example.
Add any kind of additional documentation, including diagrams with rdfpuml.

If the need for OMN blocks could be avoided with a templating mechanism for "magic" description list entries, the whole tool would be quite a bit more user friendly.

johanwk commented 3 months ago

Having thought about this overnight, I'm inclined to think that allowing for OMN syntax in a description list that mimics the Protégé interface could work well.

*** foo (x:a)
 - obo:IAO_0000115 :: a foo is a foo
 - SubClassOf :: obo:BFO_0000050 some x:b

I.e., when the description term is "SubClassOf" (or "EquivalentTo", "DisjointWith", etc.), then ELOT will output what would otherwise be contained in a separate OMN block.
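
A minimal sketch of that dispatch, written in Python for illustration rather than in ELOT's Elisp: entries whose term is an OMN keyword become axiom lines, everything else becomes an annotation. The `OMN_KEYWORDS` tuple and `class_frame` function are hypothetical, and the frame type is hard-coded to `Class` as a simplifying assumption.

```python
# Terms treated as OMN frame keywords rather than annotation properties.
OMN_KEYWORDS = ("SubClassOf", "EquivalentTo", "DisjointWith")


def class_frame(curie: str, label: str, entries: list) -> str:
    """Build an OMN Class frame from (term, value) description-list pairs."""
    annotations = [f'rdfs:label "{label}"']
    axioms = []
    for term, value in entries:
        if term in OMN_KEYWORDS:
            axioms.append(f"    {term}: {value}")
        else:
            annotations.append(f'{term} "{value}"')
    lines = [f"Class: {curie}"]
    # Annotations are comma-separated inside one Annotations: section.
    lines.append("    Annotations: " + ",\n        ".join(annotations))
    lines.extend(axioms)
    return "\n".join(lines)


frame = class_frame(
    "x:a",
    "foo",
    [
        ("obo:IAO_0000115", "a foo is a foo"),
        ("SubClassOf", "obo:BFO_0000050 some x:b"),
    ],
)
```

The sketch does not escape quotes inside annotation values, which a real implementation would need to handle.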

This is not to dismiss the LinkML idea, which would have various advantages, but I'm thinking this minor improvement would make sense to implement first.

VladimirAlexiev commented 3 months ago

Do you think this is a format that would fit well with ELOT?

It depends on your vision for ELOT.

If you want to base it on YAML then that's closer to LinkML
If you want to base it on Orgmode (and the last example above is not YAML, it's an Orgmode definition list) then that's farther from LinkML

I do love Orgmode but I find it hard to get collaborators to use it since emacs requires a good investment of time, and all people already have their favorite IDE (eg VSCode is excellent ... all the way to VIM). Whereas with YAML, and either YAML-LD or LinkML generation tooling, I think you can convince a bigger audience to write their ontologies that way.

But that would be a complete overhaul of your idea... I don't want to be disruptive, so this issue is just to provoke thinking.

strong preference for OTTR

Templates are great for the TBox/ABox writer, but not so great for the consumer. Correct me if I'm wrong, but it's not so easy to know and consume only the "template interface" triples.

LinkML schema (please tell me if I'm using "schema" incorrectly).

The LinkML schema specifies:

What definition, part_of etc mean, i.e. what to map them to. The same can be done with a JSON-LD context (which I prefer to write in YAML)
What attributes/slots are required/expected for each node/class. The same can be done with SHACL/ShEx.

any axiom you wish to add to a resource has to be added in an explicit OMN source block. This is good in the sense that it's very explicit, but it definitely has a downside, in that you have to write everything out in full

OMN blocks can still be used, for anything that doesn't have a LinkML template.

I like this stance!

Use YAML for the "20% of features that are used 80% of the time"
Escape into Manchester for the remaining 80% of features, which are more advanced and rarely used
- BTW we can embed Manchester in YAML by using tags and (if needed) text blocks
- Note: YAML-LD has datatype tags, eg dct:created: !xsd!date 2024-03-11

linkml-data2owl program currently only outputs Turtle, so there might be something to think about there.

I guess we can use a Turtle->Manchester converter, but have to check that the output is still "nice" in terms of completeness, ordering...
Or we can interpret YAML-LD to RDF by using a JSON-LD context, then do extra transformations
Or we can write our own output direct to Manchester.

johanwk commented 3 months ago

No reason to hold back -- I really welcome this chance to discuss a "complete overhaul"!

The examples I wrote, with ***, are all to indicate how LinkML could be used as a target with org-mode as source. It would be possible to pick up uses of strings defined with LinkML, and have them mixed into org-mode description lists. But in that case, the LinkML schema would have to be authored separately.

Indeed, the tool I have been using (now named ELOT) is Emacs-only, and I agree the threshold to learning Emacs is quite high. But, my aim so far has been just to enable org-mode as a format for ontology authoring. I have seen it as essential that the exporting features of org-mode become available, as well as the flexibility of the org-babel notebook, throwing in bits of any programming language for documentation or generating ontology. I have sometimes included tables of data to be processed with OTTR to generate individuals, that works well too. So, org-mode provides this great environment to work in.

Then again, repeating my motivations can only go so far, and your point about collaboration is super important. A great maxim is, "kill your darlings". I need to think about this for a bit.

But, maybe you could share your opinion on the following claim: That an outliner is very important for ontology authoring (yes, mainly thinking subclass or subproperty hierarchy). Would this sit well with the YAML format? E.g., when you have

-
  id: x:a
  subclass_of:
  - x:b

is it possible to proceed with something like the following?

-
  id: x:a
  subclass_of:
  - x:b
    subclass_of:
    - x:c

This is meaningless, right? I mean, since there's no id: to the left of the - x:b.
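
One way to picture a workable variant: if each nested list item were itself a mapping with its own id:, the outline could be flattened mechanically into flat records. The Python sketch below works on plain dicts, as a YAML parser would produce them. Whether LinkML or linkml-owl accepts such a nested shape is not established here; the `flatten` function and the nested layout are assumptions for illustration.

```python
def flatten(node: dict, out: list = None) -> list:
    """Turn a nested subclass_of outline into flat {id, subclass_of} records."""
    if out is None:
        out = []
    record = {"id": node["id"]}
    parents = node.get("subclass_of", [])
    if parents:
        # A parent may be a bare CURIE or a nested mapping with its own id.
        record["subclass_of"] = [p["id"] if isinstance(p, dict) else p for p in parents]
    out.append(record)
    for parent in parents:
        if isinstance(parent, dict):
            flatten(parent, out)
    return out


# Assumed nested form: the inner item carries an explicit id.
nested = {
    "id": "x:a",
    "subclass_of": [
        {"id": "x:b", "subclass_of": ["x:c"]},
    ],
}
flat = flatten(nested)
```

Here `flat` contains one record for x:a pointing at x:b and one for x:b pointing at x:c, which is the flat shape shown in the first example.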

But maybe something like this could be defined and work in LinkML. And, maybe it would be possible to create a setup where one could author an ontology with just YAML, then say "let's switch to org-mode" and add diagrams and text and so forth. Some resources come to mind.

https://orgmode.org/manual/Include-Files.html (very basic)
write code that is like #+include, but picks up LinkML fragments and translates them into org on export to make a nice document
org-babel-detangle (useful, but not very reliable, see here)
https://github.com/nobiot/org-transclusion

None of these are proper answers, but I'm pretty sure it would be possible to use an org-mode document living beside a LinkML document (or documents), pulling in contents from there and documenting them.

johanwk commented 3 months ago

As of 10edb141559cdb2b66332715a4a20499b914f8bb, I've added support for OMN restrictions in org description lists. This means that OMN source blocks will be needed much more rarely.

Example from elot-template.org, declaring a class ex:C1:

*** I'm C1 (ex:C1)
 - rdfs:comment :: The class label may be entered as a heading, with
   the URI in parentheses.
 - rdfs:seeAlso :: rdl:sdf
   - rdfs:comment :: lksjd
 - SubClassOf :: ex:r only ex:C2
   - rdfs:comment :: describe the restriction
 - SubClassOf :: ex:C3
 - SubClassOf :: ex:C4

What's good about this is,

OMN restrictions can be entered alongside the annotations, no need for omn blocks
it's simple to annotate the restrictions
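
To illustrate how nested entries could map onto annotated axioms, here is a small Python sketch. It is not the Elisp code from the commit: `ITEM`, `parse_entries` and `render_axiom` are hypothetical names, and the parser assumes only two levels of nesting and whitespace-indented continuation lines.

```python
import re

ITEM = re.compile(r"^(?P<indent>\s*)-\s+(?P<term>\S+)\s+::\s+(?P<value>.*)$")

SAMPLE = """\
 - rdfs:comment :: The class label may be entered as a heading, with
   the URI in parentheses.
 - rdfs:seeAlso :: rdl:sdf
   - rdfs:comment :: lksjd
 - SubClassOf :: ex:r only ex:C2
   - rdfs:comment :: describe the restriction
 - SubClassOf :: ex:C3
 - SubClassOf :: ex:C4
"""


def parse_entries(text: str) -> list:
    """Parse a description list; deeper-indented items annotate the previous top-level item."""
    entries, top_indent, current = [], None, None
    for raw in text.splitlines():
        match = ITEM.match(raw)
        if match is None:
            # Continuation line: append to the most recent item's value.
            if current is not None and raw.strip():
                current["value"] += " " + raw.strip()
            continue
        indent = len(match.group("indent"))
        item = {"term": match.group("term"), "value": match.group("value").strip(), "annotations": []}
        if top_indent is None:
            top_indent = indent
        if indent <= top_indent or not entries:
            entries.append(item)
        else:
            entries[-1]["annotations"].append(item)
        current = item
    return entries


def render_axiom(entry: dict) -> str:
    """Render an OMN axiom line, placing nested entries in an Annotations: prefix."""
    notes = ", ".join(f'{a["term"]} "{a["value"]}"' for a in entry["annotations"])
    prefix = f"Annotations: {notes} " if notes else ""
    return f'{entry["term"]}: {prefix}{entry["value"]}'


entries = parse_entries(SAMPLE)
annotated = render_axiom(entries[2])
```

Under these assumptions, the third entry renders as a SubClassOf axiom carrying its rdfs:comment as an axiom annotation, while the unannotated SubClassOf entries render as plain axiom lines.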

johanwk commented 3 months ago

This change doesn't do anything towards allowing LinkML/YAML or OTTR "template" content. We can follow that up over time, but I think for now, it's more important to get the basics in place.

johanwk commented 2 months ago

I wonder whether we should (a) close this, or (b) do something with it.

Question: is it possible to get OMN output from LinkML? If so, then how about starting with the following: support for LinkML org-babel code blocks. These could be mixed in with other content, adding to the OMN output.

johanwk / elot

what do you think of LinkML-OWL? #10