INCATools / kgcl

Datamodel for KGCL (Knowledge Graph Change Language)
https://w3id.org/kgcl/
MIT License

Formally describe bundled and triggered changes (workflows) #70

Open cmungall opened 5 months ago

cmungall commented 5 months ago

KGCL has the concept of

Simple changes are atomic at the level of KGCL. E.g. kgcl:NodeMove could be broken down into an edge removal and an edge addition. However, NodeMove is still considered atomic/simple (it just so happens that there are potential rewrite rules for some change types). And of course at the RDF level this may involve many triples, but that doesn't concern us here.

Currently some KGCL implementations will trigger multiple changes from a single change. For example, in oak, running apply on an obsoletion will trigger removal of logical axioms and renaming, as per OBO best practice (which is not particularly formally represented) and similar to existing 'workflows' in Protege. It seems reasonable that we should try to formally represent this common workflow in KGCL.

In fact for obsoletions, the kgcl:Configuration class allows for the specification of an obsoletion policy, implemented as ObsoletionPolicyEnum - but there is no formal connection between the PVs and the logic.

Other ontologies may have more bespoke rules. See for example https://wiki.geneontology.org/Ontology_meeting_2024-04-08#Triggering_multiple_actions???

I don't think it makes sense to represent specific ontology rules in KGCL, but we may want some kind of mechanism for representing rules in general.

Currently the semantics of things like diff are not well defined here. I think when doing a diff we want a way to optionally "roll up" triggered changes. E.g. the renaming of "foo" to "obsolete foo" is not interesting in the way other renames are, and the same goes for the triggered edge deletions. Currently oak hardwires a rule that these are ignored in the diff, but this is not very satisfactory.

I propose we include a slot "triggers" or "triggered by" that could be used to better represent triggered changes at the instance level. This allows for a separation of concerns. Diff calculation could infer these given two ontologies and an obsoletion policy (for example). Diff reporting could simply report these according to user preference.
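To make the proposed slot concrete, here is a minimal sketch of how a "triggered by" link on change instances would let a diff report roll up triggered changes. The `Change` class, its field names, and the change ids are hypothetical simplifications for illustration, not the actual KGCL datamodel or the oak API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Change:
    """Hypothetical, simplified stand-in for a KGCL change instance."""
    id: str
    type: str                           # e.g. "NodeObsoletion", "NodeRename"
    about: str                          # CURIE of the node the change is about
    triggered_by: Optional[str] = None  # id of the triggering change, if any


def roll_up(changes: List[Change]) -> List[Change]:
    """Keep only the changes that were not triggered by another change."""
    return [c for c in changes if c.triggered_by is None]


def triggered_index(changes: List[Change]) -> Dict[str, List[str]]:
    """Map each triggering change id to the ids of the changes it triggered."""
    index: Dict[str, List[str]] = {}
    for c in changes:
        if c.triggered_by is not None:
            index.setdefault(c.triggered_by, []).append(c.id)
    return index


# An obsoletion (c1) that triggered a rename and an edge deletion,
# plus an unrelated, "interesting" rename (c4).
diff = [
    Change("c1", "NodeObsoletion", "EX:1"),
    Change("c2", "NodeRename", "EX:1", triggered_by="c1"),
    Change("c3", "EdgeDeletion", "EX:1", triggered_by="c1"),
    Change("c4", "NodeRename", "EX:2"),
]

print([c.id for c in roll_up(diff)])
print(triggered_index(diff))
```

With the link recorded on each instance, whether to show or hide c2 and c3 becomes a reporting preference rather than a rule hardwired into the diff calculation.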

We may also want to consider a rule language, e.g. to say "if a change of type X happens, and the instance x has value v, then trigger a change of type Y...". There are many interesting directions to go here, but interesting is not necessarily good; we want this to be easy to implement in both oak and java-kgcl. Something like SPARQL construct would be easy to implement, but would we run into expressivity issues?
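As a rough illustration of the "if X happens and x has value v, then trigger Y" shape, the sketch below models a rule as a trigger type, a condition, and a factory for the triggered change. It is plain Python rather than a real rule language, and every name in it (`Rule`, `apply_rules`, the dictionary keys) is an assumption made for the example. It only fires rules one level deep; triggered changes do not trigger further rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical flat representation of a change: a dict of string fields.
Change = Dict[str, str]


@dataclass
class Rule:
    trigger_type: str                           # change type X
    condition: Callable[[Change], bool]         # "the instance x has value v"
    make_triggered: Callable[[Change], Change]  # builds the change of type Y


def apply_rules(change: Change, rules: List[Rule]) -> List[Change]:
    """Return the change followed by the changes it directly triggers."""
    out = [change]
    for rule in rules:
        if change["type"] == rule.trigger_type and rule.condition(change):
            triggered = dict(rule.make_triggered(change))
            triggered["triggered_by"] = change["id"]  # record provenance
            out.append(triggered)
    return out


# Example rule: obsoleting a node whose label lacks the "obsolete " prefix
# triggers a rename that adds the prefix.
obsolete_rename = Rule(
    trigger_type="NodeObsoletion",
    condition=lambda c: not c["label"].startswith("obsolete "),
    make_triggered=lambda c: {
        "id": c["id"] + ".1",
        "type": "NodeRename",
        "about": c["about"],
        "old_value": c["label"],
        "new_value": "obsolete " + c["label"],
    },
)

result = apply_rules(
    {"id": "c1", "type": "NodeObsoletion", "about": "EX:1", "label": "foo"},
    [obsolete_rename],
)
for change in result:
    print(change)
```

Even this small version shows where expressivity questions start: the condition here only inspects the change itself, whereas real rules would likely need to query the ontology as well.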

gouttegd commented 5 months ago

I am mildly reluctant to accept the idea of having configurable rules describing how exactly changes should be performed. I would much rather stay at the level of having configuration options, with the meaning of each option being appropriately described in the spec and hardcoded by the implementations.

That is, for all changes that may be performed in slightly different ways (for example obsoletion, where some ontologies may want to add an "obsolete" label and some may not, some ontologies may want to remove referencing axioms and some others may prefer to keep them, etc.), we gather the different possible behaviours, and we add as many configuration options to kgcl:Configuration as are needed to represent all those behaviours.
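A minimal sketch of what this could look like in an implementation, assuming two boolean options for obsoletion. The option names and the `plan_obsoletion` helper are hypothetical and are not the actual slots of kgcl:Configuration or the values of ObsoletionPolicyEnum; the point is only that each option maps to hardcoded behaviour.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ObsoletionOptions:
    """Hypothetical options; not the actual fields of kgcl:Configuration."""
    add_obsolete_label_prefix: bool = True
    remove_referencing_axioms: bool = True


def plan_obsoletion(label: str, options: ObsoletionOptions) -> List[str]:
    """List the steps an implementation would hardcode for these options."""
    steps = ["mark node as obsolete"]
    if options.add_obsolete_label_prefix:
        steps.append(f"rename '{label}' to 'obsolete {label}'")
    if options.remove_referencing_axioms:
        steps.append("remove axioms referencing the node")
    return steps


# Default behaviour versus an ontology that keeps labels and axioms as-is.
print(plan_obsoletion("foo", ObsoletionOptions()))
print(plan_obsoletion("foo", ObsoletionOptions(False, False)))
```

The meaning of each option lives in the spec's prose and in the branch that implements it, so no rule engine is needed on either the Python or the Java side.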

> but there is no formal connection between the PVs and the logic.

I don’t think there needs to be a formal connection, as long as each PV is described clearly enough in plain English.

> We may also want to consider a rule language […] we want this to be easy to implement in both oak and java-kgcl. Something like SPARQL construct would be easy to implement

I don’t know about OAK, but a SPARQL-based rule language would not be easy to implement in KGCL-Java. KGCL-Java uses the OWL API as its backend, and AFAIK there is no “easy” way to execute a SPARQL query against the OWL API. For example, the way ROBOT does it is basically by dumping the entire OWL ontology as a Turtle file, loading it into Jena, running the SPARQL query, dumping the updated graph as a Turtle file, and reloading it into the OWL API… It works, sure, but it’s pretty ugly and I will certainly not replicate that in KGCL-Java.

cmungall commented 5 months ago

Thanks! On reflection I agree with all this.

(In reply to Damien Goutte-Gattat's comment of Tue, Apr 2, 2024, 12:34 PM.)
