Open amoeba opened 3 years ago
This looks pretty promising. With just a simple constraint:
mosaic:CampaignShape
    a sh:NodeShape ;
    sh:targetClass mosaic:00000001 ;
    # Every Campaign has at least one mosaic:hasBasis triple
    sh:property [
        sh:path mosaic:00000034 ;
        sh:minCount 1 ;
    ] .
PySHACL catches the exact problem we saw today:
; pyshacl -s shapes.shacl -df xml -sf turtle ../MOSAiC.owl
Validation Report
Conforms: False
Results (4):
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:minCount Literal("1", datatype=xsd:integer) ; sh:path <https://purl.dataone.org/odo/MOSAIC_00000034> ]
Focus Node: odo:MOSAIC_00000005
Result Path: <https://purl.dataone.org/odo/MOSAIC_00000034>
Message: Less than 1 values on odo:MOSAIC_00000005-><https://purl.dataone.org/odo/MOSAIC_00000034>
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:minCount Literal("1", datatype=xsd:integer) ; sh:path <https://purl.dataone.org/odo/MOSAIC_00000034> ]
Focus Node: odo:MOSAIC_00000008
Result Path: <https://purl.dataone.org/odo/MOSAIC_00000034>
Message: Less than 1 values on odo:MOSAIC_00000008-><https://purl.dataone.org/odo/MOSAIC_00000034>
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:minCount Literal("1", datatype=xsd:integer) ; sh:path <https://purl.dataone.org/odo/MOSAIC_00000034> ]
Focus Node: odo:MOSAIC_00000019
Result Path: <https://purl.dataone.org/odo/MOSAIC_00000034>
Message: Less than 1 values on odo:MOSAIC_00000019-><https://purl.dataone.org/odo/MOSAIC_00000034>
Constraint Violation in MinCountConstraintComponent (http://www.w3.org/ns/shacl#MinCountConstraintComponent):
Severity: sh:Violation
Source Shape: [ sh:minCount Literal("1", datatype=xsd:integer) ; sh:path <https://purl.dataone.org/odo/MOSAIC_00000034> ]
Focus Node: odo:MOSAIC_00000018
Result Path: <https://purl.dataone.org/odo/MOSAIC_00000034>
Message: Less than 1 values on odo:MOSAIC_00000018-><https://purl.dataone.org/odo/MOSAIC_00000034>
I just did a quick look-over to see what checks might make sense to implement as a first pass:
- Campaign has a single Basis
- Campaign has at least one Chief Scientist
- Campaign hosts at least one Event
- Event (is this right?) isHostedBy a single Campaign
- Chief Scientist is a chief scientist of at least one Campaign
- Deployment has a single deployedSystem
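The checks listed above differ mostly in their cardinality bounds ("at least one" versus "a single"). As a rough illustration, here is a minimal plain-Python sketch of that idea. It is not PySHACL and not the code merged for this issue; it just treats a graph as a set of (subject, predicate, object) tuples and counts values per focus node, which is roughly what sh:minCount and sh:maxCount do. The helper check_cardinality and the ex:campaign* and ex:basis* names are hypothetical; only mosaic:00000001 (Campaign) and mosaic:00000034 (hasBasis) come from the shape earlier in this thread.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def check_cardinality(triples, target_class, path, min_count=0, max_count=None):
    """Return (focus_node, value_count) pairs that fall outside the bounds.

    Hypothetical helper: focus nodes are the subjects typed as target_class,
    and the count is the number of triples each one has for `path`.
    """
    focus_nodes = {s for s, p, o in triples if p == RDF_TYPE and o == target_class}
    counts = Counter(s for s, p, o in triples if p == path and s in focus_nodes)
    violations = []
    for node in sorted(focus_nodes):
        n = counts[node]  # Counter returns 0 for nodes with no values
        if n < min_count or (max_count is not None and n > max_count):
            violations.append((node, n))
    return violations


# Toy data (made up): A has one Basis, B has none, C has two.
triples = {
    ("ex:campaignA", RDF_TYPE, "mosaic:00000001"),
    ("ex:campaignA", "mosaic:00000034", "ex:basis1"),
    ("ex:campaignB", RDF_TYPE, "mosaic:00000001"),
    ("ex:campaignC", RDF_TYPE, "mosaic:00000001"),
    ("ex:campaignC", "mosaic:00000034", "ex:basis1"),
    ("ex:campaignC", "mosaic:00000034", "ex:basis2"),
}

# "has at least one Basis" corresponds to min_count=1 only.
at_least_one = check_cardinality(
    triples, "mosaic:00000001", "mosaic:00000034", min_count=1
)
# "has a single Basis" adds max_count=1 on top of that.
exactly_one = check_cardinality(
    triples, "mosaic:00000001", "mosaic:00000034", min_count=1, max_count=1
)
print(at_least_one)
print(exactly_one)
```

In SHACL terms, the "single Basis" rule would presumably mean adding sh:maxCount 1 next to the existing sh:minCount 1 in the property shape.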
Some of the other parts of the ontology are a bit confusing so I'll stop there and chat with @mpsaloha.
@amoeba those look like good suggestions for constraints! Note that the MOSAIC Ontology is in OWL, and so is OWA. Thus, while all Campaigns do have a Basis, that doesn't mean that all Campaigns in our Ontology must have an associated Basis, unless we decide to "require it" (hence SHACL, which is CWA). Thus, the lack of some Campaign having a Basis or having a Chief Scientist was not an "it should have been there" (as you phrased it in your first comment on this Issue), but rather an "it might be useful if it were there". There are LOTS OF additional "it might be useful" predicates I could have filled out in the MOSAIC Ontology, but I didn't do these for lack of time, or out of suspicion that they would not be leveraged in our Web UI. Happy to discuss this further if this doesn't make perfect sense.
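To make the open-world (OWA) versus closed-world (CWA) distinction a bit more concrete, here is a small plain-Python sketch. It is not tied to any OWL reasoner or SHACL engine, and the function names and toy triples are hypothetical. Under the open-world reading a missing hasBasis statement is simply unknown; under the closed-world reading it is treated as false, which is what lets a validator flag it.

```python
def has_value_open_world(triples, node, path):
    """OWA: a stated triple means True; absence means unknown (None), not False."""
    stated = any(s == node and p == path for s, p, _ in triples)
    return True if stated else None


def has_value_closed_world(triples, node, path):
    """CWA: anything that is not stated is treated as false."""
    return any(s == node and p == path for s, p, _ in triples)


# Toy data (made up): only campaignA has a stated Basis.
triples = [("ex:campaignA", "mosaic:00000034", "ex:basis1")]

owa_known = has_value_open_world(triples, "ex:campaignA", "mosaic:00000034")
owa_missing = has_value_open_world(triples, "ex:campaignB", "mosaic:00000034")
cwa_missing = has_value_closed_world(triples, "ex:campaignB", "mosaic:00000034")
print(owa_known, owa_missing, cwa_missing)
```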
You might have to define OWA and CWA for me. Other than that, your comment makes sense.
What I want to do is help you and @laijasmine get the work you both need to do on MOSAiC done quickly and efficiently so if we can add SHACL validation rules to help catch things like the hasBasis thing then that'd make me happy.
Are any of the rules above ones you want?
All of the above rules look good to me except for the last one, which I'm not sure about and Mark will need to confirm: Every Deployment has a single deployedSystem
Thanks @laijasmine. I'll touch base with @mpsaloha at some point here.
We discussed part of this on our salmantics call this week and we talked about the point above: Should this ontology be comprehensive over all of the MOSAiC expedition or just what PANGAEA or we have? We decided that we should aim to be comprehensive. We're presenting an PDF soon and are hoping to have some conversations about the ontology and the project as a whole with relevant folks.
I've merged an initial skeleton for this kind of checking onto the develop branch but haven't added all of the constraints I listed above. I'm going to leave this issue open with the intent to revisit this at some point.
We discovered on Slack today that a property we expected (mosaic:hasBasis) wasn't present on every mosaic:Campaign and it should have been. Manually checking the ontology after every change is time-consuming and error-prone. It'd be great to write a set of SHACL shape constraints that we could use to check for some of these things.
I'll build out a GHA that does some basic checking and then we could probably brainstorm a more complete set of checks and look at applying the process to the other ontologies.