BlueBrain / nexus-forge

Building and Using Knowledge Graphs made easy
https://nexus-forge.readthedocs.io
GNU Lesser General Public License v3.0

Issues with resource validation #280

Open joni-herttuainen opened 1 year ago

joni-herttuainen commented 1 year ago

Hi, I was playing with KnowledgeGraphForge.validate and ran into a few issues.

Can't pre-validate if there are files

If I call validate on a resource that has files attached, to check whether it is valid before registering it, I get:

<action> _validate_one
<succeeded> False
<error> ValidationError: resource has lazy actions which need to be executed before

If I instead run:

forge.validate(resource, execute_actions_before=True)
<action> _validate_one
<succeeded> False

the validation still fails. If I then decide not to register the resource because of that failure, I have effectively uploaded a file that nothing in Nexus will ever use.

Error message is unclear

If I validate a registered resource, in this case DetailedCircuit, the output is:

<action> _validate_one
<succeeded> False
<error> ValidationError: 
Validation Report
Conforms: False
Results (1):
Constraint Violation in AndConstraintComponent (http://www.w3.org/ns/shacl#AndConstraintComponent):
    Severity: sh:Violation
    Source Shape: this:DetailedCircuitShape
    Focus Node: <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test>
    Value Node: <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test>
    Message: Node <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test> does not conform to all shapes in [ sh:node this1:ModelInstanceShape ] , [ sh:property [ rdfs:seeAlso <https://neuroshapes.org/dash/edgecollection/shapes/EdgeCollectionShape> ; sh:class nsg:EdgeCollection ; sh:description Literal("Location of nrn synapse file and additional circuit description files: start.ncs and start.target") ; sh:name Literal("nrnPath") ; sh:path nsg:edgeCollection ], [ rdfs:seeAlso <https://neuroshapes.org/dash/nodecollection/shapes/NodeCollectionShape> ; sh:class nsg:NodeCollection ; sh:description Literal("Node collection entity.") ; sh:minCount Literal("1", datatype=xsd:integer) ; sh:name Literal("Node collection") ; sh:path nsg:nodeCollection ], [ rdfs:seeAlso <https://neuroshapes.org/dash/target/shapes/TargetShape> ; sh:class nsg:Target ; sh:description Literal("Optional parameter giving location of predefined targets stored in the named file") ; sh:maxCount Literal("1", datatype=xsd:integer) ; sh:name Literal("TargetFile") ; sh:path nsg:target ; skos:editorialNote Literal("constrain with application/bbp-target", datatype=xsd:string) ] ]

I investigated this further and traced it all the way down to

kgforge/specializations/models/rdf/store_service.py:StoreService._validate

where I found that the error message comes as-is from pyshacl.Validator and is of type rdflib.term.Literal, which is basically a string.

I don't know whether pyshacl allows a more customized validator that would give more meaningful error messages, for example by subclassing the validator and providing a custom create_validation_report (https://github.com/RDFLib/pySHACL/blob/master/pyshacl/validate.py#L125), but I feel the current error message is not clear enough.

I know that the current message means the resource is missing properties, but that is not clear from the message itself. It would be nice if it instead output something like:

<action> _validate_one
<succeeded> False
<error> ValidationError: 
Validation Report
Conforms: False
Results (1):
Constraint Violation in AndConstraintComponent (http://www.w3.org/ns/shacl#AndConstraintComponent):
    Severity: sh:Violation
    Source Shape: this:DetailedCircuitShape
    Focus Node: <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test>
    Value Node: <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test>
    Message: Node <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test> is missing following properties: EdgeCollection, NodeCollection, TargetFile

or, if possible, with even less output:

<action> _validate_one
<succeeded> False
<error> ValidationError: <https://bbp.epfl.ch/nexus/v1/resources/nse/test2/_/O1.v6a.test> is missing following properties: EdgeCollection, NodeCollection, TargetFile
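
A shorter message of this kind could be approximated by post-processing the report text, without touching pyshacl itself. The following is only a minimal sketch: summarize_shacl_message is a hypothetical helper (not part of nexus-forge or pySHACL), it assumes the message layout shown in the report above, and the sample IRI is made up. Because the AndConstraintComponent message lists every property shape in the conjunction, the helper can only name the properties to check, not confirm which ones are actually missing.

```python
import re


# Hypothetical helper for illustration; not part of nexus-forge or pySHACL.
def summarize_shacl_message(message: str) -> str:
    """Condense a verbose AndConstraintComponent message into one line."""
    # The focus node is the first <...> IRI following the word "Node".
    node = re.search(r"Node (<[^>]+>)", message)
    # Each property shape in the dump carries a human-readable sh:name.
    names = re.findall(r'sh:name Literal\("([^"]*)"\)', message)
    if node is None or not names:
        # Unrecognized layout: return the message unchanged rather than guess.
        return message
    return f"{node.group(1)} does not conform; properties to check: {', '.join(names)}"


# Shortened, made-up message in the same layout as the report above.
sample = (
    "Node <https://example.org/resource/1> does not conform to all shapes in "
    "[ sh:node this1:ModelInstanceShape ] , [ sh:property "
    '[ sh:class nsg:EdgeCollection ; sh:name Literal("nrnPath") ; '
    "sh:path nsg:edgeCollection ], "
    '[ sh:class nsg:NodeCollection ; sh:minCount Literal("1", datatype=xsd:integer) ; '
    'sh:name Literal("Node collection") ; sh:path nsg:nodeCollection ] ]'
)

print(summarize_shacl_message(sample))
```

Parsing the text is fragile if the report layout changes; reading the results graph that pyshacl returns alongside the text would likely be more robust, at the cost of more code.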

jdcourcol commented 1 year ago

@crisely09 I am not sure I understand the "debug" and "catch_exception" parameters mentioned in the above MR. I would expect a validation exception to be raised by default, with the exception's error message detailing the nature of the failure.

crisely09 commented 1 year ago

@jdcourcol the catch_exception parameter makes the action (register/update/deprecate) stop when there is an error. Usually all errors are caught and things continue running. The parameter is only for cases where you want some other way to catch errors externally.
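
The two behaviours described here, recording errors and continuing versus stopping at the first error, can be illustrated with a generic sketch. This is not nexus-forge code: ActionResult, run_many, register, and the stop_on_error flag are hypothetical names chosen for the example, and the real parameter may behave differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class ActionResult:
    """Hypothetical record of one operation, loosely modelled on the output above."""
    operation: str
    succeeded: bool
    error: Optional[str] = None


def run_many(fn: Callable[[dict], None], items: Iterable[dict],
             stop_on_error: bool = False) -> List[ActionResult]:
    """Apply fn to each item; record failures, or re-raise when stop_on_error is set."""
    results = []
    for item in items:
        try:
            fn(item)
        except Exception as exc:
            if stop_on_error:
                # Let the caller handle the error; later items are not processed.
                raise
            results.append(ActionResult(fn.__name__, False, f"{type(exc).__name__}: {exc}"))
        else:
            results.append(ActionResult(fn.__name__, True))
    return results


def register(resource: dict) -> None:
    # Stand-in for a real operation; fails when the resource has no name.
    if "name" not in resource:
        raise ValueError("resource has no name")


# Default mode: the failure is recorded and the remaining items still run.
results = run_many(register, [{"name": "a"}, {}, {"name": "b"}])
for r in results:
    print(r)
```

With stop_on_error=True the same call would raise ValueError at the second item, so the caller can handle it with an ordinary try/except.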