ropensci / emld

:package: JSON-LD representation of EML
https://docs.ropensci.org/emld

Add additional validation checks #26

Closed cboettig closed 5 years ago

cboettig commented 5 years ago

@mbjones Can you take a quick look through this when you get a chance?

I believe this should address #25 by implementing the additional validation checks as I understood them, modulo any handling of system. Currently I compare all ids throughout the document directly, ignoring system. Technically I think I should permit identical ids that have different systems, and likewise make sure that references and describes match the system and not just the id value, right?
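To make the system question concrete, here is a minimal sketch of a system-aware check. It is not the code in this PR: it assumes each element has already been flattened into a plain dict with an "id" key and an optional "system" key, and the function names duplicate_ids and unresolved_references are hypothetical. Identity is the (system, id) pair rather than the bare id value.

```python
from collections import Counter


def duplicate_ids(nodes):
    """Return the (system, id) pairs that occur more than once.

    `nodes` is a hypothetical flattened view of the document: one dict per
    element, with an "id" key and an optional "system" key.
    """
    counts = Counter((n.get("system"), n["id"]) for n in nodes if "id" in n)
    # Counter keeps first-seen order, so duplicates are reported in document order.
    return [key for key, count in counts.items() if count > 1]


def unresolved_references(nodes, references):
    """Return the references that match no (system, id) pair in `nodes`."""
    known = {(n.get("system"), n["id"]) for n in nodes if "id" in n}
    return [r for r in references if (r.get("system"), r["id"]) not in known]


nodes = [
    {"id": "a", "system": "s1"},
    {"id": "a", "system": "s2"},  # same id, different system: allowed here
    {"id": "b", "system": "s1"},
    {"id": "b", "system": "s1"},  # same id and same system: a duplicate
]
print(duplicate_ids(nodes))
print(unresolved_references(nodes, [{"id": "a", "system": "s3"}]))
```

Under this reading, the id "a" may appear twice because the systems differ, while a reference to "a" under an unknown system is reported as unresolved.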

I believe I have understood the rule about the use of id or references for an annotation correctly (the annotation must have a subject, and only one of the two can be the subject), but I'm not 100% sure. In particular, it looks like the current (but 5 months stale) eml-data-paper.xml fails this test because there is a child annotation on a dataset node that has no id, and the annotation has no references. You'll see in this PR that I've taken the liberty of modifying my local copy of that test file.
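Spelled out as code, the rule as read here looks roughly like the sketch below. It encodes only the interpretation given in this comment, which may not match the official rule, and the function name annotation_subject_ok and its arguments are hypothetical: parent_id is the id of the element containing the annotation (or None), and references is the annotation's own references value (or None).

```python
def annotation_subject_ok(parent_id, references):
    """Return True when the annotation has exactly one subject.

    Assumed reading of the rule: the subject is either the id of the parent
    element or the annotation's own references value, and exactly one of the
    two must be present.
    """
    has_parent_id = parent_id is not None
    has_references = references is not None
    # Exactly one source of the subject: a boolean exclusive-or.
    return has_parent_id != has_references


# The case described for eml-data-paper.xml: no id on the dataset node
# and no references on the annotation, so there is no subject at all.
print(annotation_subject_ok(None, None))
```

Whether an annotation that has both a parent id and a references value should be rejected is part of what is being asked here; the sketch rejects it.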

Also, I wasn't clear whether the packageId needs to be included in the list of ids that must be unique (or, similarly, whether references and describes are allowed to point to the packageId instead of an id). Currently I have followed the instructions literally: packageId has to exist, but it is not part of the other checks.

cboettig commented 5 years ago

From discussions elsewhere, it sounds like this checks out well enough so far, so I'm going to merge this into master. Validation could still be improved, in particular through testing with more invalid files and better error messages, but I will coordinate this with @mbjones once the updates to the official Java validator are complete as well.