spdx / spdx-java-jackson-store

JSON storage implementation for the SPDX tools
Apache License 2.0
3 stars 7 forks source link

Validation accepts invalid SPDX YAML files #21

Closed sschuberth closed 2 years ago

sschuberth commented 3 years ago

For testing, I did download https://github.com/spdx/spdx-spec/blob/3d156919a06c07781bd47dc26c75efcfc9df21cd/examples/SPDXYAMLExample-2.2.spdx.yaml and changed relationships (plural) to relationship (singular) in line 636:

https://github.com/spdx/spdx-spec/blob/3d156919a06c07781bd47dc26c75efcfc9df21cd/examples/SPDXYAMLExample-2.2.spdx.yaml#L363

This deliberately renders the syntax invalid. However, uploading the broken file for validation at https://tools.spdx.org/app/validate/ shows that the SPDX document is valid:

image

goneall commented 3 years ago

Good catch @sschuberth - I'll transfer this issue to the https://github.com/spdx/spdx-java-jackson-store repository which implements the JSON validation.

goneall commented 3 years ago

The issue is related to the JSON-LD file generated by the tools-java OwlToJSONContext utility.

This file is used to interpret the property names. It looks like for every list property, it creates a singular type in this line then creates the plural version of the type with the additional @set attribute.

I added an issue against the spec to discuss the JSON-LD context file changes: https://github.com/spdx/spdx-spec/issues/570

If we decide to fix the JSON-LD context file, I'll move this issue over to the tools-java repo. If we decide not to fix the JSON-LD, I'll see if there is some way to generate an error. Based on the current design, adding a validation warning for this could be challenging.

sschuberth commented 3 years ago

Based on the current design, adding a validation warning for this could be challenging.

Is that maybe a reason to re-think the current design for SPDX 3.0?

goneall commented 3 years ago

Is that maybe a reason to re-think the current design for SPDX 3.0?

It is more of an application design issue rather than an SPDX spec issue. We separated the serialization/deserialization code from the model through a well defined interface which has the advantage of easily adding additional serialization formats. Currently, all the validation uses the model code and isn't even aware of how the data was deserialized. We would need to add an interface to collect warnings from the layer that does the serialization/deserialization.

goneall commented 2 years ago

Resolved with PR #22