spdx / spdx-3-model

The model for the information captured in SPDX version 3 standard.
https://spdx.dev/use/specifications/

Single SpdxDocument per serialization #924

Open ilans opened 2 days ago

ilans commented 2 days ago

From SpdxDocument:

Any instance of serialization of SPDX data MUST NOT contain more than one SpdxDocument element definition.

Suggested rule:

[
  a sh:NodeShape ;
  sh:targetNode <https://spdx.org/rdf/3.0.1/terms/Core/SpdxDocument> ;
  sh:property [
    sh:path [ sh:inversePath rdf:type ] ;
    sh:maxCount 1 ;
    sh:message "Any instance of serialization of SPDX data MUST NOT contain more than one SpdxDocument element definition."
  ]
] .

Validation script:

import rdflib
from rdflib import SH
from pyshacl import validate

# Start from the published SPDX 3.0.1 model, used here as the shapes graph.
shapes_graph = rdflib.Graph()
shapes_graph.parse("https://spdx.github.io/spdx-spec/v3.0.1/rdf/spdx-model.ttl", format="turtle")

new_shape = '''
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

[
  a sh:NodeShape ;
  sh:targetNode <https://spdx.org/rdf/3.0.1/terms/Core/SpdxDocument> ;
  sh:property [
    sh:path [ sh:inversePath rdf:type ] ;
    sh:maxCount 1 ;
    sh:message "Any instance of serialization of SPDX data MUST NOT contain more than one SpdxDocument element definition."
  ]
] .
'''

shapes_graph.parse(data=new_shape, format="turtle")

s = """
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/ns#> .
@prefix spdxcore: <https://spdx.org/rdf/3.0.1/terms/Core/> .

ex:MyAgent
    a spdxcore:Agent ;
    spdxcore:creationInfo _:MyCreationInfo .

_:MyCreationInfo
    a spdxcore:CreationInfo ;
    spdxcore:createdBy ex:MyAgent ;
    spdxcore:created "2024-09-04T20:25:34Z"^^xsd:dateTimeStamp ;
    spdxcore:specVersion "3.0.1" .

ex:SpdxDocument1
    a spdxcore:SpdxDocument ;
    spdxcore:creationInfo _:MyCreationInfo .

ex:SpdxDocument2
    a spdxcore:SpdxDocument ;
    spdxcore:creationInfo _:MyCreationInfo .
"""
data_graph = rdflib.Graph()
data_graph.parse(data=s, format="turtle")

# validate() returns (conforms, results_graph, results_text).
conforms, report, report_text = validate(data_graph, shacl_graph=shapes_graph, allow_warnings=True, advanced=True, debug=False)
# Print the message of every validation result in the report graph.
for violation in report.subjects(rdflib.RDF.type, SH["ValidationResult"]):
    message = report.value(subject=violation, predicate=SH["resultMessage"])
    print(message)

Result: Any instance of serialization of SPDX data MUST NOT contain more than one SpdxDocument element definition.

Notes: This rule counts every SpdxDocument in the data graph, so running it on a graph database that holds multiple SPDX documents would fail. I'm not sure whether this is an interesting use case, just pointing it out.
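
A rough sketch of that failure mode, using plain Python sets of triples as a stand-in for RDF graphs so that it runs without rdflib or pyshacl. The example IRIs and the `count_spdx_documents` helper are hypothetical, and the count only approximates what the inverse `rdf:type` path with `sh:maxCount 1` checks:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SPDX_DOCUMENT = "https://spdx.org/rdf/3.0.1/terms/Core/SpdxDocument"
MAX_COUNT = 1  # mirrors sh:maxCount 1 in the suggested rule


def count_spdx_documents(triples):
    """Count distinct subjects typed as SpdxDocument (hypothetical helper)."""
    return len({s for (s, p, o) in triples if p == RDF_TYPE and o == SPDX_DOCUMENT})


# Two hypothetical serializations, each defining exactly one SpdxDocument.
serialization_a = {("http://example.org/ns#SpdxDocumentA", RDF_TYPE, SPDX_DOCUMENT)}
serialization_b = {("http://example.org/ns#SpdxDocumentB", RDF_TYPE, SPDX_DOCUMENT)}

# A store holding both serializations is modeled as the union of their triples.
merged_store = serialization_a | serialization_b

count_a = count_spdx_documents(serialization_a)
count_b = count_spdx_documents(serialization_b)
count_merged = count_spdx_documents(merged_store)

# Each serialization satisfies the limit on its own; the merged store does not.
print("a within limit:", count_a <= MAX_COUNT)
print("b within limit:", count_b <= MAX_COUNT)
print("merged within limit:", count_merged <= MAX_COUNT)
```

Under these assumptions, one way around it would be to validate each serialization in its own graph before loading it into a shared store.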

goneall commented 1 day ago

Has it already been decided that we would have only one SPDX document? See issue #860.