CycloneDX / cyclonedx-core-java

CycloneDX SBOM Model and Utils for Creating and Validating BOMs
https://cyclonedx.org/
Apache License 2.0
73 stars 56 forks source link

Version 9.0.x creates XML files that fail validation #409

Closed sschuberth closed 1 week ago

sschuberth commented 1 month ago

Our test in ORT fails after the upgrade to release 9.0.0 because the generated XML files do not seem to pass CycloneDX's own validation. Calling

XmlParser().validate(...)

on a generated files gives

Collection should be empty but contained org.cyclonedx.exception.ParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element '{"http://cyclonedx.org/schema/bom/1.5":licenses}'. One of '{"http://cyclonedx.org/schema/bom/1.5":copyright, "http://cyclonedx.org/schema/bom/1.5":cpe, "http://cyclonedx.org/schema/bom/1.5":purl, "http://cyclonedx.org/schema/bom/1.5":swid, "http://cyclonedx.org/schema/bom/1.5":modified, "http://cyclonedx.org/schema/bom/1.5":pedigree, "http://cyclonedx.org/schema/bom/1.5":externalReferences, "http://cyclonedx.org/schema/bom/1.5":properties, "http://cyclonedx.org/schema/bom/1.5":components, "http://cyclonedx.org/schema/bom/1.5":evidence, "http://cyclonedx.org/schema/bom/1.5":releaseNotes, "http://cyclonedx.org/schema/bom/1.5":modelCard, "http://cyclonedx.org/schema/bom/1.5":data, WC[##other:"http://cyclonedx.org/schema/bom/1.5"]}' is expected.

sschuberth commented 1 month ago

Some more notes: While it could be seen from the above, I'd like to make explicit that I'm generating a schema 1.5. XML, not schema 1.6. And this is the generated file that does not pass validation.

nscuro commented 1 month ago

Looks like the output has duplicate component.licenses nodes (removed license.text for brevity):

<component type="library" bom-ref="NPM:@ort:concluded-license:1.0">
  <group>@ort</group>
  <name>concluded-license</name>
  <version>1.0</version>
  <scope>required</scope>
  <hashes>
    <hash alg="SHA-1">0000000000000000000000000000000000000000</hash>
  </hashes>
  <licenses>
    <license>
      <id>MIT</id>
      <text content-type="plain/text">...</text>
      <ort:origin xmlns:ort="http://www.w3.org/1999/xhtml">concluded license</ort:origin>
    </license>
  </licenses>
  <licenses>
    <license>
      <name>MIT WITH Libtool-exception</name>
      <ort:origin xmlns:ort="http://www.w3.org/1999/xhtml">concluded license</ort:origin>
    </license>
  </licenses>
  <licenses>
    <license>
      <id>BSD-3-Clause</id>
      <text content-type="plain/text">...</text>
      <ort:origin xmlns:ort="http://www.w3.org/1999/xhtml">declared license</ort:origin>
    </license>
  </licenses>
  <licenses>
    <license>
      <id>BSD-2-Clause</id>
      <text content-type="plain/text">...</text>
      <ort:origin xmlns:ort="http://www.w3.org/1999/xhtml">detected license</ort:origin>
    </license>
  </licenses>
  <licenses>
    <license>
      <id>MIT</id>
      <text content-type="plain/text">...</text>
      <ort:origin xmlns:ort="http://www.w3.org/1999/xhtml">detected license</ort:origin>
    </license>
  </licenses>
  <copyright>Copyright 1, Copyright 2</copyright>
  <purl>pkg:npm/%40ort/concluded-license@1.0?classifier=sources</purl>
  <modified>false</modified>
  <externalReferences>
    <reference type="website">
      <url>https://github.com/oss-review-toolkit/ort</url>
    </reference>
  </externalReferences>
  <ort:dependencyType xmlns:ort="http://www.w3.org/1999/xhtml">direct</ort:dependencyType>
</component>

It should be one licenses node with multiple license children.

sschuberth commented 1 month ago

It should be one licenses node with multiple license children.

Yes, see https://github.com/CycloneDX/cyclonedx-core-java/issues/408, which I filed before realizing the two are connected.

sschuberth commented 1 month ago

Thanks for the changes in #411 @mr-zepol, I just gave release 9.0.1 a try. While that fixes the issue with the licenses tag, there now seems to be a new validation error:

Collection should be empty but contained org.cyclonedx.exception.ParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element '{"http://cyclonedx.org/schema/bom/1.5":externalReferences}'. One of '{"http://cyclonedx.org/schema/bom/1.5":reference}' is expected.

To reproduce, you could try running https://github.com/oss-review-toolkit/ort/blob/renovate/major-cyclonedx/plugins/reporters/cyclonedx/src/funTest/kotlin/CycloneDxReporterFunTest.kt. Feel free to ping me on Slack if you need assistance with that.

Also, I realized that all our custom ExtensibleTypes are now not serialized anymore. I need to check on that.

sschuberth commented 1 month ago

Also, I realized that all our custom ExtensibleTypes are now not serialized anymore. I need to check on that.

This broke in 9.0.1, it worked in 9.0.0.

mr-zepol commented 1 month ago

Thanks for the changes in #411 @mr-zepol, I just gave release 9.0.1 a try. While that fixes the issue with the licenses tag, there now seems to be a new validation error:

Collection should be empty but contained org.cyclonedx.exception.ParseException: cvc-complex-type.2.4.a: Invalid content was found starting with element '{"http://cyclonedx.org/schema/bom/1.5":externalReferences}'. One of '{"http://cyclonedx.org/schema/bom/1.5":reference}' is expected.

To reproduce, you could try running https://github.com/oss-review-toolkit/ort/blob/renovate/major-cyclonedx/plugins/reporters/cyclonedx/src/funTest/kotlin/CycloneDxReporterFunTest.kt. Feel free to ping me on Slack if you need assistance with that.

Also, I realized that all our custom ExtensibleTypes are now not serialized anymore. I need to check on that.

PR for External Serialization has been created https://github.com/CycloneDX/cyclonedx-core-java/pull/413

mr-zepol commented 4 weeks ago

Add extensible types during license serialization #414

This is fixed here https://github.com/CycloneDX/cyclonedx-core-java/pull/414

skhokhlov commented 3 weeks ago

Looks like the issue is still there:

CycloneDX: Validating BOM
Unknown keyword meta:enum - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword or if it should generate annotations AnnotationKeyword
Unknown keyword deprecated - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword or if it should generate annotations AnnotationKeyword

https://github.com/CycloneDX/cyclonedx-gradle-plugin/pull/444

nscuro commented 3 weeks ago

Can you share more details as to what part of the BOMs is causing validation to fail? I checked the build logs of the PR you linked but it doesn't show test output.

FWIW the output you shared are "just" warnings emitted by the JSON schema validator: https://github.com/networknt/json-schema-validator/blob/master/src/main/java/com/networknt/schema/UnknownKeywordFactory.java

skhokhlov commented 3 weeks ago

Well, you're right, thank you for the clue. The actual error is that the BOM has duplicated values

Duplicate unique value [pkg:maven/javax.activation/javax.activation-api@1.2.0?type=jar] declared for identity constraint of element "bom".
Duplicate unique value [pkg:maven/javax.activation/javax.activation-api@1.2.0?type=jar] declared for identity constraint of element "bom".
Duplicate unique value [pkg:maven/javax.activation/javax.activation-api@1.2.0?type=jar] declared for identity constraint of element "bom".
Duplicate unique value [pkg:maven/org.jvnet.staxex/stax-ex@1.8?type=jar] declared for identity constraint of element "bom".
Duplicate unique value [pkg:maven/org.jvnet.staxex/stax-ex@1.8?type=jar] declared for identity constraint of element "bom".
Duplicate unique value [pkg:maven/org.jvnet.staxex/stax-ex@1.8?type=jar] declared for identity constraint of element "bom".

bom.json

I believe it happens because new classes introduced with version 9 do not implement equals and hashCode https://github.com/CycloneDX/cyclonedx-core-java/blob/master/src/main/java/org/cyclonedx/model/license/Expression.java. At the same time, LicenseChoice calling it https://github.com/CycloneDX/cyclonedx-core-java/blob/master/src/main/java/org/cyclonedx/model/LicenseChoice.java#L75

In the gradle plugin, components are stored in a HashSet https://github.com/CycloneDX/cyclonedx-gradle-plugin/blob/master/src/main/java/org/cyclonedx/gradle/CycloneDxTask.java#L316

nscuro commented 3 weeks ago

Yeah that makes sense, if the library is overriding equals and hashCode it should do so consistently.

mr-zepol commented 3 weeks ago

Good catch, I will raise a PR adding all the equals and hashCode for the classes that don't have it

sschuberth commented 2 weeks ago

Good catch, I will raise a PR adding all the equals and hashCode for the classes that don't have it

As a side node, esp. with all these POJO classes, switching to Kotlin and its data classes (which have auto-generated equals and hashCode) would save a lot of boiler plate code (and prevent from forgetting to add these methods).

mr-zepol commented 2 weeks ago

I created a PR to add all the missing methods https://github.com/CycloneDX/cyclonedx-core-java/pull/419

nscuro commented 1 week ago

Closing as resolved, please raise a new issue if you run into further issues.