CycloneDX / cyclonedx-cli

CycloneDX CLI tool for SBOM analysis, merging, diffs and format conversions.
https://cyclonedx.org/
Apache License 2.0

Merged SBOMs contain duplicates #326

Open alexthemark opened 1 year ago

alexthemark commented 1 year ago

Hello, and thanks for the awesome CLI -- it's helping me merge up a bunch of SBOMs. However, I noticed that my SBOMs have duplicate components in them.

I think this would be fixed by the latest version of the dotnet library -- specifically, due to https://github.com/CycloneDX/cyclonedx-dotnet-library/pull/216.

Is there a plan to upgrade this library and make a new release?

Thanks!

ataraxus commented 10 months ago

I would also be interested in this behaviour and possible workarounds...

andreas-hilti commented 10 months ago

CycloneDX-cli version 0.25.0 should now contain the latest cyclonedx-dotnet-library (specifically version 6.0.0). Thus, the behavior w.r.t. duplicates should be improved.

pepcots commented 2 months ago

Is it solved? I'm still having duplicates with merge command...

OscheibeSymbio commented 1 month ago

I am running into an issue where merging two SBOMs produces an invalid result. (Error message: "Validation failed at line number 14 and position 7: There is a duplicate key sequence 'will-cause-issues@0.0.0' for the 'http://cyclonedx.org/schema/bom/1.5:bom-ref' key or unique identity constraint.")

Here are the repro steps:

I have two input files, "sbom1-metadata-component.xml" and "sbom2-anything.xml", from which I have removed as much information as possible. Both files are still reported as valid, and merging them produces the same error message as merging the full SBOMs.

sbom1-metadata-component.xml

<?xml version="1.0" encoding="utf-8"?>
<bom xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" serialNumber="urn:uuid:a1f4c1c1-d3b6-4db4-b061-41223a62cb53" version="1" xmlns="http://cyclonedx.org/schema/bom/1.5">
  <metadata>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </metadata>
  <components>
    <component type="library" bom-ref="pkg:nuget/NotRelevant@1.0.0">
      <name>NotRelevant</name>
    </component>
  </components>
</bom>

Note the component within the metadata region. The NotRelevant component needs to be there so that the result contains anything at all, but its content does not matter.

sbom2-anything.xml

<?xml version="1.0" encoding="utf-8"?>
<bom xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" serialNumber="urn:uuid:a1f4c1c1-d3b6-4db4-b061-41223a62cb53" version="1" xmlns="http://cyclonedx.org/schema/bom/1.5">
</bom>

I just used an empty SBOM to execute the merge command with, which still caused the issue. My normal SBOM is much bigger.

Using the docker image to merge these two files creates a result that will not pass the "validate" command.

docker run -v ${PWD}:/sbom cyclonedx/cyclonedx-cli validate --input-file sbom/sbom1-metadata-component.xml
docker run -v ${PWD}:/sbom cyclonedx/cyclonedx-cli validate --input-file sbom/sbom2-anything.xml
docker run -v ${PWD}:/sbom cyclonedx/cyclonedx-cli merge --input-files sbom/sbom1-metadata-component.xml sbom/sbom2-anything.xml --output-file sbom/merged-sbom.xml
docker run -v ${PWD}:/sbom cyclonedx/cyclonedx-cli validate --input-file sbom/merged-sbom.xml --input-version v1_5

The result contains the same component twice: once within the metadata region and once within the components region, both with the same bom-ref. Error message: "Validation failed at line number 14 and position 7: There is a duplicate key sequence 'will-cause-issues@0.0.0' for the 'http://cyclonedx.org/schema/bom/1.5:bom-ref' key or unique identity constraint."

Is this behavior relevant to this issue here? This invalid SBOM prevents me from doing further analysis.

merged-sbom.xml

<?xml version="1.0" encoding="utf-8"?>
<bom xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" serialNumber="urn:uuid:ad5e9f70-3191-4486-878c-00bb9019de92" version="1" xmlns="http://cyclonedx.org/schema/bom/1.5">
  <metadata>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </metadata>
  <components>
    <component type="library" bom-ref="pkg:nuget/NotRelevant@1.0.0">
      <name>NotRelevant</name>
    </component>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </components>
</bom>
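
For anyone who wants to check a merged file for this problem without running the full schema validation, here is a minimal Python sketch that lists every bom-ref value occurring more than once. It is not part of cyclonedx-cli; the function name `duplicate_bom_refs` is made up for illustration, and it assumes that bom-ref values must be unique across the whole document, which is a simplification of the schema's identity constraint. The embedded sample is a trimmed copy of the merged output above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the merged SBOM shown above (unused namespace
# declarations and the serial number are left out for brevity).
MERGED = """<?xml version="1.0" encoding="utf-8"?>
<bom xmlns="http://cyclonedx.org/schema/bom/1.5" version="1">
  <metadata>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </metadata>
  <components>
    <component type="library" bom-ref="pkg:nuget/NotRelevant@1.0.0">
      <name>NotRelevant</name>
    </component>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </components>
</bom>
"""


def duplicate_bom_refs(xml_text: str) -> list[str]:
    """Return the bom-ref values that occur more than once in the document.

    Assumption: uniqueness is checked across every element carrying a
    bom-ref attribute, regardless of where it sits in the tree.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    counts = Counter(
        element.attrib["bom-ref"]
        for element in root.iter()
        if "bom-ref" in element.attrib
    )
    return sorted(ref for ref, count in counts.items() if count > 1)


if __name__ == "__main__":
    print(duplicate_bom_refs(MERGED))
```

For the sample above this reports `will-cause-issues@0.0.0`, the same bom-ref named in the validation error.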

I have attached all files to this comment, but their content is the same as described above. sbom-repro.zip

Thank you for providing us with such a powerful open source tool and helping your users.

andreas-hilti commented 1 month ago

I think this is similar to https://github.com/CycloneDX/cyclonedx-cli/issues/364, in the sense that in the FlatMerge it is not completely clear what should happen to the components in the metadata. The root cause is again here: https://github.com/CycloneDX/cyclonedx-dotnet-library/blob/57972c202d267366954599a948445196cedd0dda/src/CycloneDX.Utils/Merge.cs#L84-L88, together with this behavior here: https://github.com/CycloneDX/cyclonedx-cli/blob/03b8019b24e847b6fdc91822eae2e9a220d525fa/src/cyclonedx/Commands/MergeCommand.cs#L92-L97. If the metadata is not provided, the first non-null one is used. As a result, the same component appears both in the metadata and inside components.

andreas-hilti commented 1 month ago

As a workaround, you can for now specify the name using the "--name" option (or similar metadata).
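
If setting "--name" is not an option, another possible stopgap is to post-process the merged file and drop the entry in components that repeats the metadata component's bom-ref. The Python sketch below is only an illustration under several assumptions: it is not part of cyclonedx-cli, the function name `drop_metadata_duplicate` is hypothetical, the namespace is hard-coded to the 1.5 schema used in the repro, and it only looks at direct children of components. Whether removing the duplicate is the right resolution for a given SBOM is a judgment call.

```python
import xml.etree.ElementTree as ET

# Assumption: the merged file uses the CycloneDX 1.5 XML namespace.
NS = "http://cyclonedx.org/schema/bom/1.5"

SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<bom xmlns="http://cyclonedx.org/schema/bom/1.5" version="1">
  <metadata>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </metadata>
  <components>
    <component type="library" bom-ref="pkg:nuget/NotRelevant@1.0.0">
      <name>NotRelevant</name>
    </component>
    <component type="application" bom-ref="will-cause-issues@0.0.0">
      <name>will-cause-issues</name>
    </component>
  </components>
</bom>
"""


def drop_metadata_duplicate(xml_bytes: bytes) -> bytes:
    """Remove top-level components that share the metadata component's bom-ref."""
    # Serialize with the CycloneDX namespace as the default namespace.
    # Namespace declarations that no element uses may not be preserved.
    ET.register_namespace("", NS)
    root = ET.fromstring(xml_bytes)

    metadata_component = root.find(f"{{{NS}}}metadata/{{{NS}}}component")
    components = root.find(f"{{{NS}}}components")
    if metadata_component is None or components is None:
        return xml_bytes  # nothing to compare against

    metadata_ref = metadata_component.get("bom-ref")
    if metadata_ref is None:
        return xml_bytes

    # Iterate over a copy because the list is modified while looping.
    for component in list(components):
        if component.get("bom-ref") == metadata_ref:
            components.remove(component)

    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(drop_metadata_duplicate(SAMPLE).decode("utf-8"))
```

The result should be re-checked with the "validate" command before it is used for further analysis.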