anthonyharrison / lib4sbom

Library to ingest and generate SBOMs
Apache License 2.0
14 stars, 10 forks

Same package name in different manifest files does not appear after parsing the SPDX / CycloneDX file #36

Closed by rms-sth 2 months ago

rms-sth commented 2 months ago

Let's say I have two manifest files that both list the same package.

The generated SBOM packages look like this:

[
  {
    "name": "flask",
    "SPDXID": "SPDXRef-Package-python-flask-7aeb2a6c081f1782",
    "versionInfo": "2.0.3",
    "supplier": "NOASSERTION",
    "downloadLocation": "NOASSERTION",
    "filesAnalyzed": false,
    "sourceInfo": "acquired package info from installed python package manifest file: /dev/requirements.txt",
    "licenseConcluded": "NOASSERTION",
    "licenseDeclared": "NOASSERTION",
    "copyrightText": "NOASSERTION",
    "externalRefs": [
      {
        "referenceCategory": "SECURITY",
        "referenceType": "cpe23Type",
        "referenceLocator": "cpe:2.3:a:python-flask:python-flask:2.0.3:*:*:*:*:*:*:*"
      }
    ]
  },
  {
    "name": "flask",
    "SPDXID": "SPDXRef-Package-python-flask-4947d30b71b34501",
    "versionInfo": "2.0.3",
    "supplier": "NOASSERTION",
    "downloadLocation": "NOASSERTION",
    "filesAnalyzed": false,
    "sourceInfo": "acquired package info from installed python package manifest file: /requirements.txt",
    "licenseConcluded": "NOASSERTION",
    "licenseDeclared": "NOASSERTION",
    "copyrightText": "NOASSERTION",
    "externalRefs": [
      {
        "referenceCategory": "SECURITY",
        "referenceType": "cpe23Type",
        "referenceLocator": "cpe:2.3:a:python-flask:python-flask:2.0.3:*:*:*:*:*:*:*"
      }
    ]
  }
]

But after parsing with lib4sbom, only one of the packages is returned by get_packages().

Desired output: both packages should be returned after parsing.

mastersans commented 2 months ago

Hey @rms-sth, is this issue also present in the CycloneDX format? A bug similar to this one was fixed in ISSUE, so this may be fixed now. Can you try it with the latest commit of lib4sbom? Edit: My bad, it's not fixed, but I do know what is causing it. I am curious how SBOMs should handle cases like these.

anthonyharrison commented 2 months ago

@rms-sth The parsing works on the package name/version pair. The fact that there are two with the same name/version pairing means that only one is returned. I am curious to understand why you would want multiple instances of the same component included in a single SBOM. Can you explain your use case?
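To illustrate why keying on the name/version pair collapses the two flask entries, here is a minimal sketch in plain Python. It is not lib4sbom's actual code: the helper names `index_by_name_version` and `index_by_spdxid` are hypothetical, and the package dictionaries are trimmed copies of the SPDX entries from the original report.

```python
from typing import Dict, List, Tuple

Package = Dict[str, str]


def index_by_name_version(packages: List[Package]) -> Dict[Tuple[str, str], Package]:
    """Hypothetical helper: index packages by (name, version).

    A later entry with the same key overwrites the earlier one, so
    duplicates of the same name/version pair collapse to one record.
    """
    indexed: Dict[Tuple[str, str], Package] = {}
    for pkg in packages:
        indexed[(pkg["name"], pkg["versionInfo"])] = pkg
    return indexed


def index_by_spdxid(packages: List[Package]) -> Dict[str, Package]:
    """Hypothetical helper: index packages by their SPDXID instead."""
    return {pkg["SPDXID"]: pkg for pkg in packages}


# Trimmed copies of the two entries from the report above.
packages = [
    {
        "name": "flask",
        "SPDXID": "SPDXRef-Package-python-flask-7aeb2a6c081f1782",
        "versionInfo": "2.0.3",
        "sourceInfo": "acquired package info from installed python package manifest file: /dev/requirements.txt",
    },
    {
        "name": "flask",
        "SPDXID": "SPDXRef-Package-python-flask-4947d30b71b34501",
        "versionInfo": "2.0.3",
        "sourceInfo": "acquired package info from installed python package manifest file: /requirements.txt",
    },
]

by_pair = index_by_name_version(packages)
by_id = index_by_spdxid(packages)
print(len(by_pair), len(by_id))
```

With a name/version key only the last flask entry survives, while an SPDXID key keeps both, because the two entries differ only in their SPDXID and sourceInfo.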

rms-sth commented 2 months ago

@anthonyharrison Suppose we have two projects, Project A and Project B, and our final project is a combination of these two. Each project has a requirements.txt file that contains the same package, fastapi=0.11.2. When generating the Software Bill of Materials (SBOM) for our final project, we want to include both instances of the fastapi=0.11.2 package, along with their respective source information (e.g., A/requirements.txt and B/requirements.txt).

Here's an example of how we want the information to be represented in the SPDX SBOM format:

{
  "spdxVersion": "SPDX-2.3",
  "dataLicense": "CC0-1.0",
  "SPDXID": "SPDXRef-DOCUMENT",
  "documentNamespace": "https://example.com/spdx-document.json",
  "packages": [
    {
      "comment": "This package was obtained from the requirements.txt file.",
      "name": "fastapi",
      "versionInfo": "0.11.2",
      "sourceInfo": "A/requirements.txt"
    },
    {
      "comment": "This package was obtained from the requirements.txt file.",
      "name": "fastapi",
      "versionInfo": "0.11.2",
      "sourceInfo": "B/requirements.txt"
    }
  ]
}

And here's an example of how we want the information to be represented in the CycloneDX SBOM format:

{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79",
  "version": 1,
  "components": [
    {
      "type": "application",
      "name": "fastapi",
      "version": "0.11.2",
      "externalReferences": [
        {
          "type": "build-meta",
          "url": "file://A/requirements.txt"
        }
      ]
    },
    {
      "type": "application",
      "name": "fastapi",
      "version": "0.11.2",
      "externalReferences": [
        {
          "type": "build-meta",
          "url": "file://B/requirements.txt"
        }
      ]
    }
  ]
}

Our goal is to ensure that the SBOM accurately reflects the package dependencies from both Project A and Project B, even when they share common packages like fastapi=0.11.2. This will help in tracking the provenance of each package and maintaining transparency about the software components used in our final project.

anthonyharrison commented 2 months ago

@rms-sth Thanks for the explanation.

Each project is independent, so each should have its own SBOM. If you want to aggregate them into a single SBOM, you can either combine the two projects and generate a new SBOM that is the superset of both, or merge the two SBOMs. A number of tools claim to merge SBOMs (e.g. sbommerge).

However, the merged SBOM really represents a system (consisting of two projects), and the correct way to represent this is through another type of BOM, the OBOM (Operations Bill of Materials), which is only supported by CycloneDX (and not yet by Lib4sbom).
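As a rough illustration of the merge option mentioned above, the sketch below combines the package lists of two per-project SBOMs into one list, keeping a single entry per name/version pair while recording every manifest it came from. This is an assumption-laden sketch, not how sbommerge or lib4sbom behaves: `merge_packages` is a hypothetical helper, and the `sources` key is an illustrative field, not part of SPDX or CycloneDX.

```python
from typing import Dict, List, Tuple


def merge_packages(*package_lists: List[Dict[str, str]]) -> List[dict]:
    """Hypothetical helper: merge per-project package lists.

    Packages sharing a (name, version) pair become one entry whose
    illustrative "sources" list records each distinct sourceInfo value.
    """
    merged: Dict[Tuple[str, str], dict] = {}
    for packages in package_lists:
        for pkg in packages:
            key = (pkg["name"], pkg["versionInfo"])
            entry = merged.setdefault(
                key,
                {"name": pkg["name"], "versionInfo": pkg["versionInfo"], "sources": []},
            )
            source = pkg.get("sourceInfo")
            if source and source not in entry["sources"]:
                entry["sources"].append(source)
    return list(merged.values())


# Package lists as they might appear in the separate SBOMs of Project A and Project B.
project_a = [{"name": "fastapi", "versionInfo": "0.11.2", "sourceInfo": "A/requirements.txt"}]
project_b = [{"name": "fastapi", "versionInfo": "0.11.2", "sourceInfo": "B/requirements.txt"}]

merged = merge_packages(project_a, project_b)
print(merged)
```

The merged list contains one fastapi entry that still names both A/requirements.txt and B/requirements.txt, so the provenance is preserved without duplicating the component.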