CISA-SBOM-Community / SBOM-Generation

Reference GitHub Workflows for SBOM generation from the CISA SBOM Generation Reference Implementation Tiger Team
Apache License 2.0

Phase 1: Container Image with Python application #14

Open vpetersson opened 3 months ago

vpetersson commented 3 months ago

Tracker for Container Image with Python application.

Work to be carried out in https://github.com/CISA-SBOM-Community/SBOM-Generation/pull/4

Todo

Container SBOM Generation

Tools Evaluated

| Tool | Version | License | Vendor | Comment |
| --- | --- | --- | --- | --- |
| Syft | 1.10.0 | Apache 2.0 | Anchore | Open Source but commercially backed |
| Trivy | 0.54.0 | Apache 2.0 | Aqua Security | Open Source but commercially backed |

Per the discussion in the Tiger Team, both tools meet the qualification criteria (citation needed).

Before we generate the result, we first need to build the Docker image:

$ docker build . -t phase-1-python

Result

| Tool | Format | Download | Command | Comment |
| --- | --- | --- | --- | --- |
| Syft | CycloneDX | SBOM | `syft phase-1-python -o cyclonedx-json \| jq > syft_container-sbom_cyclonedx.json` | Piped through jq for improved readability |
| Syft | SPDX | SBOM | `syft phase-1-python -o spdx-json \| jq > syft_container-sbom_spdx.json` | Piped through jq for improved readability |
| Trivy | CycloneDX | SBOM | `trivy image --format cyclonedx --output trivy_container-sbom_cyclonedx.json phase-1-python` | |
| Trivy | SPDX | SBOM | `trivy image --format spdx-json --output trivy_container-sbom_spdx.json phase-1-python` | |

$ du -hs ./*
832K    ./syft_container-sbom_cyclonedx.json
4.1M    ./syft_container-sbom_spdx.json
272K    ./trivy_container-sbom_cyclonedx.json
272K    ./trivy_container-sbom_spdx.json
$ wc -l ./*.json
   24671 ./syft_container-sbom_cyclonedx.json
  106267 ./syft_container-sbom_spdx.json
   10431 ./trivy_container-sbom_cyclonedx.json
    7359 ./trivy_container-sbom_spdx.json
$ for i in *cyclonedx*.json; do echo $i && cat $i | jq '.components | length'; done
syft_container-sbom_cyclonedx.json
192
trivy_container-sbom_cyclonedx.json
180
$ for i in *spdx*.json; do echo $i && cat $i | jq '.packages | length'; done
syft_container-sbom_spdx.json
192
trivy_container-sbom_spdx.json
181

We need to dive deeper into the quality of these SBOMs, but based on the amount of data picked up (measured by LOC), Syft appears to pick up a lot more: 24,671 vs. 10,431 lines for CycloneDX, even though the component counts are close (192 vs. 180).
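Line counts depend on formatting and on how much metadata each entry carries, so counting entries is a more direct comparison. The following is a minimal Python sketch of the same count the `jq` loops above perform, assuming CycloneDX JSON carries a top-level `bomFormat` key and SPDX JSON a top-level `spdxVersion` key. The function names are hypothetical and not part of any of the tools evaluated here.

```python
import json


def count_entries(sbom):
    """Return the number of components (CycloneDX) or packages (SPDX)."""
    if sbom.get("bomFormat") == "CycloneDX":
        return len(sbom.get("components", []))
    if "spdxVersion" in sbom:
        return len(sbom.get("packages", []))
    raise ValueError("unrecognized SBOM format")


def count_entries_in_file(path):
    """Load a JSON SBOM from disk and count its entries."""
    with open(path, encoding="utf-8") as handle:
        return count_entries(json.load(handle))
```

For example, `count_entries_in_file("syft_container-sbom_cyclonedx.json")` would correspond to `jq '.components | length'` on the same file.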

Some rudimentary automated assessment using sbomdiff (0.5.4) yielded the following:

Application (requirements.txt) SBOM Generation

Note: We are only looking at Build and Source SBOMs (reference), so any tool that cannot take a requirements.txt file (or similar) is disqualified from the selection process.

Tools Evaluated

| Tool | Version | License | Vendor | Comment |
| --- | --- | --- | --- | --- |
| Syft | 1.10.0 | Apache 2.0 | Anchore | Open Source but commercially backed |
| Trivy | 0.54.0 | Apache 2.0 | Aqua Security | Open Source but commercially backed |
| cyclonedx-python | 4.5.0 | Apache 2.0 | CycloneDX | Limited to CycloneDX |
| sbom4python | 0.11.0 | Apache 2.0 | Individual | |

| Tool | Format | Download | Command | Comment |
| --- | --- | --- | --- | --- |
| Syft | CycloneDX | SBOM | `syft requirements.txt -o cyclonedx-json \| jq > syft_application-sbom_cyclonedx.json` | Piped through jq for improved readability |
| Syft | SPDX | SBOM | `syft requirements.txt -o spdx-json \| jq > syft_application-sbom_spdx.json` | Piped through jq for improved readability |
| Trivy | CycloneDX | SBOM | `trivy fs --format cyclonedx --output trivy_application-sbom_cyclonedx.json requirements.txt` | |
| Trivy | SPDX | SBOM | `trivy fs --format spdx-json --output trivy_application-sbom_spdx.json requirements.txt` | Note that --format needs to be spdx-json to be JSON, whereas cyclonedx generates JSON |
| cyclonedx-python | CycloneDX | SBOM | `cyclonedx-py requirements requirements.txt > cyclonedx-python_application-sbom_cyclonedx.json` | |
| sbom4python | SPDX | SBOM | `sbom4python -r requirements.txt --sbom spdx --format json -o sbom4python_application-sbom_spdx.json` | |
| sbom4python | CycloneDX | SBOM | `sbom4python -r requirements.txt --sbom cyclonedx --format json -o sbom4python_application-sbom_cyclonedx.json` | |

$ du -hs ./*
8.0K    ./cyclonedx-python_application-sbom_cyclonedx.json
8.0K    ./syft_application-sbom_cyclonedx.json
 12K    ./syft_application-sbom_spdx.json
4.0K    ./trivy_application-sbom_cyclonedx.json
4.0K    ./trivy_application-sbom_spdx.json
8.0K    sbom4python_application-sbom_cyclonedx.json
4.0K    sbom4python_application-sbom_spdx.json
$ wc -l ./*.json
     147 ./cyclonedx-python_application-sbom_cyclonedx.json
     206 ./syft_application-sbom_cyclonedx.json
     273 ./syft_application-sbom_spdx.json
     137 ./trivy_application-sbom_cyclonedx.json
     129 ./trivy_application-sbom_spdx.json
     176 ./sbom4python_application-sbom_cyclonedx.json
     119 ./sbom4python_application-sbom_spdx.json
$ for i in *cyclonedx*.json; do echo $i && cat $i | jq '.components | length'; done
cyclonedx-python_application-sbom_cyclonedx.json
3
syft_application-sbom_cyclonedx.json
3
trivy_application-sbom_cyclonedx.json
4
sbom4python_application-sbom_cyclonedx.json
3
$ for i in *spdx*.json; do echo $i && cat $i | jq '.packages | length'; done
syft_application-sbom_spdx.json
4
trivy_application-sbom_spdx.json
5
sbom4python_application-sbom_spdx.json
3

Merging Tool

Tools Evaluated

| Tool | Version | License | Vendor | Comment |
| --- | --- | --- | --- | --- |
| sbommerge | 0.2.0 | Apache 2.0 | Individual | |
| Hoppr | TBD | MIT | Lockheed Martin Corporation | Open Source but commercially backed |

Annotation tool

TODO

Conformance Check

We want all our SBOMs to meet the National Telecommunications and Information Administration (NTIA) minimum elements, and to have this checked automatically in the CI/CD pipeline.
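To illustrate what such an automated check involves, here is a rough Python sketch for a CycloneDX JSON document. The NTIA data fields are Supplier Name, Component Name, Version of the Component, Other Unique Identifiers, Dependency Relationship, Author of SBOM Data, and Timestamp. The mapping of those fields onto CycloneDX keys below is a simplifying assumption for illustration, and the function name is hypothetical; it is not how sbomaudit or the NTIA Conformance Checker are implemented.

```python
# Assumed mapping of per-component NTIA fields to CycloneDX component keys.
REQUIRED_COMPONENT_FIELDS = ("name", "version", "supplier")


def missing_minimum_elements(bom):
    """Return a list of human-readable gaps; an empty list means none found."""
    problems = []
    metadata = bom.get("metadata", {})

    # Document-level fields: Timestamp and Author of SBOM Data.
    if not metadata.get("timestamp"):
        problems.append("metadata.timestamp (Timestamp)")
    if not (metadata.get("authors") or metadata.get("tools")):
        problems.append("metadata.authors/tools (Author of SBOM Data)")

    # Dependency Relationship: only checks that the section is present.
    if not bom.get("dependencies"):
        problems.append("dependencies (Dependency Relationship)")

    # Per-component fields: supplier, name, version, unique identifier.
    for index, component in enumerate(bom.get("components", [])):
        label = component.get("name") or f"components[{index}]"
        for field in REQUIRED_COMPONENT_FIELDS:
            if not component.get(field):
                problems.append(f"{label}: {field}")
        if not (component.get("purl") or component.get("cpe")):
            problems.append(f"{label}: purl/cpe (Other Unique Identifiers)")
    return problems
```

In a pipeline, a step could load each generated SBOM with `json.load`, call this function, and fail the job when the returned list is non-empty.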

| Tool | Version | License | Vendor | Comment |
| --- | --- | --- | --- | --- |
| sbomaudit | TBD | Apache 2.0 | Individual | |
| NTIA Conformance Checker | TBD | Apache 2.0 | | |
vpetersson commented 3 months ago

@CISA-SBOM-Community/sbom-generation-tiger-team I've posted some benchmarks for container and application generation. Please review and let me know your thoughts. The qualification methodology is rather primitive but should hopefully at least help guide us.

vpetersson commented 3 months ago

Quantitative Analysis

Container

Summary

| Tool | Format | Packages | Unique Packages | Duplication % |
| --- | --- | --- | --- | --- |
| Syft | CycloneDX | 192 | 172 | 10.42% |
| Trivy | CycloneDX | 180 | 178 | 1.11% |
| Syft | SPDX | 192 | 173 | 9.90% |
| Trivy | SPDX | 181 | 172 | 4.97% |

Syft

$ jq '.components[] | .name' syft_container-sbom_cyclonedx.json | wc -l
     192

$ jq '.components[] | .name' syft_container-sbom_cyclonedx.json | uniq | wc -l
     173
$ jq '.packages[] | .name' syft_container-sbom_spdx.json | wc -l
     192

$ jq '.packages[] | .name' syft_container-sbom_spdx.json | uniq | wc -l
     173

Trivy

$ jq '.components[] | .name' trivy_container-sbom_cyclonedx.json | wc -l
     180

$ jq '.components[] | .name' trivy_container-sbom_cyclonedx.json | uniq | wc -l
     178
$ jq '.packages[] | .name' trivy_container-sbom_spdx.json | wc -l
     181

$ jq '.packages[] | .name' trivy_container-sbom_spdx.json | uniq | wc -l
     172

We can conclude that both tools generated duplicate entries.
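One caveat on the commands above: `uniq` only collapses adjacent repeated lines, so the unique counts depend on the order in which each tool emits its entries. A minimal Python sketch of an order-independent version of the same calculation follows, where duplication % is (total - unique) / total * 100. The function names are hypothetical. Like the `jq` commands, it compares names only, so the same package appearing with two different versions is counted as a duplicate.

```python
def entry_names(sbom):
    """Collect entry names from a CycloneDX ('components') or SPDX ('packages') dict."""
    entries = sbom.get("components") or sbom.get("packages") or []
    return [entry.get("name") for entry in entries]


def duplication_stats(names):
    """Return (total, unique, duplication percentage rounded to 2 decimals)."""
    total = len(names)
    unique = len(set(names))  # a set ignores ordering, unlike `uniq`
    pct = 0.0 if total == 0 else (total - unique) / total * 100
    return total, unique, round(pct, 2)
```

Applied to a parsed SBOM, `duplication_stats(entry_names(sbom))` yields the three columns used in the summary table.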

Application

Summary

| Tool | Format | Packages | Unique Packages | Duplication % |
| --- | --- | --- | --- | --- |
| cyclonedx-python | CycloneDX | 3 | 3 | 0% |
| sbom4python | CycloneDX | 3 | 3 | 0% |
| Syft | CycloneDX | 3 | 3 | 0% |
| Trivy | CycloneDX | 4 | 4 | 0% |
| sbom4python | SPDX | 3 | 3 | 0% |
| Syft | SPDX | 4 | 4 | 0% |
| Trivy | SPDX | 5 | 5 | 0% |

$ for i in *cyclonedx*; do echo $i; jq '.components[] | .name' $i | wc -l; jq '.components[] | .name' $i | uniq | wc -l; done

cyclonedx-python_application-sbom_cyclonedx.json
       3
       3
sbom4python_application-sbom_cyclonedx.json
       3
       3
syft_application-sbom_cyclonedx.json
       3
       3
trivy_application-sbom_cyclonedx.json
       4
       4
$ for i in *spdx*; do echo $i; jq '.packages[] | .name' $i | wc -l; jq '.packages[] | .name' $i | uniq | wc -l; done

sbom4python_application-sbom_spdx.json
       3
       3
syft_application-sbom_spdx.json
       4
       4
trivy_application-sbom_spdx.json
       5
       5

Verdict

Much like the findings in https://github.com/CISA-SBOM-Community/SBOM-Generation/pull/15, Trivy appears to generate fewer duplicate packages. The findings for the Application SBOM are less interesting.

The objective here isn't necessarily to do a deep dive into the output of these tools. The view is that they should be a good starting point and remain replaceable in the pipeline as they evolve.

With this in mind, I will proceed with trivy.

vpetersson commented 3 months ago

@CISA-SBOM-Community/sbom-generation-tiger-team I've implemented the first few steps. Please review the data in the comment above. If anyone has any arguments for why we should not use Trivy for both the container and application steps, please let me know. If not, I'm going to consider that part of Phase 1 resolved.

For the next step, we need to find a tool for creating a hierarchy SBOM that includes both the container and application SBOMs. Note that "merging" these into a single SBOM isn't possible as they are of two different SBOM types (container and application). Thus a tool for creating a hierarchy is needed.

Tools that will be evaluated include bomctl and bomasm. Neither tool can do this in its current version, but future versions promise this ability. Thus Phase 1 is somewhat blocked by this.
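For the sake of discussion, here is a rough Python sketch of what a hierarchy could look like in CycloneDX: a parent BOM whose components each point at a separate SBOM file through an external reference of type `bom`. This is only one possible shape, not how bomctl or bomasm work and not something the Tiger Team has agreed on. The function name, the component names, and the `0.1.0` version are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone


def build_parent_bom(name, version, child_sboms):
    """Build a parent CycloneDX BOM.

    child_sboms is a list of (component_name, component_type, sbom_url) tuples.
    """
    components = []
    for child_name, child_type, sbom_url in child_sboms:
        components.append({
            "type": child_type,
            "name": child_name,
            "bom-ref": child_name,
            # Link to the child's own SBOM instead of inlining its contents.
            "externalReferences": [{"type": "bom", "url": sbom_url}],
        })
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "serialNumber": f"urn:uuid:{uuid.uuid4()}",
        "version": 1,
        "metadata": {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "component": {
                "type": "application",
                "name": name,
                "version": version,
                "bom-ref": name,
            },
        },
        "components": components,
        # The top-level component depends on each child component.
        "dependencies": [
            {"ref": name, "dependsOn": [c["bom-ref"] for c in components]}
        ],
    }


parent = build_parent_bom(
    "phase-1-python",
    "0.1.0",  # hypothetical version
    [
        ("container-image", "container", "trivy_container-sbom_cyclonedx.json"),
        ("python-app", "application", "trivy_application-sbom_cyclonedx.json"),
    ],
)
print(json.dumps(parent, indent=2))
```

Keeping the container and application SBOMs as separate files and linking them preserves their distinct SBOM types, which is the reason a plain merge was ruled out above.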

djmoch commented 3 months ago

I'm good with moving forward with Trivy.