CycloneDX / cdxgen

Creates CycloneDX Bill of Materials (BOM) for your projects from source and container images. Supports many languages and package managers. Integrate in your CI/CD pipeline with automatic submission to Dependency Track server. Discord: https://discord.gg/DP657ACYEZ
https://cyclonedx.github.io/cdxgen/
Apache License 2.0

SBOM generated for requirements-dev.txt even with `--required-only` flag #609

Open marcosanchotene opened 1 year ago

marcosanchotene commented 1 year ago

I copied the requirements-dev.txt from the pandas repository into an empty directory and ran cdxgen there with the --required-only flag. It generated a bom.json file with 250 components. Is that expected? I thought the file would be skipped in the first place because its name contains the -dev part, and that with the --required-only flag cdxgen would not look into dev components at all. Am I missing something?
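For anyone reproducing the count, here is a minimal sketch that tallies the components in a generated bom.json by package URL type. It assumes the file is CycloneDX JSON with a top-level `components` array whose entries may carry a `purl` field; the helper names `purl_type` and `summarize_bom` are made up for illustration and are not part of cdxgen.

```python
import json
from collections import Counter
from pathlib import Path


def purl_type(purl: str) -> str:
    """Return the type segment of a package URL, e.g. 'pypi' for 'pkg:pypi/numpy@1.26.0'."""
    if not purl.startswith("pkg:"):
        return "unknown"
    # The type is everything between 'pkg:' and the first '/'.
    return purl[4:].split("/", 1)[0]


def summarize_bom(bom: dict) -> Counter:
    """Count the components of a CycloneDX BOM dict, grouped by purl type."""
    counts = Counter()
    for component in bom.get("components", []):
        counts[purl_type(component.get("purl") or "")] += 1
    return counts


if __name__ == "__main__":
    bom_path = Path("bom.json")  # assumed output file name in the current directory
    if bom_path.exists():
        counts = summarize_bom(json.loads(bom_path.read_text()))
        print(f"{sum(counts.values())} components: {dict(counts)}")
    else:
        print("bom.json not found in the current directory")
```

Running it in the directory that holds bom.json prints the total and a per-type breakdown, which makes it easier to see whether the Python packages are the ones that were not filtered out.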

prabhu commented 1 year ago

@marcosanchotene, this is a difficult test case to solve without access to the source code. If you repeat this experiment on a legitimate repo with both source and requirements files, cdxgen would analyze the project using atom and correctly exclude optional dependencies.

marcosanchotene commented 1 year ago

That repository is public.

My first attempt was on the cloned repository, but I got 263 components. I tried to isolate that file to check if it was being scanned, and apparently it is.

prabhu commented 1 year ago

Ah ok. Let me take a look. Was Java>=17 installed? Are you seeing any errors when you set CDXGEN_DEBUG_MODE=debug?
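A quick way to check the Java requirement from a script is sketched below. It only parses the banner printed by `java -version` and compares the major version against 17; the function name `parse_java_major` is a hypothetical helper, not something cdxgen or atom provides.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output, or None if absent."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Releases before Java 9 report themselves as 1.x (e.g. 1.8 is Java 8).
    if first == "1" and second is not None:
        return int(second)
    return int(first)


if __name__ == "__main__":
    if shutil.which("java") is None:
        print("java not found on PATH")
    else:
        # `java -version` writes its banner to stderr, not stdout.
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
        major = parse_java_major(result.stderr)
        ok = major is not None and major >= 17
        print(f"Java major version: {major}; meets >=17: {ok}")
```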

marcosanchotene commented 1 year ago

Yes, I have Java 21:

$ java -version
openjdk version "21" 2023-09-19
OpenJDK Runtime Environment (build 21+35-2513)
OpenJDK 64-Bit Server VM (build 21+35-2513, mixed mode, sharing)

Well, just the "Slicing was not successful" error:

$ cdxgen
Scanning .
Performing babel-based package usage analysis with source code at .
Using virtual environment in /tmp/cdxgen-venv-0qp315
About to construct the pip dependency tree. Please wait ...
Using virtual environment in /tmp/cdxgen-venv-0qp315
About to construct the pip dependency tree. Please wait ...
Executing atom parsedeps -l python -o /tmp/atom-deps-ThAbSS/app.atom --slice-outfile /tmp/atom-deps-ThAbSS/slices.json /python/pandas/repo
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.
Found 251 python packages at .
Found 0 csharp packages at .
Parsing /python/pandas/repo/meson.build
Parsing /python/pandas/repo/pandas/meson.build
Parsing /python/pandas/repo/pandas/_libs/meson.build
Parsing /python/pandas/repo/pandas/_libs/window/meson.build
Parsing /python/pandas/repo/pandas/_libs/tslibs/meson.build
Executing /usr/local/lib/node_modules/@cyclonedx/cdxgen/node_modules/@cyclonedx/cdxgen-plugins-bin/plugins/osquery/osqueryi-linux-amd64 --json select * from deb_packages where name like '%dev%' OR name like '%header%';
Executing /usr/local/lib/node_modules/@cyclonedx/cdxgen/node_modules/@cyclonedx/cdxgen-plugins-bin/plugins/osquery/osqueryi-linux-amd64 --json select * from portage_packages where name like '%dev%' OR name like '%header%';
Executing /usr/local/lib/node_modules/@cyclonedx/cdxgen/node_modules/@cyclonedx/cdxgen-plugins-bin/plugins/osquery/osqueryi-linux-amd64 --json select * from rpm_packages where name like '%dev%' OR name like '%header%';
Executing atom usages -l h -o /tmp/atom-deps-X9pmVv/app.atom --slice-outfile /python/pandas/repo/repo-usages.json /python/pandas/repo
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.
Found 0 cpp packages at .
Parsing /python/pandas/repo/.github/workflows/wheels.yml
Parsing /python/pandas/repo/.github/workflows/unit-tests.yml
Parsing /python/pandas/repo/.github/workflows/stale-pr.yml
Parsing /python/pandas/repo/.github/workflows/package-checks.yml
Parsing /python/pandas/repo/.github/workflows/docbuild-and-upload.yml
Parsing /python/pandas/repo/.github/workflows/deprecation-tracking-bot.yml
Parsing /python/pandas/repo/.github/workflows/comment-commands.yml
Parsing /python/pandas/repo/.github/workflows/codeql.yml
Parsing /python/pandas/repo/.github/workflows/code-checks.yml
Parsing /python/pandas/repo/.github/workflows/cache-cleanup.yml
Parsing /python/pandas/repo/.github/workflows/cache-cleanup-weekly.yml
Found 12 GitHub action packages at .
BoM includes 263 components and 252 dependencies after dedupe
===== WARNINGS =====
[ 'Version is missing for metadata.component' ]

prabhu commented 1 year ago

Thank you. Yup, that's the error that is preventing the usage analysis from working correctly. Will take a look.

marcosanchotene commented 1 year ago

Ok. Many thanks!

prabhu commented 1 year ago

@marcosanchotene, could you confirm the RAM specification? We need a minimum of 8GB RAM for atom. It works fine locally for me.
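The 8GB figure can be checked programmatically with a sketch like the one below. It assumes a Linux host where `os.sysconf` exposes `SC_PAGE_SIZE` and `SC_PHYS_PAGES`; the helper names and the strict GiB comparison are illustrative choices, and a machine sold as "8GB" often reports slightly less than 8 GiB of usable memory.

```python
import os

GIB = 1024 ** 3


def total_ram_bytes() -> int:
    """Total physical memory in bytes (relies on Linux sysconf names)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_enough_ram(total_bytes: int, minimum_gib: float = 8.0) -> bool:
    """True if total_bytes is at least minimum_gib gibibytes."""
    return total_bytes >= minimum_gib * GIB


if __name__ == "__main__":
    total = total_ram_bytes()
    print(f"Total RAM: {total / GIB:.1f} GiB; at least 8 GiB: {has_enough_ram(total)}")
```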

marcosanchotene commented 1 year ago

marco@sol:/python/pandas/repo$ free -h
               total        used        free      shared  buff/cache   available
Mem:            15Gi       1,7Gi       8,7Gi       527Mi       4,9Gi        12Gi
Swap:           14Gi       530Mi        14Gi
marco@sol:/python/pandas/repo$ cdxgen --required-only
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.

prabhu commented 1 year ago

@marcosanchotene, thank you. Are you running this from within a container image? And which Linux distro are you using?

marcosanchotene commented 1 year ago

Do you mean an invocation of Atom? I'm not using a container image.

prabhu commented 1 year ago

Does it work for a smaller repo? Can you try generating a Python SBOM for the repo below?

https://github.com/owasp-dep-scan/dep-scan

prabhu commented 1 year ago

And could you also try the following:

export JAVA_OPTS="-Xmx8G"
export CDXGEN_DEBUG_MODE=debug
cdxgen -t python --required-only

marcosanchotene commented 1 year ago

/python/dep-scan/repo$ cdxgen
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.
/python/pandas/repo$ cdxgen -t python --required-only
Using virtual environment in /tmp/cdxgen-venv-VGz3Zs
About to construct the pip dependency tree. Please wait ...
Using virtual environment in /tmp/cdxgen-venv-VGz3Zs
About to construct the pip dependency tree. Please wait ...
Executing atom parsedeps -l python -o /tmp/atom-deps-rRYBp8/app.atom --slice-outfile /tmp/atom-deps-rRYBp8/slices.json /python/pandas/repo
Slicing was not successful. For large projects (> 1 million lines of code), try running atom cli externally in Java mode. Please refer to the instructions in https://github.com/CycloneDX/cdxgen/blob/master/ADVANCED.md.
NOTE: Atom is in detached mode and will continue to run in the background with max CPU and memory unless it's killed.
===== WARNINGS =====
[ 'Version is missing for metadata.component' ]

prabhu commented 1 year ago

Which Linux distro are you using?

prabhu commented 1 year ago

I would also check simple things, like whether the /tmp directory is writable. You can run atom in Java mode by following the instructions here:

https://cyclonedx.github.io/cdxgen/#/ADVANCED?id=use-atom-in-java-mode

If none of these work, try a different machine, or we need a troubleshooting session on zoom.
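The /tmp check mentioned above can be sketched as follows. The script tries to create a throwaway file in the system temp directory; `is_writable` is a hypothetical helper name, and the check says nothing about free disk space or about where atom itself writes beyond the /tmp paths shown in the logs.

```python
import tempfile


def is_writable(directory: str) -> bool:
    """Return True if a temporary file can be created and written in `directory`."""
    try:
        # The file is deleted automatically when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"probe")
        return True
    except OSError:
        # Covers missing directories, permission errors, and read-only filesystems.
        return False


if __name__ == "__main__":
    tmp_dir = tempfile.gettempdir()
    print(f"{tmp_dir} writable: {is_writable(tmp_dir)}")
```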

prabhu commented 1 year ago

@marcosanchotene Please retest with 9.8.7 in a while.

prabhu commented 1 year ago

@marcosanchotene any luck with the new version?

marcosanchotene commented 1 year ago

The bom.json file contains 251 components when generated with the command cdxgen --required-only using version 9.8.7.

prabhu commented 1 year ago

> The bom.json file contains 251 components when generated with the command cdxgen --required-only using version 9.8.7.

Could you try with the latest? There are also other kinds of filtering available.

https://cyclonedx.github.io/cdxgen/#/ADVANCED?id=filtering-components
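As a way to inspect the result independently of cdxgen's own filters, the sketch below splits the components of an existing bom.json by their CycloneDX `scope` field. It assumes the generator populated `scope` where it could, and it follows the CycloneDX convention that a component without a scope is treated as required. The function `split_by_scope` is illustrative only and does not reproduce how cdxgen implements --required-only or its filtering options.

```python
import json
from pathlib import Path
from typing import List, Tuple


def split_by_scope(bom: dict) -> Tuple[List[str], List[str]]:
    """Return (required_names, other_names) for the components of a CycloneDX BOM dict."""
    required, other = [], []
    for component in bom.get("components", []):
        # A missing scope is treated as "required", per the CycloneDX convention.
        scope = component.get("scope", "required")
        name = component.get("name", "<unnamed>")
        if scope == "required":
            required.append(name)
        else:
            other.append(name)
    return required, other


if __name__ == "__main__":
    bom_path = Path("bom.json")  # assumed output file name in the current directory
    if bom_path.exists():
        required, other = split_by_scope(json.loads(bom_path.read_text()))
        print(f"{len(required)} required, {len(other)} optional or excluded")
    else:
        print("bom.json not found in the current directory")
```

If every component comes back as required, that suggests the usage analysis never marked anything optional, which would be consistent with the slicing failure reported earlier in the thread.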