anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Python .whl files not detected on directory scan #2687

Open ariker opened 7 months ago

ariker commented 7 months ago

What happened: When scanning a directory that contains both .rpm and .whl files, the catalogers reported the .rpm files but did not report the .whl files.

What you expected to happen: Syft should have reported both the .rpm and .whl files.

Steps to reproduce the issue: Download one or more .whl files into a new directory, then run the scan command from that directory: syft scan dir:.

Anything else we need to know?: I also attempted to force the python-package-cataloger to be active with: syft scan dir:. --select-catalogers +python-package-cataloger

I get the same result specifying the absolute or relative path to the directory.

Environment:
Application: syft
Version: 1.0.0
BuildDate: 2024-02-29T14:50:55Z
GitCommit: 356f7c92b464b69be3a2a898cd98a63037eeadcc
GitDescription: v1.0.0
Platform: windows/amd64
GoVersion: go1.21.7
Compiler: gc

tgerla commented 7 months ago

Hi @ariker, the python-package-cataloger actually looks for unpacked wheels and other Python packages, not for archives on the filesystem. The RPM file is picked up because we have a separate cataloger for RPMs called rpm-archive-cataloger. We would have to implement a separate Python archive cataloger that picks up wheels (and probably eggs). I will move this to a feature request and put it on our backlog for consideration. If you're interested in working on it, let us know and we can point you in the right direction!
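
For anyone who needs wheel archives listed in the meantime, here is a rough sketch of what reading a .whl directly might look like. It is not Syft code and does not reflect how a future cataloger would be written. It only relies on the wheel format itself: a .whl is a zip archive containing a top-level `<name>-<version>.dist-info/METADATA` file with email-style headers. The function names `read_wheel_metadata` and `find_wheels` are made up for this illustration.

```python
import zipfile
from email.parser import Parser
from pathlib import Path


def read_wheel_metadata(wheel_path):
    """Return (name, version) from a wheel's *.dist-info/METADATA, or None.

    Hypothetical helper for illustration; not part of Syft.
    """
    with zipfile.ZipFile(wheel_path) as archive:
        for member in archive.namelist():
            parts = member.split("/")
            # The metadata file lives at <dist>.dist-info/METADATA in the archive root.
            if len(parts) == 2 and parts[0].endswith(".dist-info") and parts[1] == "METADATA":
                text = archive.read(member).decode("utf-8")
                headers = Parser().parsestr(text, headersonly=True)
                return headers["Name"], headers["Version"]
    return None


def find_wheels(root):
    """Map each .whl file name under root to its (name, version) pair."""
    found = {}
    for wheel in sorted(Path(root).rglob("*.whl")):
        meta = read_wheel_metadata(wheel)
        if meta is not None:
            found[wheel.name] = meta
    return found
```

Calling `find_wheels(".")` in the directory from the report would return one entry per readable wheel. The result is a plain dictionary, not an SBOM, so it would still need to be merged with Syft's output by hand.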