anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0
6.02k stars 555 forks

SBOM generated from poetry lock file contains no license information on any dependencies #3204

Open nfelt14 opened 1 week ago

nfelt14 commented 1 week ago

What happened:

I am unable to generate an SBOM that contains license information on dependencies from a poetry lock file.

What you expected to happen: I would expect an SBOM to contain license information.

Steps to reproduce the issue:

  1. Use poetry to generate a lock file.
  2. Perform a scan on the file.
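One way to confirm the symptom is to count the components in the generated SBOM that have no license entries. The sketch below assumes the SBOM was exported as CycloneDX JSON, where each item under `components` may carry an optional `licenses` array. The function name and the sample document are hypothetical, hand-written stand-ins, not syft output.

```python
import json


def components_missing_licenses(sbom: dict) -> list[str]:
    """Return the names of CycloneDX components that have no license entries."""
    missing = []
    for component in sbom.get("components", []):
        # A missing key and an empty list are both treated as "no license".
        if not component.get("licenses"):
            missing.append(component.get("name", "<unnamed>"))
    return missing


# Tiny hand-written stand-in for a real SBOM (illustrative data only).
sample_sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"type": "library", "name": "identify", "version": "2.6.0"},
    {"type": "library", "name": "example-pkg", "version": "1.0.0",
     "licenses": [{"license": {"id": "MIT"}}]}
  ]
}
""")

print(components_missing_licenses(sample_sbom))
```

For a real file, load it with `json.load(open(path))` and pass the result to the same function.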

Anything else we need to know?:

Environment:

nfelt14 commented 1 week ago

There is also no license information for any of the GitHub Actions that are used in the repo.

spiffcs commented 1 week ago

Thanks @nfelt14 for the issue! I didn't know poetry.lock allowed for a field that contained license metadata. I tried searching for the specific documentation outlining the poetry.lock specification and only found this: https://github.com/orgs/python-poetry/discussions/6763

Do you have an example project with a lot of licenses we could use as a basis for development?

The only example I could find in our org has a single license entry, ukkonen:

[[package]]
name = "identify"
version = "2.6.0"
description = "File identification library for Python"
optional = false
python-versions = ">=3.8"
files = [
    {file = "identify-2.6.0-py2.py3-none-any.whl", hash = "sha256:e79ae4406387a9d300332b5fd366d8994f1525e8414984e1a59e058b2eda2dd0"},
    {file = "identify-2.6.0.tar.gz", hash = "sha256:cb171c685bdc31bcc4c1734698736a7d5b6c8bf2e0c15117f4d469c8640ae5cf"},
]

[package.extras]
license = ["ukkonen"]

Is this license key under extras the correct field, or is it an optional extra that requires the package ukkonen?

I'm unclear on which field we should be grabbing to associate a license with the [[package]].

I also noticed here that the license for identify is MIT: https://github.com/pre-commit/identify/blob/main/setup.cfg

This does NOT show up in our poetry.lock when consuming this package as you can see above. Is there an extra option when running poetry that would populate the field?

nfelt14 commented 1 week ago

We are trying to generate SBOMs for this repo: https://github.com/tektronix/tm_devices

The workflow is here: https://github.com/tektronix/tm_devices/actions/workflows/sbom-scan.yml

After I spent more time looking into it, it may be a lack of information that poetry provides, so I don't know if there is much that can be done on this side.

spiffcs commented 1 week ago

No worries! This looks like a good candidate for https://github.com/anchore/syft/issues/1115

nfelt14 commented 1 week ago

> No worries! This looks like a good candidate for #1115
>
>   • If there is a URL we can use from the poetry lock, we can probably enhance the cataloger to fill in this license information, provided you're running syft in an environment where outbound network connections are acceptable for finding more information about what is being cataloged into the SBOM.

That would work great!
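A rough sketch of what such a network lookup could look like, written in Python for brevity. It is an assumption for illustration, not how syft implements or plans to implement this. It queries the public PyPI JSON API (`https://pypi.org/pypi/<name>/<version>/json`) using the name and version from the lock file, then reads `info.license` and falls back to the `License ::` trove classifiers. All function names are hypothetical, and the fetcher is injectable so the logic can be exercised without network access.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen


def fetch_pypi_metadata(name: str, version: str) -> dict:
    """Fetch release metadata from the PyPI JSON API (requires network access)."""
    url = f"https://pypi.org/pypi/{name}/{version}/json"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def license_from_metadata(metadata: dict) -> Optional[str]:
    """Pick a license string from PyPI-style metadata, or None if absent."""
    info = metadata.get("info") or {}
    declared = (info.get("license") or "").strip()
    if declared:
        return declared
    # Fall back to trove classifiers such as
    # "License :: OSI Approved :: MIT License".
    for classifier in info.get("classifiers") or []:
        if classifier.startswith("License ::"):
            return classifier.split("::")[-1].strip()
    return None


def lookup_license(
    name: str,
    version: str,
    fetch: Callable[[str, str], dict] = fetch_pypi_metadata,
) -> Optional[str]:
    """Return a license for name/version, or None when the lookup fails."""
    try:
        return license_from_metadata(fetch(name, version))
    except OSError:  # urllib's URLError and HTTPError are OSError subclasses
        return None


# Offline demonstration with a fake fetcher (illustrative data, not a real response).
def fake_fetch(name: str, version: str) -> dict:
    return {
        "info": {
            "license": "",
            "classifiers": ["License :: OSI Approved :: MIT License"],
        }
    }


print(lookup_license("identify", "2.6.0", fetch=fake_fetch))
```

Keeping the lookup behind an explicit opt-in would preserve the current behavior for environments where outbound connections are not allowed.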