Open ncoghlan opened 2 weeks ago
Thanks for opening this issue! This is a good idea, I can definitely put together some recommendations.
Requested by @sethmlarson.
For SBOMs to be easily adopted into current workflows, the libraries they require should be lightweight and, if possible, free of dependencies.
My recent clash with supply-chain/signing libraries came when I investigated the possibility of integrating PEP 740 attestation generation into Poetry. Adding only the pypi-attestations library pulls in the following dependency tree:
pypi-attestations 0.0.13
├── cryptography *
│   └── cffi >=1.12
│       └── pycparser *
├── packaging *
├── pyasn1 >=0.6,<1.0
├── pydantic *
│   ├── annotated-types >=0.6.0
│   ├── email-validator >=2.0.0
│   │   ├── dnspython >=2.0.0
│   │   └── idna >=2.0.0
│   ├── pydantic-core 2.23.4
│   │   └── typing-extensions >=4.6.0,<4.7.0 || >4.7.0
│   └── typing-extensions >=4.6.1
├── sigstore >=3.4,<4.0
│   ├── cryptography >=42
│   │   └── cffi >=1.12
│   │       └── pycparser *
│   ├── id >=1.1.0
│   │   ├── pydantic *
│   │   │   ├── annotated-types >=0.6.0
│   │   │   ├── email-validator >=2.0.0
│   │   │   │   ├── dnspython >=2.0.0
│   │   │   │   └── idna >=2.0.0
│   │   │   ├── pydantic-core 2.23.4
│   │   │   │   └── typing-extensions >=4.6.0,<4.7.0 || >4.7.0
│   │   │   └── typing-extensions >=4.12.2
│   │   └── requests *
│   │       ├── certifi >=2017.4.17
│   │       ├── charset-normalizer >=2,<4
│   │       ├── idna >=2.5,<4
│   │       └── urllib3 >=1.21.1,<3
│   ├── importlib-resources >=5.7,<6.0
│   ├── platformdirs >=4.2,<5.0
│   ├── pyasn1 >=0.6,<1.0
│   ├── pydantic >=2,<3
│   ├── pyjwt >=2.1
│   ├── pyopenssl >=23.0.0
│   │   └── cryptography >=41.0.5,<44
│   ├── requests *
│   ├── rfc8785 >=0.1.2,<0.2.0
│   ├── rich >=13.0,<14.0
│   │   ├── markdown-it-py >=2.2.0
│   │   │   └── mdurl >=0.1,<1.0
│   │   ├── pygments >=2.13.0,<3.0.0
│   │   └── typing-extensions >=4.0.0,<5.0
│   ├── sigstore-protobuf-specs 0.3.2
│   │   └── betterproto 2.0.0b6
│   │       ├── grpclib >=0.4.1,<0.5.0
│   │       │   ├── h2 >=3.1.0,<5
│   │       │   │   ├── hpack >=4.0,<5
│   │       │   │   └── hyperframe >=6.0,<7
│   │       │   └── multidict *
│   │       │       └── typing-extensions >=4.1.0
│   │       └── python-dateutil >=2.8,<3.0
│   │           └── six >=1.5
│   ├── sigstore-rekor-types 0.0.13
│   │   └── pydantic >=2,<3
│   └── tuf >=5.0,<6.0
│       ├── requests >=2.19.1
│       └── securesystemslib >=1.0,<2.0
└── sigstore-protobuf-specs *
    └── betterproto 2.0.0b6
        ├── grpclib >=0.4.1,<0.5.0
        │   ├── h2 >=3.1.0,<5
        │   │   ├── hpack >=4.0,<5
        │   │   └── hyperframe >=6.0,<7
        │   └── multidict *
        │       └── typing-extensions >=4.1.0
        └── python-dateutil >=2.8,<3.0
            └── six >=1.5
While some of these dependencies are already in the project's own dependency tree, that is still far too much to include. Some package builders/managers could handle this with plugins, but those are extra steps compared with the simple built-in solution that we could have.
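To put a rough number on "too much", the tree output can be reduced to a set of distinct package names and compared with what a tool already depends on. The following is only a sketch: it assumes the text layout of `poetry show --tree` shown above, and `already_present` is a hypothetical stand-in for the host tool's existing dependencies.

```python
import re

# Box-drawing characters used to draw the tree, plus the spaces between them.
_TREE_PREFIX = re.compile(r"^[\s│├└─]+")


def distinct_packages(tree_text: str) -> set[str]:
    """Return the distinct package names found in tree-style output."""
    names = set()
    for line in tree_text.splitlines():
        entry = _TREE_PREFIX.sub("", line).strip()
        if not entry or entry.startswith("$"):
            continue  # skip blank lines and a shell prompt line
        # The first token is the package name; the rest is the constraint.
        names.add(entry.split()[0].lower())
    return names


# Shortened excerpt of the tree above, for illustration only.
sample = """\
pypi-attestations 0.0.13
├── cryptography *
│   └── cffi >=1.12
│       └── pycparser *
├── packaging *
└── sigstore >=3.4,<4.0
    └── cryptography >=42
"""

# Hypothetical: packages the host tool already depends on.
already_present = {"packaging"}

new_packages = distinct_packages(sample) - already_present
print(len(new_packages), sorted(new_packages))
```

Running the same function over the full tree gives the number of packages that would genuinely be new, which is the figure that matters when weighing a built-in integration against a plugin.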
@sethmlarson I came across https://github.com/sethmlarson/pip-sbom by way of the pip plugin/extension issue. What are the primary points of concern leading to the "experimental" marker? Is it just the phantom dependency issue, which means it misses a lot of things that aren't declared in the current distribution package metadata? Or do you have additional concerns beyond that limitation?
I was mostly poking at it to get an idea of what an SBOM dependency tree might look like (at least with current libraries):
$ poetry show --tree
pip-sbom 0.0.1a2 pip-sbom
├── cyclonedx-python-lib *
│   ├── license-expression >=30,<31
│   │   └── boolean-py >=4.0
│   ├── packageurl-python >=0.11,<2
│   ├── py-serializable >=1.1.1,<2.0.0
│   │   └── defusedxml >=0.7.1,<0.8.0
│   └── sortedcontainers >=2.4.0,<3.0.0
├── packageurl-python *
├── packaging *
├── pip *
└── spdx-tools >=0.8
    ├── beartype *
    ├── click *
    │   └── colorama *
    ├── license-expression *
    │   └── boolean-py >=4.0
    ├── ply *
    ├── pyyaml *
    ├── rdflib *
    │   └── pyparsing >=2.1.0,<4
    ├── semantic-version *
    ├── uritools *
    └── xmltodict *
(I also checked the anchore-syft package you used in your latest blog post, but that bundles an external binary command rather than being a regular Python package.)
@ncoghlan
What are the primary points of concern leading to the "experimental" marker?
Because SBOMs are used for regulatory purposes, I didn't want people to start relying on a tool that I've invested a relatively small amount of time in. I created this project mostly to test what is possible today for Python packages and as a place to implement draft packaging PEPs ahead of their acceptance to show how useful they'd be for the SBOM use case (such as PEP 710 and now my upcoming PEPs for SBOMs).
And yeah... I am not particularly happy with the state of affairs for SBOM libraries. At the end of the day, it's a data format. I think creating a tiny module specifically for generating SBOM documents would make sense, so that packaging tools can adopt it very easily. That use case is typically constrained to a very narrow set of SBOM documents and features.
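As a rough illustration of how small such a module could be, here is a sketch that emits a bare-bones CycloneDX-style JSON document using only the standard library. The function names are hypothetical, the package versions are made up for the example, and the output has not been validated against the CycloneDX schema; a document intended for regulatory use would need considerably more metadata.

```python
import json


def purl_for_pypi(name: str, version: str) -> str:
    """Build a package URL for a PyPI distribution."""
    # purl convention for the pypi type: lowercase the name and
    # replace underscores with dashes.
    normalized = name.lower().replace("_", "-")
    return f"pkg:pypi/{normalized}@{version}"


def minimal_cyclonedx(components: list[tuple[str, str]]) -> dict:
    """Build a minimal CycloneDX-style document from (name, version) pairs."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": name,
                "version": version,
                "purl": purl_for_pypi(name, version),
            }
            for name, version in sorted(components)
        ],
    }


# Illustrative inputs only; the versions are not taken from any real project.
doc = minimal_cyclonedx([("typing_extensions", "4.12.2"), ("packaging", "24.1")])
print(json.dumps(doc, indent=2))
```

Because the result is plain dictionaries and `json`, a packaging tool could vendor something like this without adding anything to its dependency tree.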
Would it make sense to survey and recommend libraries for generating SBOM metadata for Python packages as part of this project?
Full disclosure: I'll actually need to add SBOM support to my current work project at some point (see https://github.com/lmstudio-ai/venvstacks/issues/67), so I have a concrete interest in knowing which libraries actually do a decent job of taking a set of Python dependency declarations (and/or installed environments) and turning them into the corresponding SBOM.
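For the installed-environment side of that question, the standard library already exposes most of the raw inputs. The sketch below assumes that declared metadata is sufficient, which is exactly where the phantom dependency limitation bites; `installed_components` is a hypothetical helper name, not part of any existing tool.

```python
from importlib.metadata import distributions


def installed_components() -> list[dict]:
    """List installed distributions with their declared requirements."""
    components = []
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name is None:
            continue  # skip distributions with broken or partial metadata
        components.append(
            {
                "name": name,
                "version": dist.version,
                # Requires-Dist entries only: anything bundled but not
                # declared in the metadata will not show up here.
                "requires": list(dist.requires or []),
            }
        )
    return sorted(components, key=lambda c: c["name"].lower())


for component in installed_components():
    print(component["name"], component["version"], len(component["requires"]))
```

Comparing this listing with the output of a candidate SBOM library is one way to check whether the library at least covers everything the environment itself declares.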