Open jwarwick-delfi opened 1 year ago
Since Pex vendors a very limited set of 3rd party libraries it uses, sticking to the stdlib is best; so text or JSON are preferred from the Pex point of view.
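To make the stdlib-only point concrete, here is a minimal sketch of emitting SPDX 2.3 JSON with nothing but `json`, `re`, `uuid` and `datetime`. The function names, the `example.com` namespace and the `(name, version, sha256)` input tuples are assumptions for illustration, not anything Pex provides today.

```python
import json
import re
import uuid
from datetime import datetime, timezone


def spdx_id(name, version):
    # SPDX identifiers may only contain letters, digits, "." and "-".
    return "SPDXRef-Package-" + re.sub(r"[^A-Za-z0-9.-]", "-", f"{name}-{version}")


def minimal_spdx_document(doc_name, packages):
    """Build an SPDX 2.3 document dict from (name, version, sha256) tuples."""
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": doc_name,
        # Hypothetical namespace; any unique URI works here.
        "documentNamespace": f"https://example.com/spdx/{doc_name}-{uuid.uuid4()}",
        "creationInfo": {"created": created, "creators": ["Tool: example-sketch"]},
        "packages": [
            {
                "name": name,
                "SPDXID": spdx_id(name, version),
                "versionInfo": version,
                "downloadLocation": "NOASSERTION",
                # No per-file data is recorded in this sketch.
                "filesAnalyzed": False,
                "checksums": [{"algorithm": "SHA256", "checksumValue": sha256}],
            }
            for name, version, sha256 in packages
        ],
    }


if __name__ == "__main__":
    doc = minimal_spdx_document("demo", [("my_pkg", "1.0", "ab" * 32)])
    print(json.dumps(doc, indent=2, sort_keys=True))
```

YAML output would need a third-party library, which is why it is the odd one out from the Pex side.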
Hrm. A quick read of the spec seems to suggest each file must have 1 sha1 checksum and then 0 or more other checksums: https://spdx.github.io/spdx-spec/v2.3/file-information/#84-file-checksum-field
A lockfile only contains sha256 checksums and so generating a valid SPDX will require downloading every artifact in a lockfile and re-fingerprinting it down to sha1. This is not awesome.
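For a sense of what the re-fingerprinting step would involve once an artifact has been downloaded, here is a rough sketch using only `hashlib`. It computes sha1 and sha256 in a single pass, checks the sha256 against the value the lockfile already records, and shapes the result like SPDX JSON checksum entries. The function names and the `expected_sha256` parameter are made up for illustration; the download itself is not shown.

```python
import hashlib


def fingerprint(path, algorithms=("sha1", "sha256"), chunk_size=1024 * 1024):
    """Hash a file once, feeding every chunk to each requested algorithm."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def spdx_file_checksums(path, expected_sha256):
    """Return SPDX-style checksum entries after verifying the locked sha256."""
    digests = fingerprint(path)
    if digests["sha256"] != expected_sha256:
        raise ValueError(f"sha256 mismatch for {path}: got {digests['sha256']}")
    return [
        {"algorithm": name.upper(), "checksumValue": value}
        for name, value in sorted(digests.items())
    ]
```

The hashing is cheap; the cost is in fetching every artifact just to produce a digest the lockfile does not carry.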
Ok, the code that implements `pex3 lock export ...` is here:
https://github.com/pantsbuild/pex/blob/fd9a07f3cc4e8a3f64eb2c9850f7936c67453315/pex/cli/commands/lock.py#L493-L516
That currently exports for just one distribution target, where a distribution target in Pex-speak is either a particular local Python interpreter or a foreign platform's interpreter. If your SBOM will be attached to a single platform in this way (say one SBOM for each of Python 3.7, 3.8 and 3.9 on each of Linux and Mac, for a total of 6 SBOMs), then all is well: you just run export six times, configuring a different target for each run. If your SBOMs are intended to be singular and need to incorporate data for all distribution targets, a new sub-command is probably warranted: `pex3 lock sbom ...`. Either way, the key data structure is contained in `lock_file` on line 500. That is a XXX and is defined here:
https://github.com/pantsbuild/pex/blob/32a0789ee4d431f0d84b3f1e924bb91b78cde1cd/pex/resolve/lockfile/model.py#L29-L51
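The one-SBOM-per-target arithmetic can be spelled out as a small matrix. This sketch only enumerates the six combinations and a hypothetical output file name for each; it deliberately does not show how each target would be passed to the export command, and the naming scheme is an assumption.

```python
from itertools import product

PYTHON_VERSIONS = ("3.7", "3.8", "3.9")
OPERATING_SYSTEMS = ("linux", "macos")


def sbom_targets():
    """Return (python_version, os_name, output_file) for every target combination."""
    return [
        # Hypothetical file naming scheme, one SBOM per distribution target.
        (py, os_name, f"sbom-py{py}-{os_name}.spdx.json")
        for py, os_name in product(PYTHON_VERSIONS, OPERATING_SYSTEMS)
    ]


if __name__ == "__main__":
    for py, os_name, output in sbom_targets():
        print(f"Python {py} on {os_name} -> {output}")
```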
If, instead of exporting an entire lockfile as an SBOM, individual built-PEX files could export (or even include) an SBOM, things become a lot simpler since the actual used software is all present along with licenses and other metadata. Re-hashing becomes ecosystem-friendly, etc.
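To illustrate why having the installed software on hand helps, here is a sketch that reads name, version and license straight out of installed distribution metadata. It uses the stdlib `importlib.metadata` (Python 3.8+) against the current environment as a stand-in; it is not Pex's own API, and `describe_installed_distributions` is a made-up name.

```python
from importlib import metadata


def describe_installed_distributions():
    """Collect name, version and license for each distribution in this environment."""
    records = []
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:
            continue  # Skip distributions with no readable metadata file.
        name = meta.get("Name")
        if not name:
            continue
        records.append(
            {
                "name": name,
                "version": meta.get("Version") or "NOASSERTION",
                # Many projects leave the License field empty.
                "license": meta.get("License") or "NOASSERTION",
            }
        )
    return sorted(records, key=lambda record: record["name"].lower())


if __name__ == "__main__":
    for record in describe_installed_distributions():
        print(record["name"], record["version"], record["license"])
```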
There is already a suite of tools that can be used either by including them in a PEX file with `--include-tools` when building the PEX or via the `pex-tools` console script installed alongside `pex`.
These live here: https://github.com/pantsbuild/pex/tree/main/pex/tools/commands
The `repository`, `graph` and `venv` commands all do portions of the work that will be needed here; in particular, they resolve the PEX's distributions.
Perhaps best is to start looking at `graph`, which generates a graphviz svg graph of a PEX's internal software, which is part way to an SBOM.
The `run` main entrypoint of the tool is here:
https://github.com/pantsbuild/pex/blob/e0efca098404a6093f514839103ea7843920e4fa/pex/tools/commands/graph.py#L154-L156
The PEX resolve is done here: https://github.com/pantsbuild/pex/blob/e0efca098404a6093f514839103ea7843920e4fa/pex/tools/commands/graph.py#L33-L49
And the resolved things are `Distribution`s defined here:
https://github.com/pantsbuild/pex/blob/e0efca098404a6093f514839103ea7843920e4fa/pex/dist_metadata.py#L549-L550
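As a rough idea of how a dependency graph gets "part way to an SBOM", the sketch below turns a plain mapping of project name to dependency names into SPDX `DEPENDS_ON` relationship entries. The input mapping is a hypothetical simplification; the real `graph` command works with Pex's own `Distribution` objects, which are not modeled here.

```python
import re


def spdx_id(project):
    # SPDX identifiers may only contain letters, digits, "." and "-".
    return "SPDXRef-Package-" + re.sub(r"[^A-Za-z0-9.-]", "-", project)


def dependency_relationships(graph):
    """Convert {project: {dependency, ...}} into SPDX relationship dicts."""
    relationships = []
    for project in sorted(graph):
        for dependency in sorted(graph[project]):
            relationships.append(
                {
                    "spdxElementId": spdx_id(project),
                    "relationshipType": "DEPENDS_ON",
                    "relatedSpdxElement": spdx_id(dependency),
                }
            )
    return relationships


if __name__ == "__main__":
    # Hypothetical graph, sorted on output for stable results.
    example = {"requests": {"urllib3", "idna"}, "urllib3": set(), "idna": set()}
    for relationship in dependency_relationships(example):
        print(relationship)
```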
As a consumer of Pex lockfiles via the pants build tool, I would like to export a lockfile in an open format that I can use to generate a software bill of materials (SBOM). SPDX seems to be the widely-used open standard for these files.
SPDX can be expressed in a variety of formats; personally I would prefer text, JSON, or YAML.