sigstore / sigstore-python

A Sigstore client written in Python
https://pypi.org/p/sigstore

Produce a pure-Python verification API #770

Open di opened 1 year ago

di commented 1 year ago

Description

Some installers that may want to eventually perform signature verification have a hard requirement that all their dependencies are pure-Python (pip is the predominant example, because it vendors all its dependencies into a single pure-Python wheel).

Because sigstore-python has sub-dependencies that ship non-pure Python wheels, it's not immediately usable from these installers. However, installers will specifically only use a subset of our overall API (presumably just verification) and might not have a need for all the dependencies we have with native code.
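As a rough way to see which installed dependencies shipped as non-pure wheels, one could read each distribution's `WHEEL` metadata file and check its `Root-Is-Purelib` field. This is only a sketch: the helper names are made up for illustration, and it reports what happens to be installed in the current environment, not what a project publishes on PyPI. A package with optional C extensions that was installed from a pure wheel would show up as pure here.

```python
from importlib import metadata


def is_pure_wheel(wheel_metadata: str) -> bool:
    """Return True if a WHEEL metadata file declares `Root-Is-Purelib: true`."""
    for line in wheel_metadata.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "root-is-purelib":
            return value.strip().lower() == "true"
    # No declaration found: be conservative and treat the wheel as impure.
    return False


def impure_distributions() -> list[str]:
    """List installed distributions whose WHEEL file does not declare purelib."""
    impure = set()
    for dist in metadata.distributions():
        wheel_text = dist.read_text("WHEEL")
        if wheel_text is None:
            # Not installed from a wheel (e.g. some editable or legacy installs).
            continue
        if not is_pure_wheel(wheel_text):
            impure.add(dist.metadata.get("Name", "unknown"))
    return sorted(impure)


if __name__ == "__main__":
    for name in impure_distributions():
        print(f"{name} (impure)")
```

Running this inside a virtual environment that contains only sigstore-python and its dependencies would give a first approximation of the list of native dependencies.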

Given that, we should:

At a high level, looking at current sub-dependencies that ship non-pure Python wheels or have sub-dependencies that ship non-pure Python wheels shows the following:

jku commented 1 year ago

charset-normalizer==3.2.0 (impure)

charset-normalizer has a universal wheel too

jku commented 1 year ago

multidict==6.0.4 (impure)

Multidict claims that the library has optional C Extensions for speed. There's no universal wheel though, this will need a closer look.

di commented 1 year ago

Interesting, I wonder why they ship impure wheels as well.

woodruffw commented 1 year ago

To address cryptography and friends, the elephants in the room 🙂

  1. X.509 certificate parsing is currently done via cryptography, which implements it in pure Rust (subsequent chain building is done via pyOpenSSL, which uses C to call into an OpenSSL or OpenSSL-like backend)
  2. Signature verification (SET, SCT, certificate) is similar (calls into C via cffi in cryptography)
  3. Small associated bits are also written in Rust internally (SCT parsing)
  4. Transitively, we also depend on things like PEM parsing (since we accept certificates/chains in PEM format)
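Of the items above, PEM parsing is probably the easiest to picture in pure Python, since PEM is just base64-encoded DER between text delimiters. The following is a minimal sketch using only the standard library. The function name is hypothetical, it is not how cryptography implements PEM parsing, and it is more lenient than a strict RFC 7468 parser.

```python
import base64
import re

# Matches each CERTIFICATE block and captures the base64 body between the delimiters.
_PEM_CERT = re.compile(
    rb"-----BEGIN CERTIFICATE-----\s*(.*?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)


def pem_certificates_to_der(pem: bytes) -> list[bytes]:
    """Return the DER bytes of every CERTIFICATE block found in `pem`."""
    ders = []
    for match in _PEM_CERT.finditer(pem):
        # Remove line breaks and other whitespace inside the base64 body.
        b64 = b"".join(match.group(1).split())
        # validate=True rejects characters outside the base64 alphabet.
        ders.append(base64.b64decode(b64, validate=True))
    return ders
```

This only recovers the DER bytes; parsing the certificate itself and verifying signatures are the parts that currently rely on native code.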

On that front, there's currently an effort (which I'm working on with others at ToB) to support X.509 path building in cryptography with a pure Rust implementation (https://github.com/pyca/cryptography/pull/9405, https://github.com/pyca/cryptography/pull/8873). That means a future version of sigstore-python hopefully won't need pyOpenSSL at all, which will also remove the cffi dep. However, that just exchanges one native dep (C) for another (Rust), so it is potentially not immediately useful here, besides reducing the total number of native deps 🙂

TL;DR: When path validation is merged, it should be possible to eliminate pyOpenSSL and cffi as dependencies, although cryptography will continue to be an impure dep (and we will further rely on its native bits).

Removing cryptography outright is a bigger challenge, and I can see two (non-exhaustive) possibilities:

jku commented 1 year ago

Documenting the native code requirements is a very good idea, but for the end goal we'll also want to look at the dependency tree as a whole: if the subset of the dependency tree (that is not part of e.g. pip dependency tree already) is too large, then pip maintainers might not be enthusiastic about vendoring attempts.

The point I'm making is that putting a lot of effort into fixing the native code situation is not useful if the end result will still be unacceptable for vendoring because of the size of the dependency tree...

jku commented 1 year ago

multidict==6.0.4 (impure)

Multidict claims that the library has optional C Extensions for speed. There's no universal wheel though, this will need a closer look.

This looks like a build system issue: a pure-Python build is supported, but the CD builder just doesn't build the universal wheel.
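For reference, whether a published wheel is pure can be read off its filename, which ends in `{python tag}-{abi tag}-{platform tag}.whl`; a pure-Python wheel has ABI tag `none` and platform tag `any`. A small sketch with hypothetical helper names follows. The filenames in the usage lines are illustrative examples of the naming scheme, not a claim about which files each project publishes.

```python
def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) parsed from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # name-version[-build]-python-abi-platform
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python_wheel(filename: str) -> bool:
    """True if the filename carries the `none` ABI tag and the `any` platform tag."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


# Illustrative filenames only.
print(is_pure_python_wheel("charset_normalizer-3.2.0-py3-none-any.whl"))
print(is_pure_python_wheel("multidict-6.0.4-cp311-cp311-manylinux_2_17_x86_64.whl"))
```

Checking the file list of a release this way shows whether a `none-any` wheel is available alongside the platform-specific ones.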