containers / skopeo

Work with remote images registries - retrieving information, images, signing content
Apache License 2.0
8.14k stars 771 forks source link

Off-line verification of signatures against policy.json #2435

Open owtaylor opened 1 day ago

owtaylor commented 1 day ago

I'd like to have a way to provide a manifest (or manifest digest) and a set of previously downloaded signatures and have skopeo check that the manifest is correctly signed according to policy.json. The identity to check against policy.json would be provided separately, and correspond to the original source of the manifest.

The use case I need this for is checking downloaded content before installation with a tool not based on containers/image. In particular I'd like to be able to check signatures of Flatpaks stored as OCI images in an OCI registry.

It doesn't just work to do the download using skopeo copy or skopeo experimental-image-proxy because, among other things:

As a new subcommand

Proposal:

skopeo verify-installable --identity=<reference> <dir>

Where

is (possibly incomplete) directory in dir: format with manifest.json and signature-<n> files.

There's a proof of concept implementation of something similar as a separate binary https://gist.github.com/owtaylor/8658c6c22714ee4ec3d0353480062c9d

skopeo standalone-verify

There is of course, an existing standalone-verify subcommand. It, however, is not usable for this, because it only handles GPG signatures, and not sigstore signatures, and doesn't respect policy.json. It might be possible to compatibly extend it to this use case, maybe, along with the existing:

skopeo standalone-verify MANIFEST DOCKER-REFERENCE KEY-FINGERPRINTS SIGNATURE

We could have:

skopeo standalone-verify --system-signature-policy --load-signatures-from=DIR MANIFEST DOCKER-REFERENCE

Using experimental-image-proxy

@mtrmac suggested that skopeo experimental-image-proxy almost does the right thing already - calling the OpenImage JSON method already checks against policy.json. The only missing thing is being able to separate the identity being checked - which could be done as another argument to OpenImage or as a separate extended method.

An example of how calling OpenImage from GLib/Flatpak style C looks like can be found at: https://gist.github.com/owtaylor/06b9bbec53967677d4833ffa54d30598

From my perspective, problems with this are:

  • It's a lot of code to set up the socket connection and call JSON methods, and there are no real advantages for this use case.
  • This API is explicitly experimental - depending on it from Flatpak could make it difficult to evolve it in the future
  • Future evolution of experimental-image-proxy might involve depending on external protocol libraries (Cap'n Proto was proposed in https://github.com/containers/skopeo/pull/1844)

Is this liable to be misused (TOCTOU)

A concern that @mtrmac raised is that a "check signatures" API can be misused because someone might check that the signatures on docker://registry.example.com/some/image:latest are valid, then download and install it (via non containers/image means) without realizing that latest has been changed to some other image. I don't think either of the proposed commands above are suitable for misusing that way. You could, in theory, do:

skopeo copy docker://registry.example.com/some/image:latest dir:temp
skopeo verify-installable --identity=docker://registry.example.com/some/image:latest <temp>

But the skopeo copy command already does the policy.json check, so that you could just leave out the skopeo verify-installable entirely. And for that matter, just install what skopeo copy downloaded and avoid the TOCTOU problem.

experimental-image-proxy/OpenImage could already be misused in this fashion, but it's so much work to use it hopefully anybody doing that would think it through carefully.

cgwalters commented 1 day ago

It's a lot of code to set up the socket connection and call JSON methods, and there are no real advantages for this use case.

Yeah but many cases that start off as "fork off a binary" eventually grow more complex needs that turn into wanting an API.

This API is explicitly experimental - depending on it from Flatpak could make it difficult to evolve it in the future

We can't in practice break it today because doing so would break rpm-ostree and bootc, so...I dunno, maybe we should rename it to drop the experimental flag.

Future evolution of experimental-image-proxy might involve depending on external protocol libraries (Cap'n Proto was proposed

Yeah but I think that's up to us (i.e. including you). Another alternative is standardizing on varlink for this...what we have is somewhat close to that already. Would that help?

mtrmac commented 1 day ago

The use case…

(Just for the record; ACK, let’s figure this out.)


As a new subcommand

skopeo verify-installable --identity=<reference> <dir>

If we did do that, I’m not sure about hard-coding dir. Accepting any transports would work … but then again, hard-coding dir serves also as a bit of an abuse-discouragement mechanism.

skopeo standalone-verify

It might be possible to compatibly extend it to this use case

The difference in implementation effort between a new subcommand and a new option is fairly trivial — if we are adding a new kind of functionality, a new subcommand seems clearer to me.

Using experimental-image-proxy

An example of how calling OpenImage from GLib/Flatpak style C looks like can be found at: https://gist.github.com/owtaylor/06b9bbec53967677d4833ffa54d30598

From my perspective, problems with this are:

  • It's a lot of code to set up the socket connection and call JSON methods

I think a lot of that is that’s just C… and, partially, this library not being optimized for one-off use cases (but for GObject, which is so inconvenient that it is not used in that example).

Hypothetically, I think a varargs-style

json_root_parse_fields(root, &error,
    "success", G_TYPE_BOOLEAN, &response->success,
    "pipeid", G_TYPE_INT64, &response->pipeID …
     NULL,
)
root = json_root_from_fields(&error,
    "method", G_TYPE_STRING, method",
    "args", …
     NULL
)

would not be too bad (except, of course, that it doesn’t exist in the library right now) — and, most importantly, whatever structured format we use, it’s not going to get much easier. Compare sd-dbus. (Also, execve is attractive because it is already varargs-based.)

(Also, uh, the method calls can be constructed using printf, but you didn’t hear me say that — and that’s possible with JSON and not conveniently with most other IPC formats. The response parsing would still need to understand JSON as such, though, so that’s not much of a difference.)

and there are no real advantages for this use case.

I do think that having a structure beyond just strings is valuable. If nothing else, ~every such automated use case eventually asks for differentiating some error causes; that’s hard to promise as a stable API in human-oriented text output, so we are left with either the 7-bit exit code (and an inability to classify an error along two dimensions), or, again, something like JSON.

  • This API is explicitly experimental - depending on it from Flatpak could make it difficult to evolve it in the future

Pragmatically it ~has to be kept stable by now, and I don’t anticipate breaking changes.


Is this liable to be misused (TOCTOU)

A concern that @mtrmac raised is that a "check signatures" API can be misused … I don't think either of the proposed commands above are suitable for misusing that way. You could, in theory, do:

skopeo copy docker://registry.example.com/some/image:latest dir:temp
skopeo verify-installable --identity=docker://registry.example.com/some/image:latest <temp>

You’re right that users who already have an enforcing policy in the primary consumer are not at much risk. But they aren’t the ones most frequently asking for such a command.

The kinds of things users ask for are

skopeo verify-policy docker://example.com/image:latest
{docker pull, oc create deployment, …} … example.com/image:latest

or

skopeo --policy=… verify-policy … docker://example.com/image:latest
podman run example.com/image:latest # not enforcing policy; possibly on a different system, via Ansible or the like

(And then was is the very legitimate

mirroring-tool docker://vendor.example/product docker://mirror.internal/vendor/product
# Sanity-check that the images really are signed correctly, and that they are readable
for image in $(images_for_that_product); do
   skopeo [config to set up mirrors and policy like in production] verify-policy $image || exit
done
trigger-deployment-with-policy-enforcement [config to set up mirrors] docker://vendor.example/product

which really makes sense to want.)


If nothing else, the error handling area requires structured responses. And I think that, when Skopeo already has one machine-targeted structured IPC mechanism, it seems unintuitive to add another one.

owtaylor commented 1 day ago

Future evolution of experimental-image-proxy might involve depending on external protocol libraries (Cap'n Proto was proposed

Yeah but I think that's up to us (i.e. including you). Another alternative is standardizing on varlink for this...what we have is somewhat close to that already. Would that help?

I'd be happier about that - Flatpak already does extensive JSON handling and carrying an implementation of varlink using json-glib is more palatable than bringing in another IDL / type-system, etc. It doesn't seem a lot more complicated than the current image-proxy protocol.

cgwalters commented 1 day ago

Flatpak splits things so the download is done as the user, but the installation (and thus verification against policy.json) is done as root.

This came up when we were discussing a RHEL AI use case for instructlab, where it would actually make sense to have the image for that work by default in a similar way - pulled to the root storage, but executed unprivileged.

This whole problem domain isn't just "download as user", it's also "user can hold a runtime ref to avoid image GC" for example.

I think it would very much make sense to teach the c/image-alike stack to support things like this.

And that whole topic then goes into a much deeper thing around sharing more around container image storage.

Anyways, such changes to flatpak would be huge, so let's just dial in on the image fetching part for now, but keeping in mind larger structural changes.

And I think that, when Skopeo already has one machine-targeted structured IPC mechanism, it seems unintuitive to add another one.

Do you see an adaption of the existing IPC to be varlink to be more "standard" as aligning with that or from your PoV would it be closer to "another one"? I think it's likely we could deprecate the non-varlink experimental-image-proxy code say ~6months after we've shipped a new stable varlink API and then potentially remove it (or maybe at least "brownout" it, like make it fail 50% of the time) another ~6months later. My belief is there's almost no one updating skopeo but not bootc/rpm-ostree.

owtaylor commented 19 hours ago

skopeo verify-installable --identity=<reference> <dir>

If we did do that, I’m not sure about hard-coding dir. Accepting any transports would work … but then again, hard-coding dir serves also as a bit of an abuse-discouragement mechanism.

Yes, the idea of hardcoding dir: was basically to discourage the case of docker:// ; but it does also makes things weirder and less general, and less useful for other cases.

It's a lot of code to set up the socket connection and call JSON methods

I think a lot of that is that’s just C…

Some of it's just C, some of it is me being paranoid and wanting to "fail open".

Most importantly, whatever structured format we use, it’s not going to get much easier. Compare sd-dbus. (Also, execve is attractive because it is already varargs-based.)

Generally agree - calling remote API's from C/C++ is never beautiful unless you have an IDL compiler. My issue wasn't really that it was particularly hard to use this API, it's just that there's a lot of overhead (from C especially) to use a "machine interface" to Skopeo when you just want a single "yes" / "no"

I do think that having a structure beyond just strings is valuable. If nothing else, ~every such automated use case eventually asks for differentiating some error causes; that’s hard to promise as a stable API in human-oriented text output, so we are left with either the 7-bit exit code (and an inability to classify an error along two dimensions), or, again, something like JSON.

Thinking about this some, I'm not sure there's a meaningful error to display in a user interface (like GNOME Software) other than "Could not install application because it failed verification against system signature policy, see system logs for details". We can't really report up the information about the error in structured form, because meaningful failure reasons are really closely tied to the structure of policy.json - a meaningful error message that goes beyond the above looks like:

 Images installed from registry.redhat.io require a valid signature /etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
  Signature 1: signed with a different key, fingerprint 199E2F91FD431D51
  Signature 2: valid, but repository registry.redhat.io/rhel9/firefox doesn't match source ... 
  ...

Which only containers/image has the knowledge to create. (Currently you get something like:

copying system image from manifest list: Source image rejected: None of the signatures were accepted, reasons: Invalid GPG signature: gpgme.Signature{Summary:128, Fingerprint:"199E2F91FD431D51", Status:gpgme.Error{err:0x9}, Timestamp:time.Date(2024, time.September, 23, 9, 18, 21, 0, time.Local), ExpTimestamp:time.Date(1969, time.December, 31, 19, 0, 0, 0, time.Local), WrongKeyUsage:false, PKATrust:0x0, ChainModel:false, Validity:0, ValidityReason:error(nil), PubkeyAlgo:1, HashAlgo:8}; Invalid GPG signature: gpgme.Signature{Summary:128, Fingerprint:"199E2F91FD431D51", Status:gpgme.Error{err:0x9}, Timestamp:time.Date(2024, time.September, 23, 9, 18, 21, 0, time.Local), ExpTimestamp:time.Date(1969, time.December, 31, 19, 0, 0, 0, time.Local), WrongKeyUsage:false, PKATrust:0x0, ChainModel:false, Validity:0, ValidityReason:error(nil), PubkeyAlgo:1, HashAlgo:8}; [...]

which is probably good enough if logged, but definitely isn't user consumable.)

So, my summary here is:

Perhaps the main question should be to what extent skopeo verify-installable would be useful for other use cases:

because while Flatpak can deal with a machine interface, you can't do that if you are writing a shell script or doing something on the CLI.

mtrmac commented 11 hours ago

And I think that, when Skopeo already has one machine-targeted structured IPC mechanism, it seems unintuitive to add another one.

Do you see an adaption of the existing IPC to be varlink to be more "standard" as aligning with that or from your PoV would it be closer to "another one"?

I think that’s fine — we need to be able to evolve things, I’d just like to avoid having two entirely separate “proxy-for-bootc” and “proxy-for-flatpak” subsystems.

I think it's likely we could deprecate the non-varlink experimental-image-proxy code say ~6months after we've shipped a new stable varlink API and then potentially remove it (or maybe at least "brownout" it, like make it fail 50% of the time) another ~6months later. My belief is there's almost no one updating skopeo but not bootc/rpm-ostree.

I’ll fully defer to you in that, I am not aware of any other user.


Specifically on varlink, Podman used to be varlink-based and that was abandoned. Reported concerns:

Personally … I’d like to run an IDL compiler and not think about it much at all, and I have no opinion on the specific protocol / implementation.

And a IDL is not really a blocker for adding one more OpenImage method right now, at the quite slow pace the proxy’s feature set has been expanding. At a larger scale, yes, this should be automated.

mtrmac commented 11 hours ago

I do think that having a structure beyond just strings is valuable. If nothing else, ~every such automated use case eventually asks for differentiating some error causes; that’s hard to promise as a stable API in human-oriented text output, so we are left with either the 7-bit exit code (and an inability to classify an error along two dimensions), or, again, something like JSON.

Thinking about this some, I'm not sure there's a meaningful error to display in a user interface (like GNOME Software) other than "Could not install application because it failed verification against system signature policy, see system logs for details".

Maybe:

I can see an argument that all of these could, on systems with few user-exposed options and good automated retry/recovery logic, be classified as “something went wrong, this is never supposed to happen, get an expert help”.

Also, c/image does not currently expose all of these things as API that could be detected, anyway.