cgwalters / container-image-proxy

Small wrapper for containers/image which exposes an HTTP API to fetch
Apache License 2.0

Plan for merge into skopeo #1

Open cgwalters opened 3 years ago

cgwalters commented 3 years ago

I think this project is probably close to being suitable to propose for inclusion into github.com/containers/skopeo/

There are items that fall somewhere between nice-to-have and must-have:

Also: should we think about trying to support writes via this proxy too? It would be a large scope increase and not important for my use cases right now. But it'd be good to avoid ruling it out.

cgwalters commented 3 years ago

Also: should we think about trying to support writes via this proxy too?

OK actually, in practice for testing stuff right now I am exporting the booted host as a container:

$ ostree-ext-cli container export --repo=/sysroot/ostree/repo 7d0941f170264ef62aabf2727bb979ccc958502384178b963d28f7baa6663c93 containers-storage:localhost/fcos

And today, ostree-ext's code indirects inefficiently via an oci: directory.

Sooo....maybe we should support pushes too.

cgwalters commented 3 years ago

We're having a review session on this now, and one thing that came up: if we used something other than HTTP - e.g. DBus - we would get support for file descriptor passing, which could make a lot of sense for this. For example, fetching the manifest could just be passing a sealed memfd, and most importantly, fetching blobs could be passing back the read half of a pipe(). That would in turn naturally allow things like concurrent blob fetches.
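
Concretely, the fd-passing idea could look something like this Go sketch of the server side (the helper name and wire details are hypothetical; assumes a Linux AF_UNIX socket and SCM_RIGHTS):

```go
package fdpass

import (
	"os"
	"syscall"
)

// sendBlobPipe creates a pipe, passes its read end to the peer as
// ancillary data (SCM_RIGHTS) over the Unix socket conn, then writes
// the blob into the write end. The client streams from the received
// fd and sees EOF when the server closes the write half.
func sendBlobPipe(conn *os.File, blob []byte) error {
	r, w, err := os.Pipe()
	if err != nil {
		return err
	}
	// One ordinary byte travels alongside the fd so the message is
	// never empty.
	rights := syscall.UnixRights(int(r.Fd()))
	if err := syscall.Sendmsg(int(conn.Fd()), []byte{0}, rights, nil, 0); err != nil {
		r.Close()
		w.Close()
		return err
	}
	r.Close() // the peer now holds its own copy of the read end
	go func() {
		defer w.Close()
		w.Write(blob)
	}()
	return nil
}
```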

cgwalters commented 3 years ago

Another thing that came up: we need to double-check our validation of blobs - in HTTP it's hard for a server to return an error at the very end of writing data.

Options:

lucab commented 3 years ago

I guess one key design point is whether the client wants streaming data or bulky blobs. The latter seems overall easier to handle (through sealed memfd and without HTTP, as you mentioned), if it still meets the expected requirements.

cgwalters commented 3 years ago

I guess one key design point is whether the client wants streaming data or bulky blobs.

I am not aware of any case in which a caller cares about byte-by-byte parsing; it's basically either "fetch this small JSON blob" or "fetch and streaming-parse a large tarball". A model of memfd chunks can of course be mapped to a byte stream too. But the simplest approach is probably pipe().
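
To make the pipe() model concrete, here is a sketch of the client side, pairing with the server sketch above (same caveats: hypothetical names, Linux-only):

```go
package fdpass

import (
	"os"
	"syscall"
)

// recvBlobPipe receives the read end of the server's pipe over the
// Unix socket conn and wraps it as an *os.File, which can then feed
// e.g. a streaming tar parser.
func recvBlobPipe(conn *os.File) (*os.File, error) {
	buf := make([]byte, 1)                    // the one ordinary byte
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one int32 fd
	_, oobn, _, _, err := syscall.Recvmsg(int(conn.Fd()), buf, oob, 0)
	if err != nil {
		return nil, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return nil, err
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return nil, err
	}
	return os.NewFile(uintptr(fds[0]), "blob"), nil
}
```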

cgwalters commented 3 years ago

The more I think about it, the more it feels like this should be generalized to an out-of-process containers/image backend - more precisely, an out-of-process implementation of Image. That would mean we support something like skopeo copy docker://quay.io/coreos/fedora:latest socket://3 or so, where we pass fd 3 as one end of a socketpair. A tricky aspect is that everything then becomes push rather than pull from the perspective of the caller, and the API becomes more involved. But it's much more rigorous.
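
To illustrate (socket://3 is hypothetical - no such transport exists in skopeo today), the caller side could hand the child one end of a socketpair like this:

```go
package main

import (
	"os"
	"os/exec"
	"syscall"
)

// spawnWithSocketpair starts skopeo with one end of a socketpair as
// fd 3, matching the hypothetical socket://3 destination above.
func spawnWithSocketpair() (*os.File, *exec.Cmd, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	ours := os.NewFile(uintptr(fds[0]), "ours")
	theirs := os.NewFile(uintptr(fds[1]), "theirs")
	cmd := exec.Command("skopeo", "copy",
		"docker://quay.io/coreos/fedora:latest", "socket://3")
	// ExtraFiles[0] becomes fd 3 in the child (after stdin/out/err).
	cmd.ExtraFiles = []*os.File{theirs}
	if err := cmd.Start(); err != nil {
		ours.Close()
		theirs.Close()
		return nil, nil, err
	}
	theirs.Close() // the child holds its own copy now
	return ours, cmd, nil
}

func main() {
	conn, cmd, err := spawnWithSocketpair()
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	// ... speak the (yet to be designed) protocol over conn ...
	cmd.Wait()
}
```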

cgwalters commented 3 years ago

xref https://github.com/krustlet/krustlet/pull/564#issuecomment-841325245 - it would also be good to split out the Rust code for interacting with this into a separate crate, as mentioned there.

cgwalters commented 3 years ago

https://github.com/containers/skopeo/pull/1476

cgwalters commented 3 years ago

OK, I spent about 2 hours trying to build a DBus interface (https://github.com/cgwalters/container-image-proxy/tree/dbus), and what I ended up hitting is that the Go DBus bindings are simply not prepared to act as a "raw server" in the DBus sense.

It's not trivial to implement.

And then I realized there's a better option for this case, the one I used for https://github.com/coreos/bootupd/blob/main/src/ipc.rs - SOCK_SEQPACKET, with something super simple like JSON inside.
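
A minimal sketch of why SOCK_SEQPACKET is attractive here: each write is one message, so a JSON document per message needs no extra framing (the request shape below is made up for illustration):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"syscall"
)

type request struct {
	Method string   `json:"method"`
	Args   []string `json:"args"`
}

func main() {
	// A connected pair of SOCK_SEQPACKET sockets: message boundaries
	// are preserved, unlike SOCK_STREAM.
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_SEQPACKET, 0)
	if err != nil {
		panic(err)
	}
	client := os.NewFile(uintptr(fds[0]), "client")
	server := os.NewFile(uintptr(fds[1]), "server")

	msg, _ := json.Marshal(request{Method: "GetManifest", Args: []string{"latest"}})
	client.Write(msg) // one write == one message

	buf := make([]byte, 4096)
	n, _ := server.Read(buf) // one read == one whole message
	var req request
	json.Unmarshal(buf[:n], &req)
	fmt.Println(req.Method)
}
```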

mtrmac commented 3 years ago

Another thing that came up: we need to double check our validation of blobs - in HTTP it's hard for a server to return an error at the very end of writing data.

FWIW if the consumer cares about digest validation, it seems conceptually simpler for the consumer to do its own validation. Of course Skopeo, or whatever other server, can technically do that validation just as well (and the code has already been written); it just feels simpler to me to say “we rely on digest validation, so here’s the code that always enforces it (and maybe a unit test that clearly demonstrates that)” than relying on cross-repo coordination and integration tests.
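
For illustration, consumer-side validation can be as small as hashing the stream while consuming it and comparing at EOF; a sketch (hypothetical helper, not skopeo's actual code):

```go
package validate

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

// verifyingCopy streams src to dst and fails if the bytes do not
// match expectedHex (a sha256 digest without the "sha256:" prefix),
// so a truncated or corrupted blob is caught at the consumer.
func verifyingCopy(dst io.Writer, src io.Reader, expectedHex string) error {
	h := sha256.New()
	if _, err := io.Copy(dst, io.TeeReader(src, h)); err != nil {
		return err
	}
	got := hex.EncodeToString(h.Sum(nil))
	if got != expectedHex {
		return fmt.Errorf("digest mismatch: got %s, want %s", got, expectedHex)
	}
	return nil
}
```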


An entirely unrelated note: a bit surprisingly, the SHA256 computation is quite noticeable in the total run time, versus network copies and (memory-cached) disk writes. So if performance is very important for this use, it might be worth comparing the performance of the Go implementation and the client's (Rust?) one.
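
For anyone who wants to gauge that locally, a standard Go benchmark (in a _test.go file, run with go test -bench=.) gives a quick baseline for raw SHA256 throughput to compare against observed copy times:

```go
package hashbench

import (
	"crypto/sha256"
	"testing"
)

// BenchmarkSHA256 reports MB/s for hashing a fixed 1 MiB buffer,
// a rough upper bound on single-stream digest throughput.
func BenchmarkSHA256(b *testing.B) {
	buf := make([]byte, 1<<20)
	b.SetBytes(int64(len(buf)))
	for i := 0; i < b.N; i++ {
		sha256.Sum256(buf)
	}
}
```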