Closed levinwinter closed 3 months ago
There should be an integration test that demonstrates how to export an OCI image using the GRPC API: https://github.com/fussybeaver/bollard/blob/master/tests/export_test.rs
Note that the link you provided uses the Moby HTTP API (as opposed to buildkit's formal GRPC API). Although this is also supported in Bollard through the `build_image` method, you cannot export OCI images using that API, as documented here.
Thank you for the pointers! I'm not super familiar with moby/buildkit, so I took some time to get a basic understanding.

Instead of exporting OCI images, I'd like to use the `local` or `tar` exporters (docs) that simply dump the file system of the last layer. I managed to get this up and running by adding a `tar` option to the `ImageExporterEnum` and using the Docker container gRPC driver.

However, I would prefer to use the `build_image` method since it's easier and should, in theory (?), also support this. I tried extending that one, but I think it would also need to use the `/grpc` endpoint as opposed to `/session` (I keep getting errors about `/moby.filesync.v1.FileSend/diffcopy` not being recognized by the server).

Let me know if you'd be okay with adding the `local`/`tar` exporter, and if so, where you see the best fit. I'd be happy to prepare a PR!
Yes, adding the `local` and `tar` exporters to `build_image` would require a couple of changes in Bollard, and a PR is very welcome. I'm not so sure you need to handle the `/grpc` endpoint - it is actually created by the Moby docker server if you toggle the buildkit code path by providing a `session` as part of the `BuildImageOptions`.

The reason you get a `/moby.filesync.v1.FileSend/diffcopy` error is that buildkit initiates a GRPC request to save the exported payload to disk, but the `filesend` provider isn't registered as a routable part of the `/session` endpoint.

One option is to add the `filesend` plumbing to the `/session` endpoint and parameterise it somehow, presumably by adding the `output` field to `BuildImageOptions`, though how that field is parsed and interpreted probably needs some thought.
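On the question of how such an `output` field might be parsed and interpreted: the buildx CLI encodes exporter options as comma-separated `key=value` pairs (e.g. `type=local,dest=out`). A minimal sketch of rendering a structured field into that form, with hypothetical names that are not part of Bollard's actual API:

```rust
// Hypothetical sketch: serializing an `output` selection into the
// `type=...,dest=...` syntax that the buildx CLI uses for `--output`.
// The `Output` enum and `to_param` helper are illustrative only.

/// The two exporters discussed in this thread.
enum Output {
    /// `type=tar`: stream the filesystem as a single tarball.
    Tar(String),
    /// `type=local`: write the filesystem out as individual files.
    Local(String),
}

impl Output {
    /// Render the exporter selection in the CLI-style key/value syntax.
    fn to_param(&self) -> String {
        match self {
            Output::Tar(dest) => format!("type=tar,dest={dest}"),
            Output::Local(dest) => format!("type=local,dest={dest}"),
        }
    }
}

fn main() {
    println!("{}", Output::Local("./out".to_string()).to_param());
}
```

How the daemon actually wants these values passed on the build request is exactly the open question here, so this only pins down the user-facing shape.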
Awesome! I managed to get the `/session` endpoint to work with `diffcopy`. I had already tried this before, but I didn't know I also needed to register it with the `X-Docker-Expose-Session-Grpc-Method` header. Exporting a single file using `tar` is now working.

I'm having issues with the `local` exporter, however. The read loop in `FileSendImpl` seems to "hang", probably because the protocol is more complex when sending multiple files? Is there a reference implementation for this somewhere? I tried to hunt around `buildkit`, but I'm not quite sure what exactly is expected.

The data that I receive in the loop is just empty packets (the number of empty packets being equal to the number of files in the last layer).
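Incidentally, the header registration mentioned above works by the client announcing every gRPC method it is willing to serve over the hijacked `/session` connection, one header value per method. A minimal sketch, assuming the Moby session header names; `session_headers` itself is a hypothetical helper, not Bollard's API:

```rust
// Sketch of the session handshake headers the Moby daemon expects when the
// buildkit code path is enabled. The header names come from the Moby session
// protocol; the `session_headers` helper is hypothetical.

fn session_headers(session_id: &str, methods: &[&str]) -> Vec<(String, String)> {
    let mut headers = vec![(
        "X-Docker-Expose-Session-Uuid".to_string(),
        session_id.to_string(),
    )];
    // Each exposed gRPC method gets its own header entry; the daemon will
    // only route calls to methods that were announced here.
    for m in methods {
        headers.push((
            "X-Docker-Expose-Session-Grpc-Method".to_string(),
            (*m).to_string(),
        ));
    }
    headers
}

fn main() {
    for (k, v) in session_headers("sess-1", &["/moby.filesync.v1.FileSend/diffcopy"]) {
        println!("{k}: {v}");
    }
}
```

This matches the symptom described: a method that is implemented but never announced in the headers is simply not routable from the daemon's side.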
The reference implementation for `diffcopy`/`filesend` (curiously called `filesync` in buildkit) is here: https://github.com/moby/buildkit/blob/44ebf9071db49821538cd37c2687dd925c7c3661/session/filesync/filesync.go#L78

Although the whole end-to-end flow is somewhat spread across the moby, buildkit, and buildx repositories (and quite difficult to follow).

It's possible that some information is stored in the GRPC header metadata, which is not handled in Bollard's implementation.

Be sure to rebase from master, as the session headers should be registered uniformly with the `grpc_handle` method.
Thank you! I'm now using the `grpc_handle` method!

I think the correct protocol is described here in the `fsutil` repository.

To receive meaningful data when selecting the `local` exporter (which sends multiple files), I needed to change the type of the streamed messages in the `diffcopy` method of the `FileSend` service. While before the messages were deserialized to empty `BytesMessage`s (i.e. `BytesMessage { data: [] }`), I now receive meaningful `fsutil.types.Packet`s that contain the filenames of the export. Do you have any idea why that could be?
```diff
 // FileSync exposes local files from the client to the server.
 service FileSync {
   rpc DiffCopy(stream fsutil.types.Packet) returns (stream fsutil.types.Packet);
   rpc TarStream(stream fsutil.types.Packet) returns (stream fsutil.types.Packet);
 }

 // FileSend allows sending files from the server back to the client.
 service FileSend {
-  rpc DiffCopy(stream BytesMessage) returns (stream BytesMessage);
+  rpc DiffCopy(stream fsutil.types.Packet) returns (stream fsutil.types.Packet);
 }
```
To add to this: when exporting using `tar`, the messages still need to be deserialized as `BytesMessage`, and only when exporting using `local` does one need to use `fsutil.types.Packet`. I guess depending on which is passed as an argument, we could choose the correct implementation. Though I must say I'm not sure why this is the case in the first place.
I've noticed that the buildkit protobuf has generated a separate `FileSync` implementation that takes a `Packet`: https://github.com/fussybeaver/bollard/blob/cf88562401ce4db01cb558373e59e3dcb39f61ef/codegen/proto/src/generated/moby.filesync.v1.rs#L867-L1128

So, maybe you just need to implement the trait with an appropriate `Provider` that implements the fsutil protocol as you pointed out, and hook it up to the session endpoint.
I tried to implement `FileSync`, but that seems to be the wrong gRPC service/endpoint (`/moby.filesync.v1.FileSync/DiffCopy` vs. `/moby.filesync.v1.FileSend/DiffCopy`). From what I can understand when looking at the Go implementation, they have some sort of raw gRPC stream (of the `FileSend` service) and just serialize either the `BytesMessage` or the `Packet`. To be honest, I'm a bit stuck, since I don't know whether something equivalent is possible using `tonic`.

The only "idea" that comes to mind is to copy the generated protobuf code and keep a manual/alternative implementation at hand. But this feels super hacky, and I'm sure there must be a better way. If you have no ideas, I could also ask on moby/buildkit.
Sounds a little weird. One thing you could try is to enable the Jaeger tracing interface, which will let you drill down into the payloads sent from buildkit: https://github.com/moby/buildkit?tab=readme-ov-file#opentelemetry-support

Regardless, do keep this thread up to date if you get a breakthrough by hacking around the protobuf files.
Sorry for the delay, I was not working on this :)
For the moment, my idea is to copy-paste the bit of auto-generated code that I get when changing the protobuf file to `DiffCopy(stream fsutil.types.Packet)` into the repo and simply wire that up if the `local` exporter is selected.

As for the "interface" on how to integrate it into Bollard, I was thinking of adding a field `outputs` to `BuildImageOptions` that takes an `Option<ImageBuildOutput>`. Currently, there is already a generic `ImageBuildOutput` in Bollard, but perhaps a nicer interface would be something like this:
```rust
enum ImageBuildOutput<T>
where
    T: Into<String> + Eq + Hash + Serialize,
{
    /// Exports a tarball to the specified path.
    Tar(T),
    /// Exports the filesystem to the specified path.
    Local(T),
}
```
Just sharing my ideas and keeping you updated. Will let you know once the PR is up :)
I would like to output the Docker build directly to local disk without creating an intermediate image. For this, Docker/BuildKit offers the `--output` flag (docs). I tried to hunt around Bollard, but it seems to me that this API isn't exposed at the moment. Since the Docker Engine API version Bollard is targeting already includes this feature, I was wondering whether this is already possible and I simply didn't find it in the docs, or else what's required to make this work. Thanks :)