How to import images from tar using stream

x1a0b0 commented 9 months ago

https://github.com/fussybeaver/bollard/blob/d258ede0b35c3be972bb51586fda4a014998ce4b/src/image.rs#L1351

The issue: byte_stream.next() only reads the first frame (the default capacity is 8k), not the complete content of a large tar file.

Thinking aloud: import_image uses Bytes as the type of root_fs, which means we have to read the whole tar file into memory at once. If the file is several gigabytes, this will be annoying.
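
The truncation described above can be illustrated with a small standard-library analogy. This is only a sketch: it uses std::io::Read instead of bollard's or tokio's actual stream types, and FRAME_CAPACITY, read_first_frame and read_all_frames are hypothetical names chosen for illustration. It shows why a single read returns at most one frame, and how looping until end of input recovers the full content.

```rust
use std::io::{self, Read};

/// Illustrative frame size; the thread mentions an 8k default capacity.
const FRAME_CAPACITY: usize = 8 * 1024;

/// Mirrors the reported behaviour: one read yields at most one frame,
/// so anything past the first FRAME_CAPACITY bytes is never seen.
fn read_first_frame<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut frame = vec![0u8; FRAME_CAPACITY];
    let n = reader.read(&mut frame)?;
    frame.truncate(n);
    Ok(frame)
}

/// Keeps reading frames until the reader reports end of input (0 bytes).
/// Note that this still collects everything in memory.
fn read_all_frames<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut frame = vec![0u8; FRAME_CAPACITY];
    let mut out = Vec::new();
    loop {
        match reader.read(&mut frame) {
            Ok(0) => break, // end of input
            Ok(n) => out.extend_from_slice(&frame[..n]),
            // An interrupted read is not fatal; try again.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(out)
}
```

Calling read_first_frame on a 20,000-byte in-memory reader returns only the first 8,192 bytes, while read_all_frames returns all of them. The second function fixes the truncation but not the memory concern, since the whole archive still ends up in one buffer.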

fussybeaver commented 9 months ago

Yes, I believe it's bounded by 8k due to tokio's default Framing configuration ... happy to take any PR adjustments to make this better.

Re. your second point, it got changed to Bytes because Hyper changed Body to become less dependent on Tokio. Is there an obvious streaming byte implementation to use here, one that also doesn't use Tokio?

x1a0b0 commented 9 months ago

Maybe just use http-body-util's StreamBody
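
To make the streaming idea concrete without depending on any runtime, here is a minimal standard-library sketch of the shape such a body would consume: a sequence of bounded chunks produced on demand. It does not use http-body-util or bollard; ChunkIter is a hypothetical helper, and adapting it to a real streaming body type is left as an assumption.

```rust
use std::io::{self, Read};

/// Hypothetical helper: yields a reader's content as chunks of at most
/// `chunk_size` bytes, so only one chunk is held in memory at a time.
struct ChunkIter<R: Read> {
    reader: R,
    chunk_size: usize,
    done: bool,
}

impl<R: Read> ChunkIter<R> {
    fn new(reader: R, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        ChunkIter { reader, chunk_size, done: false }
    }
}

impl<R: Read> Iterator for ChunkIter<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let mut chunk = vec![0u8; self.chunk_size];
        loop {
            match self.reader.read(&mut chunk) {
                // End of input: stop iterating.
                Ok(0) => {
                    self.done = true;
                    return None;
                }
                // Got data: hand out exactly the bytes that were read.
                Ok(n) => {
                    chunk.truncate(n);
                    return Some(Ok(chunk));
                }
                // An interrupted read is not fatal; try again.
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                // Any other error ends the iteration after reporting it.
                Err(e) => {
                    self.done = true;
                    return Some(Err(e));
                }
            }
        }
    }
}
```

Each item is an io::Result, so a consumer can forward chunks one by one and stop at the first error. Because a new chunk is only read when the consumer asks for it, memory use stays near chunk_size regardless of the archive size.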

bensmidt commented 5 months ago

Hello! I'm currently trying to use import_image to import a docker image from a *.tar.gz file. I've been testing with the postgres image, creating a 'postgres.tar.gz' file using the docker save command provided by Docker's CLI. However, I continually receive the following error: "exit status 1: unpigz: skipping: : corrupted -- incomplete deflate data\n", despite docker load running perfectly through the CLI. I have tried other commonly used images and receive the same error.

Am I correct in assuming that the root cause is that bytes contains only part of the full tar file, as described in this issue? If so:

Are there any workarounds that you know of?
It seems you're close to a working solution in russeltg's pull request. How close are you, and do you know when this update will be generally available through the bollard crate?

fussybeaver / bollard

How to import images from tar using stream #380