Open GrigoryEvko opened 3 weeks ago
Also on some steps I ended up with containerd bug present in latest 1.7.22 version, but not in 2.0 rc it seems, it was reported multiple times recently.
Do you have a link for that one?
Also it seems related to the recent fix in nerdctl #3079 , because zstdchunked converter looks almost identical to pre pull request nerdctl code.
Interesting.
You are using nerdctl v1.7 (which does not have the fix mentioned). Can you try with nerdctl v2.rc and see if the problem is still there?
Also on some steps I ended up with containerd bug present in latest 1.7.22 version, but not in 2.0 rc it seems, it was reported multiple times recently.
Do you have a link for that one?
It's one of your issues actually (https://github.com/containerd/nerdctl/issues/3509 and a few similar ones).
Unfortunately I don't have logs accessible at the moment, I need to reproduce it all, but during nerdctl image convert at some point I started receiving msg="content digest sha256:..... not found" on top of the error messages, and then the context canceled errors. estargz converter and ctr-remote optimize work perfectly fine with gzip compression.
Interesting.
You are using nerdctl v1.7 (which does not have the fix mentioned). Can you try with nerdctl v2.rc and see if the problem is still there?
Oh I thought it's already in the release. But looking at the code, I think it wouldn't help because zstd:chunked converter is using the same problematic module from snapshotter repo https://github.com/containerd/nerdctl/blob/2a0b9f56274bbed704858853fc9f7ff62d9a4fab/pkg/cmd/image/convert.go#L42
I need zstd:chunked lazy loading specifically. With gzip, decompressing my 30 GB of layers uses 100% of the CPU and barely reaches the disk speed limit for a very prolonged time. With the regular zstd image, container startup takes approximately 8-10 minutes, depending on S3 throughput, and with estargz it's barely better: 5 minutes total for decompressing and writing to disk. Considering that I need most of my 50 GB uncompressed image for ML inference on the machine anyway, it would still be disk-bound in any case, but I'd like to optimize it as much as possible :/ Thanks!
@GrigoryEvko
I see.
So:
This is definitely a stargz issue (since it happens with ctr as well)
Maybe we can fix it here (just rewrite the func in nerdctl and bypass stargz converter), so, a few comments:
I need tests for this - at least a reproducer.
Is there a chance you could share one of these images? Or provide a way for me to build one, that would be close enough to what you have to trigger the issue?
Cc @ktock for stargz
I need tests for this - at least a reproducer.
Absolutely, you can use nvidia triton inference server image, it reproduces the error. I kinda realized why it was not reported previously - it works as intended on smaller images. Also I've got a bug in nerdctl v2.0.0-rc3 image convert not being able to find images pulled from dockerhub (even with a URL), only from other registries. Here's my log:
(base) ubuntu@ip-10-0-17-13:~$ sudo nerdctl --version
nerdctl version 2.0.0-rc.3
(base) ubuntu@ip-10-0-17-13:~$ sudo nerdctl image ls -a
REPOSITORY TAG IMAGE ID CREATED PLATFORM SIZE BLOB SIZE
ghcr.io/containerd/stargz-snapshotter 0.15.1-kind-zstd af433c8521cb 51 seconds ago linux/amd64 0B 487.9MB
ghcr.io/containerd/stargz-snapshotter 0.15.1-kind 77742284151e 3 minutes ago linux/amd64 1.121GB 494.9MB
vitess/lite latest e3a3ff311b0a 7 minutes ago linux/amd64 2.352GB 734.1MB
ubuntu latest 99c35190e22d 10 minutes ago linux/amd64 87.56MB 29.75MB
ubuntu jammy 0e5e4a57c249 12 minutes ago linux/amd64 87.51MB 29.54MB
nvcr.io/nvidia/tritonserver 24.10-vllm-python-py3 6c9dcf2dbe0d 21 minutes ago linux/amd64 21.51GB 13.43GB
(base) ubuntu@ip-10-0-17-13:~$ sudo nerdctl image convert --oci --zstdchunked vitess/lite:latest vitess/lite:zstdchunked
FATA[0000] image "vitess/lite:latest": not found
(base) ubuntu@ip-10-0-17-13:~$ sudo nerdctl image convert --oci --zstdchunked ghcr.io/containerd/stargz-snapshotter:0.15.1-kind ghcr.io/containerd/stargz-snapshotter:0.15.1-kind-zstd
sha256:af433c8521cbb3a38a2b7ccd2b11fc09b1451bbe28fb3bbfb1a3217cc42950df
(base) ubuntu@ip-10-0-17-13:~$ sudo nerdctl image ls -a
REPOSITORY TAG IMAGE ID CREATED PLATFORM SIZE BLOB SIZE
ghcr.io/containerd/stargz-snapshotter 0.15.1-kind-zstd af433c8521cb 3 seconds ago linux/amd64 0B 487.9MB
ghcr.io/containerd/stargz-snapshotter 0.15.1-kind 77742284151e 5 minutes ago linux/amd64 1.121GB 494.9MB
vitess/lite latest e3a3ff311b0a 9 minutes ago linux/amd64 2.352GB 734.1MB
ubuntu latest 99c35190e22d 12 minutes ago linux/amd64 87.56MB 29.75MB
ubuntu jammy 0e5e4a57c249 14 minutes ago linux/amd64 87.51MB 29.54MB
nvcr.io/nvidia/tritonserver 24.10-vllm-python-py3 6c9dcf2dbe0d 23 minutes ago linux/amd64 21.51GB 13.43GB
(base) ubuntu@ip-10-0-17-13:~$ sudo nerdctl image convert --oci --zstdchunked nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3 nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3-zstdchunked
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:f5f79ac10bb874bdbe60f05aefdf89d24c8f07b24910dbd787b9ee4cfd390565 17408 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:176c746bdb5ad24a387e0d855c44bd57391d7c33a2bad8e19d4aced54bea5a00 71680 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:66450b4ef0ee9891dd2b44a9c947bee0db15b50863cc69a80a98a8c74ba7abf8 8704 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:87e1348a15f93372a287356a2c98836d061f33b6bd6d768ef42360e6b5f62630 341504 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef 1024 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:0b6a520db613be9ef2d808547aefba361788a92f82ccaa532fa3b2895f94debc 151040 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:98734bf94d2bbd3b2d3b3032ab41bea0ee1ad76db24ca128ed1d866f3df6ff8b 2560 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:6c75d6484379aa51f50d3e6a3c1f0b7acc2364aed0b9fe643224ce3134c970f3 26112 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:94236b11b2863870accf74d3d01d44e6095f7550e18451e75a1e6ca26642355a 62976 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:b8ff3f71e1363ccb2bf7e69e1fcafc48ea021e939037d2ef6782c0729e114fd1 11264 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:b05947e518f595236486c35a836848c01ccd7c06539a481d2084667e8288ddf4 3584 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef 1024 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:caf07e7743c0eb80a8a7ac78b631cd93b73f96e2d1a1dabe4d9ae7a9b922d24b 3072 [] map[] [] <nil> }"
FATA[0000] ref default/1/convert-zstdchunked-from-sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 locked for 103.633334ms (since 2024-10-31 18:47:12.269280827 +0000 UTC m=+486.719742327): unavailable
To reproduce, run sudo nerdctl pull nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3 and then try to convert the image.
So it looks like the issue is some kind of size limit, because the conversion gets canceled almost instantly, about 100 ms after launching, rather than failing partway through.
I hope it helps!
@GrigoryEvko on my side, I did a quick read of the source. It looks like Build does require a SectionReader (https://github.com/containerd/stargz-snapshotter/blob/a6b9bdb5a9e113277fa213e002e65bf1a761509c/estargz/build.go#L153), although it is not clear why that is useful compared to just a Reader, especially if the content is not compressed. Furthermore, there is a fair amount of filesystem back and forth (https://github.com/containerd/stargz-snapshotter/blob/a6b9bdb5a9e113277fa213e002e65bf1a761509c/estargz/build.go#L652). Finally, it seems that after compression we decompress the result (same here, not sure why; it appears we want the size of the uncompressed content: https://github.com/containerd/stargz-snapshotter/blob/a6b9bdb5a9e113277fa213e002e65bf1a761509c/nativeconverter/zstdchunked/zstdchunked.go#L166 ).
Anyhow, there is possibly room for improvement here.
Addendum: estargz.Build does need a ReaderAt for good reasons, although it is not clear how the current implementation could scale to that size (30G+).
Also I've got a bug in nerdctl v2.0.0-rc3 image convert not being able to find images pulled from dockerhub (even with a URL), only from other registries.
I will look into this one.
PR incoming for this specifically: https://github.com/containerd/nerdctl/pull/3626
So:
go build -o /tmp/nerdctl_s ./cmd/nerdctl/ && /tmp/nerdctl_s --debug-full image convert --oci --zstdchunked nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3 nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3-zstdchunked
DEBU[0000] stateDir: /run/user/501/containerd-rootless
DEBU[0000] RootlessKit detach-netns mode: true
DEBU[0000] rootless parent main: executing "/usr/bin/nsenter" with [-r/ -w/Users/dmp/Projects/go/nerd/nerdctl --preserve-credentials -m -U -t 1665 -F /tmp/nerdctl_s --debug-full image convert --oci --zstdchunked nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3 nvcr.io/nvidia/tritonserver:24.10-vllm-python-py3-zstdchunked]
DEBU[0000] using igzip for decompression
DEBU[0000] zstdchunked: uncompressed sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 into sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef
DEBU[0000] zstdchunked: uncompressed sha256:6da051397311aff5f9d1d8b5c75afa073318d65f9251b7665e909a1066c809f9 into sha256:caf07e7743c0eb80a8a7ac78b631cd93b73f96e2d1a1dabe4d9ae7a9b922d24b
DEBU[0000] zstdchunked: uncompressed sha256:e6f5e18001c21008ddf1f80699663abbce296ae8311b9bd76e39f63ad746ec43 into sha256:f5f79ac10bb874bdbe60f05aefdf89d24c8f07b24910dbd787b9ee4cfd390565
DEBU[0000] zstdchunked: uncompressed sha256:5738d44ce3f25fe9275b54fd5e24d0b26d6b404ffa57de69735d51778224afe8 into sha256:0b6a520db613be9ef2d808547aefba361788a92f82ccaa532fa3b2895f94debc
DEBU[0000] zstdchunked: uncompressed sha256:86abba0172c5ca2b6660f29e0a9e9602bfe45b42ec16e9e2d29d516f6ab20373 into sha256:176c746bdb5ad24a387e0d855c44bd57391d7c33a2bad8e19d4aced54bea5a00
DEBU[0000] zstdchunked: uncompressed sha256:c2ad6da399bae2b3351c82d04f0d0ef4139a834390e551e516cb1fba74f97df2 into sha256:6c75d6484379aa51f50d3e6a3c1f0b7acc2364aed0b9fe643224ce3134c970f3
DEBU[0000] zstdchunked: uncompressed sha256:9ded5c3415695be1951b6f058e20d3363003b53a2658340c5f15d81856ad0e98 into sha256:b05947e518f595236486c35a836848c01ccd7c06539a481d2084667e8288ddf4
DEBU[0000] zstdchunked: uncompressed sha256:b9b0caed1c8c12f12dfad67a5ea1d7432c4271672cb1de4270c4867347a937f8 into sha256:94236b11b2863870accf74d3d01d44e6095f7550e18451e75a1e6ca26642355a
DEBU[0000] zstdchunked: uncompressed sha256:c9cc852679cb7fe38fcf5665287edaa9d99f14b1e1cec65f67cdefd53ff2f9e0 into sha256:87e1348a15f93372a287356a2c98836d061f33b6bd6d768ef42360e6b5f62630
DEBU[0000] zstdchunked: uncompressed sha256:4790d1bdaaa8b59802f26526c2adac543f9fa7bf765f340e7e981e0a4f845d54 into sha256:98734bf94d2bbd3b2d3b3032ab41bea0ee1ad76db24ca128ed1d866f3df6ff8b
DEBU[0000] zstdchunked: uncompressed sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 into sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef
DEBU[0000] zstdchunked: uncompressed sha256:8c975606e87f18f1f237f9bfe68b58786e20d518a493d96780e73a4dcc408a21 into sha256:66450b4ef0ee9891dd2b44a9c947bee0db15b50863cc69a80a98a8c74ba7abf8
DEBU[0000] zstdchunked: uncompressed sha256:34c2dbdcbc81ceac35887d59172aa654fd08bae28a3c41ad238152170c73ae91 into sha256:b8ff3f71e1363ccb2bf7e69e1fcafc48ea021e939037d2ef6782c0729e114fd1
DEBU[0000] zstdchunked: uncompressed sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 into sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:b8ff3f71e1363ccb2bf7e69e1fcafc48ea021e939037d2ef6782c0729e114fd1 11264 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:66450b4ef0ee9891dd2b44a9c947bee0db15b50863cc69a80a98a8c74ba7abf8 8704 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:caf07e7743c0eb80a8a7ac78b631cd93b73f96e2d1a1dabe4d9ae7a9b922d24b 3072 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:176c746bdb5ad24a387e0d855c44bd57391d7c33a2bad8e19d4aced54bea5a00 71680 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef 1024 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:98734bf94d2bbd3b2d3b3032ab41bea0ee1ad76db24ca128ed1d866f3df6ff8b 2560 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:6c75d6484379aa51f50d3e6a3c1f0b7acc2364aed0b9fe643224ce3134c970f3 26112 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:87e1348a15f93372a287356a2c98836d061f33b6bd6d768ef42360e6b5f62630 341504 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef 1024 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:b05947e518f595236486c35a836848c01ccd7c06539a481d2084667e8288ddf4 3584 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:f5f79ac10bb874bdbe60f05aefdf89d24c8f07b24910dbd787b9ee4cfd390565 17408 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:94236b11b2863870accf74d3d01d44e6095f7550e18451e75a1e6ca26642355a 62976 [] map[] [] <nil> }"
WARN[0000] failed to remove tmp uncompressed layer error="context canceled" uncompressedDesc="&{application/vnd.docker.image.rootfs.diff.tar sha256:0b6a520db613be9ef2d808547aefba361788a92f82ccaa532fa3b2895f94debc 151040 [] map[] [] <nil> }"
FATA[0000] ref default/1/convert-zstdchunked-from-sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 locked for 23.340521ms (since 2024-10-31 22:29:51.884491233 +0100 CET m=+28191.745890060): unavailable
4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 is being uncompressed multiple times, concurrently.
I will assume this is safe to do (albeit it seems wasteful), but more importantly, we defer deletion of the uncompressed layer inside the converter function, and that deletion may well happen before other calls for the same desc are done.
This looks like the culprit ^.
Using mutexes to prevent the same desc from being processed in parallel does the trick, although this is ugly.
nerdctl images
REPOSITORY TAG IMAGE ID CREATED PLATFORM SIZE BLOB SIZE
nvcr.io/nvidia/tritonserver 24.10-vllm-python-py3-zstdchunked ba4109a4c485 22 seconds ago linux/amd64 0B 12.26GB
nvcr.io/nvidia/tritonserver 24.10-vllm-python-py3 6c9dcf2dbe0d 2 hours ago linux/amd64 21.51GB 13.43GB
I sent a PR here on nerdctl, but I am not convinced this is enough and there may be more issues at play here (containerd gc-ing layers?). Furthermore, it should probably go to stargz instead.
@ktock feel free to carry the PR over or use the info here to write a better patch on stargz.
Ohh, great!! Thank you so much!
Can we instead decompress all layers only once? I think it's how it was supposed to be implemented and some concurrency leaked into it.
Anyway really impressed by the quick fix! Thanks again😄
I tried a few approaches - notably, with a ref counter and storing the uncompressed desc in the map. The extra complexity is not worth it IMHO. Furthermore, we are constrained by containerd methods design.
Anyhow, peeps at stargz will probably have better ideas than me on this.
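To make the ref-counter idea mentioned above a bit more concrete, here is a minimal Go sketch. It is not what was tried in nerdctl and it ignores the containerd converter API constraints mentioned above; refCache, acquire, release and the string value standing in for the uncompressed descriptor are all hypothetical. The first caller for a digest creates the uncompressed blob once, later callers reuse it, and cleanup only runs when the last user releases it.

```go
package main

import (
	"fmt"
	"sync"
)

// entry tracks one shared uncompressed blob (represented by a string here).
type entry struct {
	once  sync.Once
	value string
	err   error
	refs  int
}

// refCache is a hypothetical ref-counted map keyed by layer digest.
type refCache struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newRefCache() *refCache {
	return &refCache{entries: make(map[string]*entry)}
}

// acquire registers one user of key and returns the shared value,
// running create at most once per live entry. Every acquire must be
// paired with a release, even when an error is returned.
func (c *refCache) acquire(key string, create func() (string, error)) (string, error) {
	c.mu.Lock()
	e, ok := c.entries[key]
	if !ok {
		e = &entry{}
		c.entries[key] = e
	}
	e.refs++
	c.mu.Unlock()

	e.once.Do(func() { e.value, e.err = create() })
	return e.value, e.err
}

// release drops one user of key; the last release removes the entry
// and runs cleanup (only if creation had succeeded).
func (c *refCache) release(key string, cleanup func(value string)) {
	c.mu.Lock()
	e, ok := c.entries[key]
	if !ok {
		c.mu.Unlock()
		return
	}
	e.refs--
	last := e.refs == 0
	if last {
		delete(c.entries, key)
	}
	c.mu.Unlock()

	if last && e.err == nil {
		cleanup(e.value)
	}
}

func main() {
	c := newRefCache()
	create := func() (string, error) {
		fmt.Println("uncompressing once")
		return "uncompressed-blob", nil
	}
	cleanup := func(v string) { fmt.Println("removing", v) }

	// Two users of the same digest share a single uncompressed blob.
	c.acquire("sha256:aaa", create)
	c.acquire("sha256:aaa", create)
	c.release("sha256:aaa", cleanup)
	c.release("sha256:aaa", cleanup) // last user: cleanup runs here
}
```

Compared with a plain per-digest mutex, this avoids repeated decompression, at the price of the extra bookkeeping that the comment above judged not worth it.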
Description
The issue seems to be related to the code imported from stargz-snapshotter, because it persists across both the nerdctl and stargz-snapshotter packages (reported here: https://github.com/containerd/stargz-snapshotter/issues/1842).
Using both packages (nerdctl with nerdctl image convert --zstdchunked --oci src target) results in similar errors.
Also on some steps I ended up with a containerd bug present in the latest 1.7.22 version, but not in 2.0 rc it seems; it was reported multiple times recently. Is it just me? I'm using Ubuntu 22.04 with containerd installed from the apt repo (but I also tried several tarball releases, nerdctl from the latest release, and stargz-snapshotter both from a release and built from the latest git repo).
Also it seems related to the recent fix in nerdctl https://github.com/containerd/nerdctl/pull/3079 , because zstdchunked converter looks almost identical to pre pull request nerdctl code.
Thanks in advance! Maybe I need to use dockerized environment for conversion? Maybe it's my giant (30GB) image? I use heavy ML docker with many python packages in layers and some models in layers as well, because it's the image for kubernetes autoscaling deployment with registry storage on s3.
Steps to reproduce the issue
No response
Describe the results you received and expected
Zstdchunked converter working as expected
What version of nerdctl are you using?
v1.7.7
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
No response