k3s-io / k3s

Lightweight Kubernetes
https://k3s.io
Apache License 2.0

Issue pulling only some images #2455

Closed calcium90 closed 3 years ago

calcium90 commented 3 years ago

Environmental Info: K3s Version: k3s version v1.19.3+k3s1 (974ad30b)

Node(s) CPU architecture, OS, and Version: Linux devkubewkr04 5.3.18-lp152.47-default #1 SMP Thu Oct 15 16:05:25 UTC 2020 (41f7396) x86_64 x86_64 x86_64 GNU/Linux

OpenSUSE Leap 15.2 Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz

Cluster Configuration: 3 masters (OpenSUSE Leap 15.2 virtual machine), 3 workers (OpenSUSE Leap 15.2 physical server)

Describe the bug: Some container images cannot be pulled and throw an error:

# crictl pull dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest
FATA[2020-10-29T11:22:19.246797520Z] pulling image: rpc error: code = InvalidArgument desc = failed to pull and unpack image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest": failed to prepare extraction snapshot "extract-245403249-iFq5 sha256:90e6fdc24f3e75b50ddcf1306ecec1b2acb8469d17b14f560d412fc9976f98ea": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument

Others do not experience an issue:

# crictl pull dockerregistry.xxx.co.uk:5000/xxx/calkube:latest
Image is up to date for sha256:27b05fcb50cc06674c654a3d1489d50e3471de6f4db41fa276b3b33eda0a5254

This happens on all nodes in the cluster, master and worker.

Steps To Reproduce:

Expected behavior: Pods are able to pull any specified image and run them.

Actual behavior: Pods are not able to pull some images and cannot start.

Additional context / logs: kubectl describe from an affected pod:

  Warning  Failed   69s                kubelet, sodevkubewkr04  Failed to pull image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest": rpc error: code = InvalidArgument desc = failed to pull and unpack image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest": failed to prepare extraction snapshot "extract-627202121-F4iI sha256:90e6fdc24f3e75b50ddcf1306ecec1b2acb8469d17b14f560d412fc9976f98ea": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument
  Warning  Failed   54s                kubelet, sodevkubewkr04  Failed to pull image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest": rpc error: code = InvalidArgument desc = failed to pull and unpack image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest": failed to prepare extraction snapshot "extract-806714390-_-Xo sha256:90e6fdc24f3e75b50ddcf1306ecec1b2acb8469d17b14f560d412fc9976f98ea": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument
  Normal   BackOff  43s (x2 over 68s)  kubelet, sodevkubewkr04  Back-off pulling image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest"
brandond commented 3 years ago

According to the log message, the image in question has an excessively long label:

failed to pull and unpack image "dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest": failed to prepare extraction snapshot "extract-245403249-iFq5 sha256:90e6fdc24f3e75b50ddcf1306ecec1b2acb8469d17b14f560d412fc9976f98ea": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument

Can you check the labels on this image and confirm that none of them violate this limit? Is there a public copy of the image with the same labels that I can look at?

calcium90 commented 3 years ago

Hi Brandon,

The image is a private one but I can provide the labels from docker inspect:

            "Labels": {
                "org.openbuildservice.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
                "org.opencontainers.image.created": "2020-06-22T18:36:27.710787637Z",
                "org.opencontainers.image.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
                "org.opencontainers.image.title": "openSUSE Leap 15.1 Base Container",
                "org.opencontainers.image.url": "https://www.opensuse.org/",
                "org.opencontainers.image.vendor": "openSUSE Project",
                "org.opencontainers.image.version": "15.1.3.164",
                "org.opensuse.base.created": "2020-06-22T18:36:27.710787637Z",
                "org.opensuse.base.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
                "org.opensuse.base.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
                "org.opensuse.base.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164",
                "org.opensuse.base.title": "openSUSE Leap 15.1 Base Container",
                "org.opensuse.base.url": "https://www.opensuse.org/",
                "org.opensuse.base.vendor": "openSUSE Project",
                "org.opensuse.base.version": "15.1.3.164",
                "org.opensuse.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164"
            }

Those seem to be picked up from the base image, which is opensuse/leap:15.1 (the one below is slightly newer, but the labels are essentially the same):

            "Labels": {
                "org.openbuildservice.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
                "org.opencontainers.image.created": "2020-10-25T15:27:05.553755960Z",
                "org.opencontainers.image.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
                "org.opencontainers.image.title": "openSUSE Leap 15.1 Base Container",
                "org.opencontainers.image.url": "https://www.opensuse.org/",
                "org.opencontainers.image.vendor": "openSUSE Project",
                "org.opencontainers.image.version": "15.1.3.210",
                "org.opensuse.base.created": "2020-10-25T15:27:05.553755960Z",
                "org.opensuse.base.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
                "org.opensuse.base.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
                "org.opensuse.base.reference": "registry.opensuse.org/opensuse/leap:15.1.3.210",
                "org.opensuse.base.title": "openSUSE Leap 15.1 Base Container",
                "org.opensuse.base.url": "https://www.opensuse.org/",
                "org.opensuse.base.vendor": "openSUSE Project",
                "org.opensuse.base.version": "15.1.3.210",
                "org.opensuse.reference": "registry.opensuse.org/opensuse/leap:15.1.3.210"
            }

Nothing stands out to me as being unusual here and I can indeed use the opensuse/leap:15.1 image and others that are built on it in the cluster just fine.
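To back up that reading, here is a minimal Python sketch that measures a few of the labels above. It assumes the limit applies to the combined length of key and value, which is how the error message reads; the `label_sizes` helper is hypothetical and only a subset of the labels is included.

```python
# Sketch: measure each image label as len(key) + len(value).
# Assumption: the 4096-byte limit applies to key and value combined,
# as the wording of the error message suggests.
MAX_LABEL_SIZE = 4096  # limit quoted in the error message

# A subset of the labels shown above, copied from the docker inspect output.
labels = {
    "org.openbuildservice.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
    "org.opencontainers.image.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
    "org.opensuse.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164",
}


def label_sizes(labels):
    """Return (key, size) pairs sorted from largest to smallest."""
    sizes = [(key, len(key) + len(value)) for key, value in labels.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


for key, size in label_sizes(labels):
    status = "over limit" if size > MAX_LABEL_SIZE else "ok"
    print(f"{size:5d}  {status:10s}  {key}")
```

Every label here comes out far below the limit, which suggests the oversized label is not one of the labels stored in the image config.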

brandond commented 3 years ago

The interesting thing about the error message is that it shows an empty key for the label. I don't see any empty keys in the output you shared. The message is coming from here: https://github.com/rancher/k3s/blob/v1.19.3+k3s1/vendor/github.com/containerd/containerd/labels/validate.go#L34

Can you try a tool like skopeo to inspect the image directly in the registry?

skopeo inspect docker://dockerregistry.xxx.co.uk:5000/xxx/r-session-base:latest --config | jq '.config.Labels'
calcium90 commented 3 years ago

Here's the output using skopeo:

~> skopeo inspect docker://dockerregistry.x.co.uk:5000/x/r-session-base:latest --config | jq '.config.Labels'
{
  "org.openbuildservice.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
  "org.opencontainers.image.created": "2020-06-22T18:36:27.710787637Z",
  "org.opencontainers.image.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
  "org.opencontainers.image.title": "openSUSE Leap 15.1 Base Container",
  "org.opencontainers.image.url": "https://www.opensuse.org/",
  "org.opencontainers.image.vendor": "openSUSE Project",
  "org.opencontainers.image.version": "15.1.3.164",
  "org.opensuse.base.created": "2020-06-22T18:36:27.710787637Z",
  "org.opensuse.base.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
  "org.opensuse.base.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
  "org.opensuse.base.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164",
  "org.opensuse.base.title": "openSUSE Leap 15.1 Base Container",
  "org.opensuse.base.url": "https://www.opensuse.org/",
  "org.opensuse.base.vendor": "openSUSE Project",
  "org.opensuse.base.version": "15.1.3.164",
  "org.opensuse.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164"
}

So there doesn't seem to be a difference there from the docker inspect. It's most puzzling as we don't add our own labels at any point, so all of these are straight from SUSE.

To give more background on the image: it is actually the result of several Dockerfiles building on each other. Everything from 'suse-base' onwards is ours and in our private repo:

opensuse/leap:15.1 -> suse-base -> r-base -> r-packages -> r-generic -> r-session-base

I was concerned it was some complication of having such a hierarchy, but I've traced the issue all the way up to r-base: it does not happen with suse-base, but it does with r-base and everything built from it. Here is the Dockerfile for r-base:

FROM dockerregistry.x.co.uk:5000/x/suse-base:latest

## System packages installation
RUN zypper --non-interactive ref
RUN zypper --non-interactive in gcc gcc-c++ gcc-fortran
RUN zypper --non-interactive in gcc7 gcc7-c++ gcc7-fortran libgfortran4
RUN zypper --non-interactive in --force-resolution gcc8 gcc8-fortran gcc8-c++
RUN zypper --non-interactive in pandoc
RUN zypper --non-interactive in libicu-devel libcurl-devel zlib-devel libzip-devel
RUN zypper --non-interactive in R-base-devel
RUN zypper --non-interactive in openssl libopenssl-devel openssh
RUN zypper --non-interactive in blaze3.3-devel
RUN zypper --non-interactive in blaze3.4-devel
RUN zypper --non-interactive in blaze3.5-devel

RUN zypper --non-interactive in --force-resolution boost_1_66-devel libboost_python-py2_7-1_66_0-devel libboost_python-py3-1_66_0-devel boost_numeric_bindings-devel
RUN zypper --non-interactive in libboost_numpy-py3-1_66_0-devel
RUN zypper --non-interactive in python3-devel python3-pip python3-virtualenv
RUN zypper --non-interactive in --force-resolution libssh2-1 libssh2-1-32bit libssh2-devel cyrus-sasl-devel libxslt-devel
RUN zypper --non-interactive in libxml2-2 libxml2-2-32bit libxml2-devel libxml2-tools python-libxml2
RUN zypper --non-interactive in freetds-devel libQt5Sql5-unixODBC libtdsodbc0 mysql-connector-odbc unixODBC unixODBC-32bit unixODBC-devel
RUN zypper --non-interactive in cmake cmake_modules make
RUN zypper --non-interactive in python-devel python-pip
RUN zypper --non-interactive in hunspell hunspell-tools ispell ispell-british ispell-american
RUN zypper --non-interactive in myspell-dictionaries myspell-en myspell-en_GB
RUN zypper --non-interactive in aspell aspell-en aspell-devel ispell ispell-british ispell-american
RUN zypper --non-interactive in lapack-devel liblapack3 cblas-devel blas-devel libblas3
RUN zypper --non-interactive in libopenblas_serial-devel libopenblas_serial0 openblas-devel openblas-devel-headers
RUN zypper --non-interactive in java-1_8_0-openjdk
RUN zypper --non-interactive in java-11-openjdk java-11-openjdk-devel java-11-openjdk-headless
RUN zypper --non-interactive in libMagick++-7_Q16HDRI4 libMagickCore-7_Q16HDRI6 libMagickCore-7_Q16HDRI6 ImageMagick libMagick++-devel
RUN zypper --non-interactive in pbzip2
RUN zypper --non-interactive in texlive-framed texlive-pdftex texlive-pdftex-bin texlive-pdftex-fonts texlive-latex-bin-bin texlive-inconsolata texinfo texlive-collection-fontsrecommended texlive-collection-latexrecommended texlive-xetexconfig texlive-xetex-bin texlive-xetex texlive-ifxetex texlive-mathspec
RUN zypper --non-interactive in curl libcurl-devel
RUN zypper --non-interactive in libpoppler-devel
RUN zypper --non-interactive in qpdf qpdf-devel
RUN zypper --non-interactive in lftp
RUN zypper --non-interactive in udunits2-devel gmp-devel gdal libproj-devel geos-devel libapparmor-devel libgit2-devel
RUN zypper --non-interactive in gzip
RUN zypper --non-interactive in glu-devel libQt5OpenGL5
RUN zypper --non-interactive --no-gpg-checks in google-chrome-stable
RUN zypper --non-interactive --no-gpg-checks in ghc-pandoc-citeproc-devel
RUN zypper --non-interactive in freetype2-devel libfreetype6 libfreetype6-32bit
RUN zypper --non-interactive in libtiff-devel libtiff5 libtiff5-32bit libgeotiff2
RUN zypper --non-interactive in fribidi-devel
RUN zypper --non-interactive in harfbuzz-devel

#for Rfast
RUN zypper --non-interactive in gsl-devel

# for rJava
RUN zypper --non-interactive in pcre2-devel
RUN R CMD javareconf

# Install Julia
RUN zypper --non-interactive in julia julia-devel

## change blas and cblas to openblas
RUN update-alternatives --set libblas.so.3 /usr/lib64/libopenblas_serial.so.0
RUN update-alternatives --set libcblas.so.3 /usr/lib64/libopenblas_serial.so.0
RUN update-alternatives --set liblapack.so.3 /usr/lib64/libopenblas_serial.so.0

## add R config
RUN mkdir -p /root/.R
COPY Makevars /root/.R/Makevars
COPY Rprofile /root/.Rprofile
COPY Renviron /root/.Renviron
COPY odbc.ini /root/.odbc.ini

And here's the skopeo output for that:

~> skopeo inspect docker://dockerregistry.x.co.uk:5000/x/r-base:latest --config | jq '.config.Labels'
{
  "org.openbuildservice.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
  "org.opencontainers.image.created": "2020-06-22T18:36:27.710787637Z",
  "org.opencontainers.image.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
  "org.opencontainers.image.title": "openSUSE Leap 15.1 Base Container",
  "org.opencontainers.image.url": "https://www.opensuse.org/",
  "org.opencontainers.image.vendor": "openSUSE Project",
  "org.opencontainers.image.version": "15.1.3.164",
  "org.opensuse.base.created": "2020-06-22T18:36:27.710787637Z",
  "org.opensuse.base.description": "Image containing a minimal environment for containers based on openSUSE Leap 15.1.",
  "org.opensuse.base.disturl": "obs://build.opensuse.org/openSUSE:Leap:15.1:Images/images/740264e3294afe7ca32a3ea9deb863d2-opensuse-leap-image:docker",
  "org.opensuse.base.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164",
  "org.opensuse.base.title": "openSUSE Leap 15.1 Base Container",
  "org.opensuse.base.url": "https://www.opensuse.org/",
  "org.opensuse.base.vendor": "openSUSE Project",
  "org.opensuse.base.version": "15.1.3.164",
  "org.opensuse.reference": "registry.opensuse.org/opensuse/leap:15.1.3.164"
}
calcium90 commented 3 years ago

I've reproduced the same problem on Ubuntu 18.04 in my home lab on k3s v1.19.3, but it does not happen on k3s v1.18.10, where I can pull the image with no issues. k3s v1.18.10 ships containerd 1.3.3-k3s2 and v1.19.3 ships containerd 1.4.0-k3s1, so perhaps something changed between those containerd versions, or there is a breaking change I'm not aware of.

brandond commented 3 years ago

Can you make available the image you reproduced with in your home lab?

calcium90 commented 3 years ago

I couldn't make that image available, but I've now replicated the issue by building an image from opensuse/leap:15.1. I ran into it while trying to replicate what we do in our builds. One thing that stood out was the large number of RUN commands, so I ended up making a Dockerfile like this:

FROM opensuse/leap:15.1
RUN zypper --non-interactive ref
RUN touch /tmp/test1
RUN touch /tmp/test1
RUN touch /tmp/test1
...
...

(the RUN is repeated 60 times)
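For anyone who wants to regenerate a Dockerfile of this shape rather than paste the line by hand, here is a minimal Python sketch. The `make_dockerfile` helper and its parameters are hypothetical; only the FROM and RUN lines come from the example above.

```python
def make_dockerfile(run_count, base_image="opensuse/leap:15.1"):
    """Build the text of a Dockerfile with run_count identical RUN lines."""
    lines = [f"FROM {base_image}", "RUN zypper --non-interactive ref"]
    # Each RUN instruction adds one more layer to the resulting image.
    lines += ["RUN touch /tmp/test1"] * run_count
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 60 is the approximate count used above; adjust to probe the threshold.
    print(make_dockerfile(60), end="")
```

Redirecting the output to a file named Dockerfile and building it should give an image comparable to the one described here.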

The magic number seemed to be about 60 RUN lines, at which point the error appears. You should now be able to replicate it yourself:

# crictl pull cjp2k20/test
FATA[2020-11-02T18:00:47.928115287Z] pulling image: rpc error: code = InvalidArgument desc = failed to pull and unpack image "docker.io/cjp2k20/test:latest": failed to prepare extraction snapshot "extract-926560404-qhh0 sha256:42add5a4c91d6ec9dce019f7675367ea9736d9a4156cae704bb068865e1f45ac": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument

Is there a limit on the number of layers in containerd that I'm not aware of?

brandond commented 3 years ago

I don't believe there is any such limit. This will probably require a fix to containerd upstream, if it hasn't already been resolved.

brandond commented 3 years ago

OK, so I added some debug prints to the containerd code, and it looks like it injects some labels into the image - including one that is a comma-separated list of layers, apparently as a way to pass the layer list to the snapshotter implementation: https://github.com/rancher/k3s/blob/master/vendor/github.com/containerd/cri/pkg/server/image_pull.go#L472

This would obviously present a problem if you have a bunch of RUN commands, since each command creates a layer.

level=info msg="Validating label key=\"containerd.io/gc.ref.content.l.5\" value=\"sha256:ec50b3897e653ab6cbf0013529321eb469725debc7859bc6ea24de3fa2889ddd\" len=103"
level=info msg="Validating label key=\"containerd.io/distribution.source.docker.io\" value=\"cjp2k20/test\" len=55"
level=info msg="Validating label key=\"containerd.io/gc.ref.content.l.22\" value=\"sha256:6c88863eba12098f17bec7bdf5923f0f1ef6fdeb106141ec388f7f3ea2e9b8ac\" len=104"
level=info msg="Validating label key=\"containerd.io/distribution.source.docker.io\" value=\"cjp2k20/test\" len=55"
level=info msg="Validating label key=\"containerd.io/distribution.source.docker.io\" value=\"cjp2k20/test\" len=55"
level=info msg="Validating label key=\"containerd.io/snapshot/cri.layer-digest\" value=\"sha256:76bc64d3eb135cd64743a6cb8cd076ad4f464e9db2f4d27e660d0ca523afd5d2\" len=110"
level=info msg="Validating label key=\"containerd.io/snapshot/cri.image-layers\" value=\"sha256:76bc64d3eb135cd64743a6cb8cd076ad4f464e9db2f4d27e660d0ca523afd5d2,sha256:2276c01aa4f235cd926b4bf9e820de7a6d72dc5f9d7afba898d90346c5d0608e,sha256:eee111f60e4b218527a676f37b8d82b8746d2568caebf1ae7397fb7981a1fce3,sha256:eee111f60e4b218527a676f37b8d82b8746d2568caebf1ae7397fb7981a1fce3,sha256:bf0b3a48539e3895c65d14d64178a2ba7cb778fe34deccabaf652f58e73ed37b,sha256:ec50b3897e653ab6cbf0013529321eb469725debc7859bc6ea24de3fa2889ddd,sha256:5fdcfdc920630372085cf333ecc110613f40ddadd0f7b3aea9894857b957ee3c,sha256:0983687e97ff2031fe268b41d7e7bbb40bac7a37907efc052a24908cdd36ffc3,sha256:c37b79910e6083085f5ad7948dec984048b8f032e84a5f425add0d0896289b23,sha256:6e18c4f0dded192e8ee3a13efb3f8e95ec44a71850ec44256e434ffdf275458c,sha256:1787c5686fad663e894e884cbb101bbe14ed72bb16e45faae19c26d426236b19,sha256:862e20d856e252e3d87acf209489df35bc89187b624ed05c842335314c5b9874,sha256:e8f91369c66ec52a2c56925d87618ea6d06745b6a34606e07c9e11f97ce75710,sha256:54c0903436faeeb9bb5f5384b7c8758ce64d79c1d902a98716276c19821829ea,sha256:b98495e57157c50de57a179548bfe364f4348c1c1366de6fb48cd8d0d37088d7,sha256:4354b0961161346954b70a80d8c132664d43c006cac28aac6c3c3850391b860a,sha256:0c2cae59d7bf4272b510f850ab6204f146cff698018d2d4bff9fd905a86eb8b0,sha256:da53ff47cce20c82937f57941a0ade9c03cf391a61234cd41ed92cc75ddd44bf,sha256:df8d1f90eb6e3515edd8dbf61d9ad1cf54419dd8827eff539578d35ae96bb015,sha256:7b212b1bcf1926e6b0dc56a982e7f3fdb56caa6bb345ab32ca017bc558bad9f9,sha256:856934484ef0f31c7211cf2ac2b461366bb42c85c444bbab1623fa528e23ef23,sha256:8b31695df9aea82102b30a7668d7a5e4c067790b10ce148d95b2f61e8b014b8a,sha256:6c88863eba12098f17bec7bdf5923f0f1ef6fdeb106141ec388f7f3ea2e9b8ac,sha256:c8ff28d1bf3dff30bc5294b57b762099040371a9a486940b8e1188997934e99f,sha256:e09f4e5acbac7dc9d3aef6c56d7b140e5137a14635aa4baef125a2a8d7d54483,sha256:258a01c7b7ce7d68c97c62b7cd3c3c6810ee48fc8addf3612a3686b671e0e5a0,sha256:c2f750a156c3f7fe0e5636eb19dab199
bb7b75ccbe0dd40f976b165bdce88879,sha256:f7a03feeb19fbc38e58da934e3a63b4c354c49566e5e9975d8daadb78e59c338,sha256:5c9f3363104a3934ab33806ab425b41bb70f6e307e04547bfdb348fcdaae02d5,sha256:a90794775d2b26efe1ac46bd34cc44e75c104004f99222708f2f505019d66e77,sha256:48bb3a8f24530de6239486b6c0dc6c89154d9e7736d7a128788e1d1330a174a9,sha256:ed8f0fe8e54598364cf7a0678237b96c338ebdc946a622f60cabfea8e6e8c4b3,sha256:9585d6900d308f1a2f6998bab068217d05dd4d684fb67439f2d2ca49d5fbb497,sha256:72e5b5046673ed8bfc068a74d36dcee781e0bad75d5fa8b65fcb6d7cd1652960,sha256:f80525339f5607442fcdec9f7e2af72822109352ecb9d9077b2e19dda5ff24be,sha256:2b09c14167fe7fadec94099c5a002bfdc121d7d721935ec8d86af53acbf7f4b1,sha256:dd4c8d72c08c42dac93f024695f39f306d92cfaa5aafe3d2360d34c8daae6c61,sha256:25782c4e431afbd20f6ffe46554ee1ab8eed622bf958cb2b60c092e5d300dbe3,sha256:c02cb30c1153fac02a3f99f7129e25b73f01679236d495e6974bdc1e3d595f70,sha256:c02cb30c1153fac02a3f99f7129e25b73f01679236d495e6974bdc1e3d595f70,sha256:7deedbbee44018815e61298d93c8ab6c1afe70c7a1a73d1f7f85e747a3018ccd,sha256:e5e4bf80304e34c9414f0faeaa96a8c1a9341776f30030052e9049e80015b246,sha256:d2ab38929f28c273b842e5c376e90749048eef69eeec56f0822b9dc0a6da7051,sha256:7b3c48608c0fb19916672e6c108c83f37f4776f1a8564eab0012c0dedfb4253f,sha256:11c1f0badd2b399d9318b9c803a6637ce073c59d5adc8bc0990b18fad17f0efa,sha256:83e5e37510dda12ede81e042b2741546d2dbf5bb6564ea092701e213db43439c,sha256:83e5e37510dda12ede81e042b2741546d2dbf5bb6564ea092701e213db43439c,sha256:2f94495da9876c574e6395b7e63b7dd6dedff21c5d6d0e034713f63e0dd9bea0,sha256:121ade8995eda5ee503ffd19c6e09540ecf2624f546674941b939c160a535abe,sha256:bda7a0d2dec7d51993399c0cd70e0088f17fce1d42706f2c4b7bef307255d634,sha256:8d25ebd04f981d81e24206afe8bc89c63daec90e3e2d342f01c3460f4b616c5e,sha256:e1925dbd542e45ee40e9c69685a7f9a4b462c6133a98d9bb49c2ad244acfc8e4,sha256:0938e4f0669ccf3cec297aa1c2f1a0b1e551435975cb3aba8d91a9e418d47176,sha256:74cb6a36aedfdb8e0e419ec57736e06ce36805ea341f301779573d1a178237a0,sha256:2ecf817660ce8ebe
66143f7fbd3082890b8f7eace10ae92a45c978f9cc347b53,sha256:3c0d6a692cf2e61ff3918e7149b664ce9e5e25fdb3be2a3c76f7124c378d8bbb,sha256:e4252c0146e11218c98853f2ba4fdef6733ebbac49af3eb437910b1e15317fdf,sha256:ee135058de93ca886323cea2115e9397325a8b0dfd820f0175ccd8ab27e83c60,sha256:4592e15f81b7634679a85a71b27e14823ba360deb68af0d1cab0d6bcb99dd757,sha256:6379eee45220a9a6638ea1460678602349248c29d6681734cdf5135175ea5d50,sha256:9f87e2c15173e03ed57d08c1bed6c2f7e1d60f59cf20cce782241a4c506c74b0\" len=4430"
level=error msg="PullImage \"docker.io/cjp2k20/test:latest\" failed" error="rpc error: code = InvalidArgument desc = failed to pull and unpack image \"docker.io/cjp2k20/test:latest\": failed to prepare extraction snapshot \"extract-184956284-yQG1 sha256:42add5a4c91d6ec9dce019f7675367ea9736d9a4156cae704bb068865e1f45ac\": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument"

This should probably be handled better - in particular, the label key is truncated to 10 characters for the log message which makes it really difficult to figure out what's going on. I'll take a look at opening an issue upstream, but I imagine their immediate recommendation will be to follow best practices and minimize the number of image layers by combining your RUN commands.
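The arithmetic behind the failure can be sketched in Python. This is a model of the check as described in this thread, not containerd's actual code: it assumes the limit applies to len(key) + len(value), that each digest is "sha256:" followed by 64 hex characters, and that digests are joined with single commas. The helper names are hypothetical.

```python
MAX_SIZE = 4096                      # limit quoted in the error message
DIGEST_LEN = len("sha256:") + 64     # "sha256:" plus 64 hex characters
LAYERS_KEY = "containerd.io/snapshot/cri.image-layers"


def validate_label(key, value):
    """Return an error string if key + value is too large, else None."""
    if len(key) + len(value) > MAX_SIZE:
        # The key is cut to 10 characters, as seen in the error message.
        return (f"label key and value greater than maximum size "
                f"({MAX_SIZE} bytes), key: {key[:10]}")
    return None


def layers_label_size(layer_count):
    """Size of the image-layers label for an image with layer_count layers."""
    value_len = layer_count * DIGEST_LEN + (layer_count - 1)  # digests + commas
    return len(LAYERS_KEY) + value_len


def max_layers():
    """Largest layer count whose image-layers label still fits the limit."""
    n = 1
    while layers_label_size(n + 1) <= MAX_SIZE:
        n += 1
    return n


print(layers_label_size(61))  # 4430, the len reported in the log for this label
print(max_layers())           # 56

value = ",".join(["sha256:" + "0" * 64] * 61)
print(validate_label(LAYERS_KEY, value))
```

Under these assumptions the label overflows at 57 layers, which is in line with the "about 60 RUN lines" observation once the base image's own layers are counted.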

brandond commented 3 years ago

As per https://github.com/containerd/containerd/issues/4684#issuecomment-720696349 these labels are in support of an experimental feature and should not have been enabled by default; the next release of containerd will turn them off. In the meantime we can probably update our containerd config.toml template to set disable_snapshot_annotations = true in the plugins.cri.containerd section, which is effectively what upstream will do.

ShylajaDevadiga commented 3 years ago

Reproduced the issue in v1.19.3+k3s2 and validated the fix using commit ID c72c1867, with an image created from the example above.

k3s -v
k3s version v1.19.3+k3s2 (f8a4547b)

sudo crictl pull shylajarancher19/imagewith60layers
FATA[2020-11-09T22:21:09.253900853Z] pulling image: rpc error: code = InvalidArgument desc = failed to pull and unpack image "docker.io/shylajarancher19/imagewith60layers:latest": failed to prepare extraction snapshot "extract-251938186-Su43 sha256:78fd1be2ec15dc4d991a55bd6d42b7e815e0a43e43ed7c255edc0695cdbd5f77": info.Labels: label key and value greater than maximum size (4096 bytes), key: containerd: invalid argument

k3s -v
k3s version v1.19.3+k3s-c72c1867 (c72c1867)

sudo crictl pull shylajarancher19/imagewith60layers
Image is up to date for sha256:0903344922cc4dc6253c31367cfda9c9f77e8a43d3fba836cae8378b4bd2d8d9
ShylajaDevadiga commented 3 years ago

While the issue is fixed, the location of config.toml has moved from /var/lib/rancher/k3s/agent/etc/containerd/config.toml to /var/lib/rancher/k3s/etc/containerd/config.toml which is unexpected.

sudo cat /var/lib/rancher/k3s/etc/containerd/config.toml

[plugins.opt]
  path = "/var/lib/rancher/k3s/containerd"

[plugins.cri]
  stream_server_address = "127.0.0.1"
  stream_server_port = "10010"
  enable_selinux = false
  sandbox_image = "docker.io/rancher/pause:3.1"

[plugins.cri.containerd]
  disable_snapshot_annotations = true
  snapshotter = "overlayfs"
davidnuzik commented 3 years ago

@ShylajaDevadiga create a separate issue for that.