confidential-containers / guest-components

Confidential Containers Guest Tools and Components
Apache License 2.0

A timeout error when pulling an encrypted image in image-rs 0.6.0 #535

Open ZhangLimengLimeng opened 6 months ago

ZhangLimengLimeng commented 6 months ago
Apr 08 13:52:26 ra kata[17317]: time="2024-04-08T13:52:26.956783594+08:00" level=warning msg="qemu-system-x86_64: KVM_GET_DEVICE_ATTR(0, KVM_X86_XCOMP_GUEST_SUPP) error: -22" name=containerd-shim-v2 pid=17317 qemuPid=17331 sandbox=792b0da6355f60a8f18002a3cf03e828113821fec10a83da8fbae6d7a10075ec source=virtcontainers/hypervisor subsystem=qemu
Apr 08 13:52:30 ra kata[17317]: time="2024-04-08T13:52:30.269987287+08:00" level=warning msg="qemu-system-x86_64: 9p: degraded performance: a reasonable high msize should be chosen on client/guest side (chosen msize is <= 8192). See https://wiki.qemu.org/Documentation/9psetup#msize for details." name=containerd-shim-v2 pid=17317 qemuPid=17331 sandbox=792b0da6355f60a8f18002a3cf03e828113821fec10a83da8fbae6d7a10075ec source=virtcontainers/hypervisor subsystem=qemu
Apr 08 13:57:30 ra kata[17317]: time="2024-04-08T13:57:30.481887902+08:00" level=error msg="agent pull image err. context deadline exceeded" name=containerd-shim-v2 pid=17317 sandbox=792b0da6355f60a8f18002a3cf03e828113821fec10a83da8fbae6d7a10075ec source=virtcontainers subsystem=kata_agent
Apr 08 13:57:30 ra kata[17317]: time="2024-04-08T13:57:30.481950905+08:00" level=error msg="kata runtime PullImage err. context deadline exceeded" name=containerd-shim-v2 pid=17317 sandbox=792b0da6355f60a8f18002a3cf03e828113821fec10a83da8fbae6d7a10075ec source=containerd-kata-shim-v2

Because of constraints in my environment, I can only deploy CC v0.5.0 or earlier, and I'm running into a timeout. I built an encrypted image of about 4GB and pushed it to the hub. My internet connection is slow, so pulling the image takes a long time, and the pull fails with a timeout error. The logs show that it times out after about 5 minutes. I've already updated image-rs to version 0.6.0 (which fixed an issue with pulling large images in update #146), and I've also tried setting runtimeRequestTimeout in the k8s configuration file, but neither had any effect.
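
As a rough sanity check on these figures, the sketch below measures the gap between the last warning (13:52:30) and the first "context deadline exceeded" error (13:57:30) in the logs, then computes the average throughput a 4GB pull would need to finish inside a 5 minute window. Treating the 9p warning as the start of the pull, reading "about 4GB" as 4 GiB, and the helper name are all assumptions made for illustration.

```python
from datetime import datetime

# Timestamps copied from the shim logs, trimmed to microsecond precision.
START = "2024-04-08T13:52:30.269987+08:00"   # last warning before the pull stalls (assumed pull start)
FAILED = "2024-04-08T13:57:30.481887+08:00"  # first "context deadline exceeded" error

IMAGE_BYTES = 4 * 1024**3  # assumption: "about 4GB" taken as 4 GiB


def required_throughput_mib_s(image_bytes: int, window_s: float) -> float:
    """Average MiB/s needed to move image_bytes within window_s seconds."""
    return image_bytes / window_s / 1024**2


elapsed = (datetime.fromisoformat(FAILED) - datetime.fromisoformat(START)).total_seconds()
print(f"elapsed before timeout: {elapsed:.1f} s")
print(f"needed for a 300 s window: {required_throughput_mib_s(IMAGE_BYTES, 300):.2f} MiB/s")
```

The elapsed time comes out at almost exactly 300 seconds, which matches a fixed 5 minute deadline rather than a network-level failure. Under these assumptions, any link slower than roughly 13.7 MiB/s cannot finish the pull before that deadline.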

Where can I modify the 5 minutes timeout for pulling encrypted images?

Xynnn007 commented 6 months ago

What platform are you using? If the runtime class is tdx-qemu, we can set

# Image request timeout in seconds.
# If specified, indicates the image request timeout in the guest needed for the workload(s)
# If unspecified then it will be set 60 second(s)
# to reduce image pull failures caused by network problems and quickly obtain request failure information at the same time.
image_request_timeout = 60

in /opt/confidential-containers/share/defaults/kata-containers/configuration-qemu-tdx.toml

ZhangLimengLimeng commented 6 months ago

root@ra:~# kubectl get runtimeclasses
NAME                HANDLER             AGE
kata                kata                28d
kata-qemu           kata-qemu           28d
kata-qemu-csv       kata-qemu-csv       28d
kata-qemu-csv-dcu   kata-qemu-csv-dcu   28d

Thank you for the help. My runtime class is Hygon CSV, CC is v0.5.0, and image-rs is v0.6.0. I couldn't find the configuration option "image_request_timeout" in "/opt/confidential-containers/share/defaults/kata-containers/configuration-qemu-csv.toml".
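
One way to check every shipped configuration file at once, rather than opening them one by one, is to scan the directory for the option name. This is a minimal sketch: the function name is hypothetical, and the directory is the one quoted in this thread.

```python
from pathlib import Path
from typing import Dict, List


def find_option(config_dir: str, option: str) -> Dict[str, List[int]]:
    """Map each *.toml file in config_dir to the line numbers that mention option."""
    hits: Dict[str, List[int]] = {}
    for path in sorted(Path(config_dir).glob("*.toml")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        matched = [n for n, line in enumerate(lines, start=1) if option in line]
        if matched:
            hits[path.name] = matched
    return hits


if __name__ == "__main__":
    config_dir = "/opt/confidential-containers/share/defaults/kata-containers"
    # An empty result means no config file in this directory mentions the option.
    print(find_option(config_dir, "image_request_timeout"))
```

A hit only shows that the option name appears in a file, including in comments. It does not show that the runtime actually honors it.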

Xynnn007 commented 6 months ago

@ZhangLimengLimeng I am afraid that you need to ask Hygon CSV folks to offer help.

ZhangLimengLimeng commented 6 months ago

The TDX configuration in CC v0.5.0 (https://github.com/kata-containers/kata-containers/blob/CC-0.5.0/src/runtime/config/configuration-qemu-tdx.toml.in) doesn't contain "image_request_timeout" either; this option was only added in CC-0.8.0.

Is this timeout defined in the source code of kata-agent, attestation-agent, image-rs, or somewhere else? I'd like to change it by modifying the source code. Do you have any suggestions?

enuoCM commented 6 months ago

CC-0.8.0 does have the "image_request_timeout" config option, but the file says: "WARNING: All the options in the following section have not been implemented yet. This section was added as a placeholder. DO NOT USE IT!"