k0sproject / k0s

k0s - The Zero Friction Kubernetes
https://docs.k0sproject.io

Panic when trying to download Helm chart from OCI registry without a version #5156

Open chattytak opened 6 days ago

chattytak commented 6 days ago


Platform

Linux 5.14.0-362.24.2.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Mar 30 14:11:54 EDT 2024 x86_64 GNU/Linux
NAME="AlmaLinux"
VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.3 (Shamrock Pampas Cat)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"

ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"

Version

v1.31.1+k0s.1

Sysinfo

`k0s sysinfo`
Total memory: 3.7 GiB (pass)
File system of /var/lib: xfs (pass)
Disk space available for /var/lib/k0s: 26.7 GiB (pass)
Relative disk space available for /var/lib/k0s: 88% (pass)
Name resolution: localhost: [::1 127.0.0.1] (pass)
Operating system: Linux (pass)
  Linux kernel release: 5.14.0-362.24.2.el9_3.x86_64 (pass)
  Max. file descriptors per process: current: 524288 / max: 524288 (pass)
  AppArmor: unavailable (pass)
  Executable in PATH: modprobe: /usr/sbin/modprobe (pass)
  Executable in PATH: mount: /usr/bin/mount (pass)
  Executable in PATH: umount: /usr/bin/umount (pass)
  /proc file system: mounted (0x9fa0) (pass)
  Control Groups: version 2 (pass)
    cgroup controller "cpu": available (is a listed root controller) (pass)
    cgroup controller "cpuacct": available (via cpu in version 2) (pass)
    cgroup controller "cpuset": available (is a listed root controller) (pass)
    cgroup controller "memory": available (is a listed root controller) (pass)
    cgroup controller "devices": available (device filters attachable) (pass)
    cgroup controller "freezer": available (cgroup.freeze exists) (pass)
    cgroup controller "pids": available (is a listed root controller) (pass)
    cgroup controller "hugetlb": available (is a listed root controller) (pass)
    cgroup controller "blkio": available (via io in version 2) (pass)
  CONFIG_CGROUPS: Control Group support: built-in (pass)
    CONFIG_CGROUP_FREEZER: Freezer cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_PIDS: PIDs cgroup subsystem: built-in (pass)
    CONFIG_CGROUP_DEVICE: Device controller for cgroups: built-in (pass)
    CONFIG_CPUSETS: Cpuset support: built-in (pass)
    CONFIG_CGROUP_CPUACCT: Simple CPU accounting cgroup subsystem: built-in (pass)
    CONFIG_MEMCG: Memory Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_HUGETLB: HugeTLB Resource Controller for Control Groups: built-in (pass)
    CONFIG_CGROUP_SCHED: Group CPU scheduler: built-in (pass)
      CONFIG_FAIR_GROUP_SCHED: Group scheduling for SCHED_OTHER: built-in (pass)
        CONFIG_CFS_BANDWIDTH: CPU bandwidth provisioning for FAIR_GROUP_SCHED: built-in (pass)
    CONFIG_BLK_CGROUP: Block IO controller: built-in (pass)
  CONFIG_NAMESPACES: Namespaces support: built-in (pass)
    CONFIG_UTS_NS: UTS namespace: built-in (pass)
    CONFIG_IPC_NS: IPC namespace: built-in (pass)
    CONFIG_PID_NS: PID namespace: built-in (pass)
    CONFIG_NET_NS: Network namespace: built-in (pass)
  CONFIG_NET: Networking support: built-in (pass)
    CONFIG_INET: TCP/IP networking: built-in (pass)
      CONFIG_IPV6: The IPv6 protocol: built-in (pass)
    CONFIG_NETFILTER: Network packet filtering framework (Netfilter): built-in (pass)
      CONFIG_NETFILTER_ADVANCED: Advanced netfilter configuration: built-in (pass)
      CONFIG_NF_CONNTRACK: Netfilter connection tracking support: module (pass)
      CONFIG_NETFILTER_XTABLES: Netfilter Xtables support: built-in (pass)
        CONFIG_NETFILTER_XT_TARGET_REDIRECT: REDIRECT target support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_COMMENT: "comment" match support: module (pass)
        CONFIG_NETFILTER_XT_MARK: nfmark target and match support: module (pass)
        CONFIG_NETFILTER_XT_SET: set target and match support: module (pass)
        CONFIG_NETFILTER_XT_TARGET_MASQUERADE: MASQUERADE target support: module (pass)
        CONFIG_NETFILTER_XT_NAT: "SNAT and DNAT" targets support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_ADDRTYPE: "addrtype" address type match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_CONNTRACK: "conntrack" connection tracking match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_MULTIPORT: "multiport" Multiple port match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_RECENT: "recent" match support: module (pass)
        CONFIG_NETFILTER_XT_MATCH_STATISTIC: "statistic" match support: module (pass)
      CONFIG_NETFILTER_NETLINK: module (pass)
      CONFIG_NF_NAT: module (pass)
      CONFIG_IP_SET: IP set support: module (pass)
        CONFIG_IP_SET_HASH_IP: hash:ip set support: module (pass)
        CONFIG_IP_SET_HASH_NET: hash:net set support: module (pass)
      CONFIG_IP_VS: IP virtual server support: module (pass)
        CONFIG_IP_VS_NFCT: Netfilter connection tracking: built-in (pass)
        CONFIG_IP_VS_SH: Source hashing scheduling: module (pass)
        CONFIG_IP_VS_RR: Round-robin scheduling: module (pass)
        CONFIG_IP_VS_WRR: Weighted round-robin scheduling: module (pass)
      CONFIG_NF_CONNTRACK_IPV4: IPv4 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_REJECT_IPV4: IPv4 packet rejection: module (pass)
      CONFIG_NF_NAT_IPV4: IPv4 NAT: unknown (warning)
      CONFIG_IP_NF_IPTABLES: IP tables support: module (pass)
        CONFIG_IP_NF_FILTER: Packet filtering: module (pass)
          CONFIG_IP_NF_TARGET_REJECT: REJECT target support: module (pass)
        CONFIG_IP_NF_NAT: iptables NAT support: module (pass)
        CONFIG_IP_NF_MANGLE: Packet mangling: module (pass)
      CONFIG_NF_DEFRAG_IPV4: module (pass)
      CONFIG_NF_CONNTRACK_IPV6: IPv6 connetion tracking support (required for NAT): unknown (warning)
      CONFIG_NF_NAT_IPV6: IPv6 NAT: unknown (warning)
      CONFIG_IP6_NF_IPTABLES: IP6 tables support: module (pass)
        CONFIG_IP6_NF_FILTER: Packet filtering: module (pass)
        CONFIG_IP6_NF_MANGLE: Packet mangling: module (pass)
        CONFIG_IP6_NF_NAT: ip6tables NAT support: module (pass)
      CONFIG_NF_DEFRAG_IPV6: module (pass)
    CONFIG_BRIDGE: 802.1d Ethernet Bridging: module (pass)
      CONFIG_LLC: module (pass)
      CONFIG_STP: module (pass)
  CONFIG_EXT4_FS: The Extended 4 (ext4) filesystem: module (pass)
  CONFIG_PROC_FS: /proc file system support: built-in (pass)

What happened?

I tried to install a Helm chart from an OCI registry using the k0s Helm extension, and k0s panicked.

...snip...
  extensions:
    helm:
      charts:
        - name: flux-operator
          chartname: oci://ghcr.io/controlplaneio-fluxcd/charts/flux-operator
          namespace: flux-system
          order: 3
          values: ""

Steps to reproduce

1. 2. 3.

Expected behavior

No response

Actual behavior

No response

Screenshots and logs

k0s[184423]: time="2024-10-25 10:45:34" level=error msg="Observed a panic" Chart="{k0s-addon-chart-flux-operator kube-system}" component=extensions_controller controller=chart controllerGroup=helm.k0sproject.io controllerKind=Chart error="<nil>" name=k0s-addon-chart-flux-operator namespace=kube-system panic="runtime error: invalid memory address or nil pointer dereference" panicGoValue="\"invalid memory address or nil pointer dereference\"" reconcileID="\"5af43e85-83a9-4985-8605-6c02f25acded\"" stacktrace="goroutine 2618 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x4458b70, 0xc002e93410}, {0x35d7e40, 0x658caf0})
        k8s.io/apimachinery@v0.31.1/pkg/util/runtime/runtime.go:107 +0xbc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile.func1()
        sigs.k8s.io/controller-runtime@v0.19.0/pkg/internal/controller/controller.go:105 +0x112
panic({0x35d7e40?, 0x658caf0?})
        runtime/panic.go:770 +0x132
helm.sh/helm/v3/pkg/registry.(*Client).Tags(0x0, {0xc001992246?, 0xc003bbed58?})
        helm.sh/helm/v3@v3.16.1/pkg/registry/client.go:671 +0x10f
helm.sh/helm/v3/pkg/downloader.(*ChartDownloader).getOciURI(0xc003bbf188, {0xc001992240, 0x38}, {0x0, 0x0}, 0xc002137dd0)
        helm.sh/helm/v3@v3.16.1/pkg/downloader/chart_downloader.go:154 +0x128
helm.sh/helm/v3/pkg/downloader.(*ChartDownloader).ResolveChartVersion(0xc003bbf188, {0xc001992240, 0x38}, {0x0, 0x0})
        helm.sh/helm/v3@v3.16.1/pkg/downloader/chart_downloader.go:199 +0xfdb
helm.sh/helm/v3/pkg/downloader.(*ChartDownloader).DownloadTo(0xc003bbf188, {0xc001992240, 0x38}, {0x0?, 0xc003bbf148?}, {0xc000e8b9c0, 0x1b})
        helm.sh/helm/v3@v3.16.1/pkg/downloader/chart_downloader.go:90 +0x4f
github.com/k0sproject/k0s/pkg/helm.(*Commands).locateChart(0xc000918c00, {0xc001992240?, 0xb?}, {0x0, 0x0})
        github.com/k0sproject/k0s/pkg/helm/helm.go:198 +0x2be
github.com/k0sproject/k0s/pkg/helm.(*Commands).InstallChart(0xc000918c00, {0x4458b70, 0xc002e93410}, {0xc001992240, 0x38}, {0x0, 0x0}, {0xc000aa3980, 0xd}, {0xc000aa3970, ...}, ...)
        github.com/k0sproject/k0s/pkg/helm/helm.go:228 +0x171
github.com/k0sproject/k0s/pkg/component/controller.(*ChartReconciler).updateOrInstallChart(_, {_, _}, {{{0x31b6b0b, 0x5}, {0xc0020b2ce0, 0x1a}}, {{0xc0016468c0, 0x1d}, {0x0, ...}, ...}, ...})
        github.com/k0sproject/k0s/pkg/component/controller/extensions_controller.go:288 +0x3d7
github.com/k0sproject/k0s/pkg/component/controller.(*ChartReconciler).Reconcile(0xc0016761b0, {0x4458b70, 0xc002e93410}, {{{0xc000aa3960, 0xb}, {0xc0016468c0, 0x1d}}})
        github.com/k0sproject/k0s/pkg/component/controller/extensions_controller.go:224 +0x350
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile(0xc000b9d840?, {0x4458b70?, 0xc002e93410?}, {{{0xc000aa3960?, 0x0?}, {0xc0016468c0?, 0x0?}}})
        sigs.k8s.io/controller-runtime@v0.19.0/pkg/internal/controller/controller.go:116 +0xd4
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler(0x4489440, {0x4458ba8, 0xc0008f6dc0}, {{{0xc000aa3960, 0xb}, {0xc0016468c0, 0x1d}}})
        sigs.k8s.io/controller-runtime@v0.19.0/pkg/internal/controller/controller.go:303 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem(0x4489440, {0x4458ba8, 0xc0008f6dc0})
        sigs.k8s.io/controller-runtime@v0.19.0/pkg/internal/controller/controller.go:263 +0x21d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2()
        sigs.k8s.io/controller-runtime@v0.19.0/pkg/internal/controller/controller.go:224 +0x8a
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2 in goroutine 2579
        sigs.k8s.io/controller-runtime@v0.19.0/pkg/internal/controller/controller.go:220 +0x490

Additional context

No response

jnummelin commented 1 day ago

Looking at the stack trace, it looks like Helm is choking while trying to figure out the chart version. Could you try adding a version field to the config?

chattytak commented 21 hours ago

Thanks for looking into this. As you suggested, I added the version field and the chart installed correctly. I am not familiar with OCI, but the normal helm command installs the chart correctly without a version being specified. Is this a bug that k0s has?

jnummelin commented 5 hours ago

Is this a bug that k0s has?

That I'm not 100% sure yet. 😄

k0s uses Helm as a library, so there have been some discrepancies in the past between its CLI usage and how it behaves as a library.

What I'm currently leaning towards is that when you use Helm via the CLI and do NOT specify --version, it'll default to latest (at least when using OCI). That seems to happen as a separate step before the actual install, so k0s never performs it.

However, I must say that using latest is a bad idea in this case, just as it is for container images. You can find a lot of discussion on the downsides, and regrets, of using latest with images. IMO most of it applies to charts too.

twz123 commented 5 hours ago

As Jussi mentioned, the panic happens because of the missing version, but that's only part of it. In fact, I suspect the panic would happen for any version that isn't a valid semver. IIUC the course of events is as follows:

  1. K0s attempts to download a chart hosted on an OCI registry via Helm.
  2. K0s configures Helm's chart downloader so that it cannot communicate with OCI registries, i.e. the chart downloader's OCI registry client is nil.
  3. The chart version is not a semver. It's empty in this case, but I suspect the panic would happen for anything that's not a semver (including "latest").
  4. Helm tries to load all tags of the chart from the OCI registry to find the right tag for the given version by doing some clever semver matching stuff.
  5. Oops, the registry client is nil ...
  6. 💥

This only happens if the charts are downloaded via OCI, and only if the chart version is not a valid semver.
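The failure mode in steps 2 to 6 can be reproduced in isolation. The following is a minimal sketch, not Helm or k0s code: `registryClient`, `chartDownloader`, and `listTags` are hypothetical stand-ins, and the only assumption carried over from the stack trace is that `Tags` is invoked on a nil client and then reads through the receiver.

```go
package main

import "fmt"

// registryClient is a hypothetical stand-in for an OCI registry client.
type registryClient struct {
	tags map[string][]string
}

// Tags reads a field through the receiver. Calling a method on a nil
// pointer is legal in Go; the panic happens at the field access.
func (c *registryClient) Tags(ref string) ([]string, error) {
	return c.tags[ref], nil
}

// chartDownloader is a hypothetical stand-in for a chart downloader
// whose registry client was never wired up.
type chartDownloader struct {
	RegistryClient *registryClient
}

// listTags calls Tags and turns a panic into an ordinary error so the
// outcome can be inspected instead of crashing the program.
func (d *chartDownloader) listTags(ref string) (tags []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return d.RegistryClient.Tags(ref)
}

func main() {
	d := &chartDownloader{} // RegistryClient is left nil on purpose
	_, err := d.listTags("ghcr.io/controlplaneio-fluxcd/charts/flux-operator")
	fmt.Println(err)
}
```

With a non-nil `RegistryClient` the same call returns the tag list normally, which is why the panic only shows up on the code path that needs to list tags.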

So this is definitely a bug on k0s' side. The question is how to fix it:

  1. K0s could wire the registry client for the chart downloader. This opens up a whole lot of ambiguity about the actual chart tag that gets installed.
  2. K0s could try to enforce a semver for OCI charts and refuse to download them otherwise.

Either way, a panic is obviously not a good outcome.
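To make option 2 concrete, here is a rough sketch of what an up-front check could look like. It is an illustration under assumptions, not the k0s fix: `validateChartVersion` is a hypothetical helper, the regular expression is a deliberately strict approximation of semver, and Helm's own version parsing may accept forms this pattern rejects.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// strictSemver approximates MAJOR.MINOR.PATCH with optional pre-release
// and build metadata. This is a simplification chosen for the sketch.
var strictSemver = regexp.MustCompile(`^v?\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)

// validateChartVersion (hypothetical) rejects OCI charts whose version is
// empty or not a semver, and leaves all other charts alone.
func validateChartVersion(chartName, version string) error {
	if !strings.HasPrefix(chartName, "oci://") {
		return nil // only OCI charts are affected
	}
	if !strictSemver.MatchString(version) {
		return fmt.Errorf("chart %q: OCI charts require a semver version, got %q", chartName, version)
	}
	return nil
}

func main() {
	chart := "oci://ghcr.io/controlplaneio-fluxcd/charts/flux-operator"
	// "1.2.3" is a placeholder version, not a real release of this chart.
	for _, v := range []string{"", "latest", "1.2.3"} {
		fmt.Printf("version %q -> %v\n", v, validateChartVersion(chart, v))
	}
}
```

A check like this would turn the panic into a regular, reportable error before any download is attempted.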

jnummelin commented 4 hours ago

Heh, I based my assumptions on the Helm CLI --version help text:

If this is not specified, the latest version is used

So in this case the latest version is determined by looking at all the tags and, as you said @twz123, doing some "clever semver stuff" 😄
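For readers wondering what picking the latest version from a list of tags might involve, here is a simplified sketch. It is not Helm's implementation: `parseVersion`, `less`, and `latestTag` are hypothetical helpers, only plain `MAJOR.MINOR.PATCH` tags (with an optional `v` prefix) are understood, and pre-release ordering is ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "MAJOR.MINOR.PATCH" with an optional "v" prefix.
// Anything else (for example "latest") is reported as not a version.
func parseVersion(tag string) ([3]int, bool) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// less compares two versions numerically, component by component.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// latestTag returns the tag with the highest version, skipping tags
// that do not parse. The boolean is false if no tag qualifies.
func latestTag(tags []string) (string, bool) {
	var best [3]int
	bestTag, found := "", false
	for _, t := range tags {
		v, ok := parseVersion(t)
		if !ok {
			continue
		}
		if !found || less(best, v) {
			best, bestTag, found = v, t, true
		}
	}
	return bestTag, found
}

func main() {
	// Made-up tags for illustration only.
	tag, ok := latestTag([]string{"0.9.0", "latest", "0.10.1", "0.10.0"})
	fmt.Println(tag, ok)
}
```

Note that the comparison is numeric, so 0.10.1 ranks above 0.9.0 even though it sorts lower as a string. The sketch also shows why the result depends on whatever tags happen to be in the registry at install time, which is the ambiguity raised earlier in the thread.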