siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

[Imager] Segmentation Violation when creating new images #8987

Open preeefix opened 1 month ago

preeefix commented 1 month ago

Bug Report

In the process of bringing up a new board on Talos (LibreTech ROC-RK3328-CC), manually building the overlay causes a segmentation violation during imager builds.

Description

A friend of mine has some LibreTech ROC-RK3328-CC boards that he'd like to include into his existing Talos cluster and asked if I could build an image for them.

I started out by using the OrangePi as a base in the siderolabs/sbc-rockchip repository. I managed to get the image to build (available as ghcr.io/preeefix/sbc-rockchip), but when using the included profile.yaml, all I got were segmentation violations.

I followed this up by creating a clean overlay via siderolabs/sbc-template, with only the roc-cc-rk3328 build set up (just u-boot). It again compiles and saves the container, but imager still fails with a segmentation violation.

It's likely that I'm missing something, but after searching around I'm unable to determine what's going wrong.

profile.yaml

arch: arm64
platform: metal
secureboot: false
version: v1.7.5
input:
  kernel:
    path: /usr/install/arm64/vmlinuz
  initramfs:
    path: /usr/install/arm64/vmlinuz
  baseInstaller:
    imageRef: ghcr.io/siderolabs/installer:v1.7.5
overlay:
  name: roc-cc-rk3328
  image:
    imageRef: ghcr.io/preeefix/sbc-roc-cc-rk3328:8574bb9-dirty@sha256:2d9b606d4289b95225013d05a512409fe5542d9ce139b07099ccd3eb0a24c6de
output:
  kind: image
  outFormat: .xz

Logs

cat in/profile.yaml | docker run --rm -i -v ./out:/out -v ./in:/in ghcr.io/siderolabs/imager:v1.7.5 -
assembling the finalized profile...
pulling overlay...
    pulling ghcr.io/preeefix/sbc-roc-cc-rk3328:8574bb9-dirty@sha256:2d9b606d4289b95225013d05a512409fe5542d9ce139b07099ccd3eb0a24c6de...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x174ff9b]

goroutine 1 [running]:
github.com/siderolabs/talos/pkg/imager/profile.(*Profile).Validate(0x620fb8?)
        /src/pkg/imager/profile/profile.go:101 +0x17b
github.com/siderolabs/talos/pkg/imager.(*Imager).handleProf(0xc00015a6c8)
        /src/pkg/imager/imager.go:248 +0x37b
github.com/siderolabs/talos/pkg/imager.(*Imager).Execute(0xc00015a6c8, {0x1f1c430, 0xc00022c780}, {0x1c1e8af, 0x4}, 0xc000366180)
        /src/pkg/imager/imager.go:77 +0x1e8
github.com/siderolabs/talos/cmd/installer/cmd/imager.init.func1.1({0x1f1c430, 0xc00022c780})
        /src/cmd/installer/cmd/imager/root.go:187 +0xbbe
github.com/siderolabs/talos/pkg/cli.WithContext({0x1f1c370?, 0x2f6c7c0?}, 0xc00051bc88)
        /src/pkg/cli/context.go:40 +0x1a5
github.com/siderolabs/talos/cmd/installer/cmd/imager.init.func1(0xc000262400?, {0xc000365e50?, 0x4?, 0x1c1e9a3?})
        /src/cmd/installer/cmd/imager/root.go:58 +0x52
github.com/spf13/cobra.(*Command).execute(0x2edd920, {0xc0001b0010, 0x1, 0x1})
        /.cache/mod/github.com/spf13/cobra@v1.8.0/command.go:983 +0xaca
github.com/spf13/cobra.(*Command).ExecuteC(0x2edd920)
        /.cache/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
        /.cache/mod/github.com/spf13/cobra@v1.8.0/command.go:1039
github.com/siderolabs/talos/cmd/installer/cmd/imager.Execute()
        /src/cmd/installer/cmd/imager/root.go:208 +0x1a
main.main()
        /src/cmd/installer/main.go:19 +0x47

Environment

smira commented 1 month ago

In this particular case, you're generating a disk image, but disk image options are not specified:

output:
  kind: image
  imageOptions:
    diskSize: 8589934592
    diskFormat: raw
  outFormat: .xz

preeefix commented 1 month ago

Sometimes rereading the docs leads to a better result. So does not filing GH issues when I'm exhausted.

Ended up getting it to build correctly with the following profile.yaml.

arch: arm64
platform: metal
secureboot: false
version: v1.7.5
overlay:
  name: libretech-roc-rk3328-cc
  image:
    imageRef: ghcr.io/preeefix/sbc-rockchip:0c603d0-dirty
output:
  kind: image
  imageOptions:
    diskSize: 1306525696
    diskFormat: raw
  outFormat: .xz

That does raise the question of why imager doesn't fail cleanly at https://github.com/siderolabs/talos/blob/fbde9c556f0107734ff1216ea80d9156c35d4e3c/pkg/imager/profile/profile.go#L103 as part of the validation. Most likely it's because the entire imageOptions block doesn't exist.
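For reference, the two diskSize values in this thread are easier to compare in binary units. The following is a small sketch, assuming diskSize is expressed in bytes; the helper name toMiB is hypothetical and not part of Talos.

```go
package main

import "fmt"

const mib = 1 << 20 // bytes in one MiB

// toMiB is a hypothetical helper: it converts a byte count to whole MiB
// and reports whether the conversion was exact.
func toMiB(bytes uint64) (uint64, bool) {
	return bytes / mib, bytes%mib == 0
}

func main() {
	// The diskSize values from the two profiles in this thread.
	for _, size := range []uint64{8589934592, 1306525696} {
		n, exact := toMiB(size)
		fmt.Printf("%d bytes = %d MiB (exact: %t)\n", size, n, exact)
	}
}
```

Under that assumption, 8589934592 works out to 8192 MiB (8 GiB) and 1306525696 to 1246 MiB.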

smira commented 1 month ago

yep, it's actually a bug that it crashes, as imageOptions is nil in this case
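
The kind of guard that would turn this crash into a validation error can be sketched as follows. This is only an illustration: the Output and ImageOptions types and the validateOutput function are hypothetical stand-ins, not the actual Talos definitions or the fix that was applied upstream.

```go
package main

import (
	"errors"
	"fmt"
)

// ImageOptions and Output are hypothetical stand-ins for the profile types;
// they are not the actual Talos definitions.
type ImageOptions struct {
	DiskSize   int64
	DiskFormat string
}

type Output struct {
	Kind         string
	ImageOptions *ImageOptions // nil when the imageOptions block is absent
	OutFormat    string
}

// validateOutput returns an error instead of dereferencing a nil pointer.
func validateOutput(o Output) error {
	if o.Kind != "image" {
		return nil // other output kinds are out of scope for this sketch
	}

	if o.ImageOptions == nil {
		return errors.New("output.imageOptions is required when output.kind is image")
	}

	if o.ImageOptions.DiskSize <= 0 {
		return errors.New("output.imageOptions.diskSize must be positive")
	}

	return nil
}

func main() {
	// Mirrors the original profile.yaml: kind is image, imageOptions is missing.
	broken := Output{Kind: "image", OutFormat: ".xz"}
	fmt.Println(validateOutput(broken))

	// Mirrors the working profile.yaml.
	fixed := Output{
		Kind:         "image",
		ImageOptions: &ImageOptions{DiskSize: 1306525696, DiskFormat: "raw"},
		OutFormat:    ".xz",
	}
	fmt.Println(validateOutput(fixed))
}
```

With a check like this, the first profile would be rejected with a readable message rather than a SIGSEGV, and the second would pass.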