zarf-dev / zarf

DevSecOps for Air Gap & Limited-Connection Systems. https://zarf.dev/
Apache License 2.0

Segfault on repeated `zarf destroy --confirm` #2978

Closed JoeHCQ1 closed 1 month ago

JoeHCQ1 commented 1 month ago

Environment

Device and OS: GitLab CI Runner, Ubuntu
App version: 0.38.3

For more info, see this repo at this commit: https://github.com/defenseunicorns/uds-capability-rook-ceph/tree/0317b648f2df6c3a077eff85d0e351e1296fea70

And this run of the pipeline: https://github.com/defenseunicorns/uds-capability-rook-ceph/actions/runs/10744873812/job/29802734565

Steps to reproduce

  1. Run `zarf destroy --confirm` when Zarf is in the cluster.
  2. Run it again.

If I've interpreted the data correctly, the segfault was caused by running `zarf destroy` when Zarf wasn't in the cluster (note that the upgrade test failed, so Zarf was never re-installed). It is also possible that this is actually a problem in the UDS CLI, which calls Zarf.

Expected result

Errors may occur, but they should be reported clearly, never as a segfault.
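To illustrate the behavior I'd expect, here is a minimal sketch of a guard that returns an error instead of dereferencing a missing state. It is not Zarf's actual code: `State`, `loadState`, and `destroy` are hypothetical names, and I'm only assuming that the panic comes from using a nil state after the "Failed to load the Zarf State" warning.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a hypothetical stand-in for the Zarf state stored in the cluster.
type State struct {
	Distro string
}

// loadState is a hypothetical loader. It returns (nil, error) when no state
// exists, for example because Zarf was already removed from the cluster.
func loadState(present bool) (*State, error) {
	if !present {
		return nil, errors.New("zarf state not found in cluster")
	}
	return &State{Distro: "k3s"}, nil
}

// destroy stops with an error when the state cannot be loaded, rather than
// logging a warning and then dereferencing a nil pointer.
func destroy(present bool) error {
	state, err := loadState(present)
	if err != nil {
		return fmt.Errorf("unable to load the Zarf state: %w", err)
	}
	if state == nil {
		return errors.New("zarf state is empty, nothing to destroy")
	}
	fmt.Printf("destroying Zarf on distro %q\n", state.Distro)
	return nil
}

func main() {
	// First call simulates Zarf being present, second simulates the repeat run.
	for _, present := range []bool{true, false} {
		if err := destroy(present); err != nil {
			fmt.Println("ERROR:", err)
		}
	}
}
```

With a check like this, the second `zarf destroy --confirm` would fail with a readable error message rather than a SIGSEGV.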

Actual Result

Encountered segfault.

Visual Proof (screenshots, videos, text, etc)

Run uds run remove-zarf-package

  •  Saving log file to /tmp/maru-2024-09-06-21-01-48-2822419660.log

  •  Running "Patch cleanupPolicy to really destroy all the data"
cephcluster.ceph.rook.io/rook-ceph patched (no change)

  ✔  Completed "Patch cleanupPolicy to really destroy all the data"

  •  Running "Remove Zarf from cluster"

 NOTE  Using config file
 /home/runner/work/uds-capability-rook-ceph/uds-capability-rook-ceph/zarf-config.yaml

 NOTE  Saving log file to /tmp/zarf-2024-09-06-21-01-48-426499508.log
  •  Waiting for cluster connection
  •  Waiting for cluster connection

 WARNING  Failed to load the Zarf State from the cluster.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x3c7b385]

goroutine 1 [running]:
github.com/defenseunicorns/zarf/src/cmd.init.func3(0x8481b80?, {0x4902c5a?, 0x4?, 0x4902c5e?})
    /home/runner/go/pkg/mod/github.com/defenseunicorns/zarf@v0.36.1/src/cmd/destroy.go:51 +0xe5
github.com/spf13/cobra.(*Command).execute(0x8481b80, {0xc001ec0380, 0x1, 0x1})
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:985 +0xaca
github.com/spf13/cobra.(*Command).ExecuteC(0x8486ee0)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:1117 +0x3ff
github.com/spf13/cobra.(*Command).ExecuteContextC(...)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:1050
github.com/defenseunicorns/zarf/src/cmd.Execute({0x5a80028?, 0x85532c0?})
    /home/runner/go/pkg/mod/github.com/defenseunicorns/zarf@v0.36.1/src/cmd/root.go:67 +0x65
github.com/defenseunicorns/uds-cli/src/cmd.init.func13(0x847e4e0?, {0x4902c5a?, 0x4?, 0x4902c5e?})
    /home/runner/work/uds-cli/uds-cli/src/cmd/vendored.go:76 +0x86
github.com/spf13/cobra.(*Command).execute(0x847e4e0, {0xc0017950e0, 0x2, 0x2})
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:989 +0xab1
github.com/spf13/cobra.(*Command).ExecuteC(0x847f340)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:1117 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
    /home/runner/go/pkg/mod/github.com/spf13/cobra@v1.8.1/command.go:1041
github.com/defenseunicorns/uds-cli/src/cmd.Execute()
    /home/runner/work/uds-cli/uds-cli/src/cmd/root.go:67 +0x1a
main.main()
    /home/runner/work/uds-cli/uds-cli/main.go:19 +0x47
     ERROR:  Failed to run action: command "Remove Zarf from cluster" timed out after 0 seconds
Error: Process completed with exit code 1.

Severity/Priority

Additional Context

The rook-ceph repo creates a custom Zarf init, not a regular package. This is why Zarf is present in the cluster only when the rook/ceph capability is present.

AustinAbro321 commented 1 month ago

This is a good issue and needs to be fixed, but it is a duplicate of #2700.