cloudbase / garm

GitHub Actions Runner Manager
Apache License 2.0

garm-cli pool create leads to a SIGSEGV: segmentation violation #304

Closed PhilipVinc closed 1 month ago

PhilipVinc commented 1 month ago

Hello,

I'm running a garm instance (version 0.1.4) and it has worked fine so far. However, recently I started getting the following error when trying to create a new pool.

Do you have any suggestions on what might be broken? I've tried restarting garm, but it did not help.

ubuntu@github-actions-runner-manager:~$ sudo pkgx garm-cli pool add --org b3e09917-64b3-4cf2-a3eb-89ee1d2ef415 --provider-name openstack_external --flavor vd.2 --image ubuntu-24.04-server-cloudimg-amd64.img --tags 'virtualdata, virtualdata-vd.2.extradisk'  --max-runners 1 --min-idle-runners 0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x61f312f0314c]

goroutine 1 [running]:
github.com/cloudbase/garm/cmd/garm-cli/cmd.init.func23(0x61f313750e60, {0x61f312f13772?, 0x4?, 0x61f312f13776?})
    /__w/pantry/pantry/builds/github.com__cloudbase__garm-0.1.4/cmd/garm-cli/cmd/pool.go:261 +0x4cc
github.com/spf13/cobra.(*Command).execute(0x61f313750e60, {0xc00072e380, 0xe, 0xe})
    /__w/pantry/pantry/builds/github.com__cloudbase__garm-0.1.4/vendor/github.com/spf13/cobra/command.go:983 +0xaca
github.com/spf13/cobra.(*Command).ExecuteC(0x61f3137536a0)
    /__w/pantry/pantry/builds/github.com__cloudbase__garm-0.1.4/vendor/github.com/spf13/cobra/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
    /__w/pantry/pantry/builds/github.com__cloudbase__garm-0.1.4/vendor/github.com/spf13/cobra/command.go:1039
github.com/cloudbase/garm/cmd/garm-cli/cmd.Execute()
    /__w/pantry/pantry/builds/github.com__cloudbase__garm-0.1.4/cmd/garm-cli/cmd/root.go:57 +0x13b
main.main()
    /__w/pantry/pantry/builds/github.com__cloudbase__garm-0.1.4/cmd/garm-cli/main.go:20 +0xf

gabriel-samfira commented 1 month ago

Would you mind trying out the CLI from the latest stable version? Just the CLI; you don't need to update the server. There was a nil pointer dereference in the CLI that was fixed a while back.
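For readers unfamiliar with this class of panic, here is a minimal sketch of how an optional pointer field can trigger it and how a guard avoids it. The `PoolParams` type and `extraSpecsOrDefault` helper are hypothetical and are not taken from GARM; the thread does not show what the code at pool.go:261 actually did.

```go
package main

import "fmt"

// PoolParams is a hypothetical stand-in for a request struct with an
// optional pointer field. It is NOT GARM's actual type.
type PoolParams struct {
	Image      string
	Flavor     string
	ExtraSpecs *string // nil when the corresponding flag was not supplied
}

// extraSpecsOrDefault returns the extra specs, checking for nil before
// dereferencing instead of reading *p.ExtraSpecs unconditionally.
func extraSpecsOrDefault(p *PoolParams) string {
	if p == nil || p.ExtraSpecs == nil {
		return "{}"
	}
	return *p.ExtraSpecs
}

func main() {
	p := &PoolParams{Image: "ubuntu-24.04-server-cloudimg-amd64.img", Flavor: "vd.2"}
	// Reading *p.ExtraSpecs directly here would panic with
	// "invalid memory address or nil pointer dereference".
	fmt.Println(extraSpecsOrDefault(p))
}
```

An unguarded dereference of a nil pointer produces the same `SIGSEGV ... addr=0x0` signature seen in the trace above, which is why swapping in a newer CLI binary alone can be enough to get past it.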

PhilipVinc commented 1 month ago

Thank you @gabriel-samfira, that indeed solved the problem. The error I get now is:

Error: [POST /organizations/{orgID}/pools][409] CreateOrgPool default {"error":"Conflict","details":"pool with the same image and flavor already exists on this provider"}

This happens because I'm trying to create a pool with the same image/flavor combination as one that already exists, but with different --extra-specs so that it boots from a volume and has much more disk space.

Is there some way to work around this issue?

gabriel-samfira commented 1 month ago

Yup. That limitation was removed in 0.1.5. The reasoning back then was that if you need to define another pool with the same image and flavor, you might as well scale up the pool you already have. But once we took extra specs and runner pools into account, it became clear that the overly opinionated limitation was more of a pain than something desirable.

So in 0.1.5 this was removed. You will have to upgrade GARM itself. If you're willing to do that (it should be safe), please make a backup of the database first, and make sure you read the release notes. A number of settings have been moved to the database, including GitHub credentials. Also, credentials have been split into "github endpoints" (github.com, your own GHES, etc.) and the actual credentials (App, PAT) that are tied to endpoints.

The migration of settings is done automatically when GARM starts for the first time using v0.1.5. But this also means that some configs have been deprecated. They still exist to facilitate the migration to the DB, but will no longer be used once the migration is done.

Feel free to join us on Slack if you would like to upgrade and if you run into any issues.

gabriel-samfira commented 1 month ago

Closing this for now. Feel free to reopen if needed.