siderolabs / omni

SaaS-simple deployment of Kubernetes - on your own hardware.

[bug] talosctl reboot doesn't work on omni nodes #177

Closed rsmitty closed 3 months ago

rsmitty commented 3 months ago

Is there an existing issue for this?

Current Behavior

talosctl throws a permission denied error when issuing:

❯ talosctl -n talos-xxu-ebt reboot
◰ watching nodes: [talos-xxu-ebt]

Expected Behavior

talosctl reboot reboots the node, the same way the reboot action in the UI does.

Steps To Reproduce

Run talosctl reboot against a worker node in an Omni-based cluster.

What browsers are you seeing the problem on?

No response

Anything else?

No response

utkuozdemir commented 3 months ago

This is caused by the "action tracking" feature of talosctl, which is enabled by default.

The action tracking feature shows the progress of the reboot. To do that, it attempts to read the boot ID before issuing the reboot gRPC call: https://github.com/siderolabs/talos/blob/main/cmd/talosctl/pkg/talos/action/tracker.go#L325-L341

In Omni, we do not allow reading files, even as an admin, so that read fails with permission denied before the reboot is ever issued.

Disabling action tracking by adding --wait=false to the command works as expected.

We can consider making the boot ID readable (maybe by exposing it as a resource?) or using a fallback mechanism if we cannot read it.

smira commented 3 months ago

We need two kinds of changes:

utkuozdemir commented 3 months ago

Duplicate of https://github.com/siderolabs/talos/issues/7197

We will track it in the issue above.

ArcherSeven commented 3 months ago

@utkuozdemir how is this a duplicate of that card? I am an admin, and have not attempted to use "--wait" in any form, and am hitting this issue.

smira commented 3 months ago

@ArcherSeven --wait=true is the default, so you hit this even without passing the flag. Just use --wait=false until we have a fix in place.