Open smst329 opened 1 week ago
Without the logs, it's impossible to tell. If you have i915 by chance, it might be fixed by adding the i915-ucode system extension. (This is going to be fixed in 1.9.)
I just ran into this same issue again after destroying and recreating a cluster running on 1.8.3 while having the extension enabled on the node.
Will this be resolved in the mentioned 1.9 fix?
If you have i915, maybe. If you don't, then probably not. I am not sure they fully understand all the causes of their boot loops. I don't have an i915, so their supposition is wrong again.
I'd like them to reconsider infinite boot loops as a strategy for responding to a problem. What conditions does a reboot change such that things work on the 13th reboot when they didn't on the 12th? Do 12 reboots clear a previous install? Do 12 reboots cause a USB stick to fly out of the machine? Do 12 reboots fix the DHCP server?
They don't have to agree, but I think infinite boot loops are bad design. There are other kinds of loops besides a boot loop. And they could even use a progressive backoff, like the k8s crash loop backoff, so it's not a hot loop.
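To make the backoff suggestion concrete, here is a minimal sketch of a capped exponential delay schedule. It is not Talos code. The function name and parameters are hypothetical, and the defaults (10 s base, doubling, 5 minute cap) are borrowed from the Kubernetes CrashLoopBackOff behaviour only as an illustration.

```python
def backoff_delays(attempts, base=10.0, factor=2.0, cap=300.0):
    """Return the wait (in seconds) before each retry attempt.

    Hypothetical helper: the delay starts at `base`, is multiplied by
    `factor` after every failed attempt, and never exceeds `cap`.
    """
    delays = []
    delay = base
    for _ in range(attempts):
        # Clamp to the cap so the wait stops growing once it is reached.
        delays.append(min(delay, cap))
        delay *= factor
    return delays


if __name__ == "__main__":
    # Print the schedule for eight consecutive failures.
    for attempt, wait in enumerate(backoff_delays(8), start=1):
        print(f"attempt {attempt}: wait {wait:.0f}s before retrying")
```

With a schedule like this, a failing step is retried quickly at first and then at most once every five minutes, instead of rebooting in a hot loop.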
The i915 drivers have a bad history of boot loops and crashes.
You can get into your machine again by adding i915.modeset=0 to the kernel parameters, and it just runs fine for now.
At the time of writing I cannot add extensions to 1.8.3 machines. It was the same with the 1.8.2 upgrade for a while.
I added i915.modeset=0 as extraKernelArgs and the Intel NUC nodes with i915 video are now stable.
I have to say that this should be a stern warning not to jump on the latest version until it settles down. In the short time I have been evaluating Talos with Omni, we have already seen such issues with the 1.8 releases. I love Talos, but the release process worries me a bit.
We plan to remove the i915 driver from base Talos in 1.9, so that it will use UEFI for the framebuffer (unless you want to add an extension). #9728
Bug Report
The Talos ISO just reboots in an infinite loop and never stops.
In https://github.com/siderolabs/talos/issues/9702 they kept saying I needed to wipe the disk/previous install.
Funny thing happened today: a new hard drive came in the mail, and there is still an infinite boot loop. I didn't know hard drives came pre-installed with Talos.
I'm just reporting the bug, in case it affects any potential or current customers.
Description
Logs
Environment