uhthomas closed 1 year ago
I think I know what's causing this.
I have a patch, which looks like this:
configPatches: [{
    op: "add"
    path: "/machine"
    value: {
        install: diskSelector: wwid: "naa.50026b7381886726"
        network: interfaces: [{
            addresses: ["10.0.0.102"]
            bond: interfaces: ["eth4", "eth5"]
        }]
    }
}]
This almost certainly overrides basically all the machine defaults. As such, accessing config.MachineConfig.MachineKubelet.KubeletExtraArgs will inevitably panic.
The patch is being used incorrectly. If you add at the /machine path, it removes everything under /machine and keeps only what you passed. Please use a path such as /machine/install/diskSelector instead. Also, regarding the network interfaces: no parent is being provided, and the parent interface name is required.
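The replacement behaviour described above can be reproduced with a small sketch. It is a minimal, hypothetical re-implementation of the RFC 6902 `add` operation for object members only, not the patch library Sidero uses. The `base` document is an illustrative stand-in for the generated defaults rather than real `talosctl gen config` output, and the network part of the patch is left out for brevity.

```python
import copy


def apply_add(doc, path, value):
    """Minimal RFC 6902 'add' for nested dicts only; returns a patched copy."""
    if not path.startswith("/"):
        raise ValueError(f"unsupported JSON pointer: {path!r}")
    # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]
    doc = copy.deepcopy(doc)
    parent = doc
    for token in tokens[:-1]:
        if not isinstance(parent, dict) or token not in parent:
            raise KeyError(f"doc is missing path: {path}")
        parent = parent[token]
    if not isinstance(parent, dict):
        raise KeyError(f"doc is missing path: {path}")
    # An existing member is replaced wholesale; nothing is merged.
    parent[tokens[-1]] = value
    return doc


# Hypothetical stand-in for the default machine config (assumption).
base = {
    "machine": {
        "kubelet": {"extraArgs": {}},
        "install": {"disk": "/dev/sda"},
        "network": {},
    }
}

selector = {"wwid": "naa.50026b7381886726"}

# Adding at /machine swaps out the entire machine object.
broad = apply_add(base, "/machine", {"install": {"diskSelector": selector}})

# Adding at the narrower path keeps the sibling keys.
narrow = apply_add(base, "/machine/install/diskSelector", selector)

print(sorted(broad["machine"]))
print(sorted(narrow["machine"]))
```

In this sketch the broad patch leaves `machine` with only the `install` key, so anything that later reads the kubelet section finds nothing there, while the narrow patch preserves `kubelet`, `network`, and the existing `install` fields.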
Yeah, I understand.
Is it okay to leave this open to track a feature which either makes it easier to know what's happening or reject a configuration which completely removes the base configuration? I imagine it's likely never intentional to do this.
reject a configuration which completely removes the base configuration
That is tricky, since a user may legitimately want to replace the whole machine key with their own.
I don't doubt that. I'd like to suggest that validation be extended further, as seemingly simple and valid patches don't work either. To be clear, I understand why, but it's just not very intuitive. These patch errors surface when the machine tries to provision with the config, which isn't ideal.
{
    op: "replace"
    path: "/machine/install/diskSelector/wwid"
    value: "abc"
}
sidero-controller-manager-7b559896bd-tgvkh manager 2023/03/13 20:46:08 failure applying rfc6902 patches to machine config: replace operation does not apply: doc is missing path: /machine/install/diskSelector/wwid: missing value
{
    op: "add"
    path: "/machine/install/diskSelector/wwid"
    value: "abc"
}
sidero-controller-manager-7b559896bd-tgvkh manager 2023/03/13 22:02:03 failure applying rfc6902 patches to machine config: add operation does not apply: doc is missing path: "/machine/install/diskSelector/wwid": missing value
Or even:
{
    "op": "add",
    "path": "/machine/network/interfaces/-",
    "value": {
        "interface": "bond0",
        "addresses": [
            "10.0.0.105"
        ]
    }
}
sidero-controller-manager-7b559896bd-tgvkh manager 2023/03/13 23:12:29 failure applying rfc6902 patches to machine config: add operation does not apply: doc is missing path: "/machine/network/interfaces/-": missing value
Am I missing something? The docs don't seem to offer much help.
https://www.talos.dev/v1.3/talos-guides/configuration/patching/
Yes, this path doesn't exist in the default config. talosctl gen config gives you an easier way to test it. The default config state is machine.network: {}
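The missing-parent failures above can be illustrated with a short sketch. It is a minimal, hypothetical re-implementation of the RFC 6902 `add` operation, not the library Sidero uses. The `default` document is an assumed stand-in that only mirrors the `machine.network: {}` state mentioned above, the absence of `diskSelector` is inferred from the error logs, and the `eth0` entry is made up for illustration.

```python
import copy


def apply_add(doc, path, value):
    """Minimal RFC 6902 'add' for nested dicts and lists; returns a patched copy."""
    if not path.startswith("/"):
        raise ValueError(f"unsupported JSON pointer: {path!r}")
    # RFC 6901 unescaping: "~1" becomes "/", then "~0" becomes "~".
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]
    doc = copy.deepcopy(doc)
    parent = doc
    # Every token except the last must already exist in the document.
    for token in tokens[:-1]:
        if isinstance(parent, dict) and token in parent:
            parent = parent[token]
        elif isinstance(parent, list) and token.isdigit() and int(token) < len(parent):
            parent = parent[int(token)]
        else:
            raise KeyError(f"doc is missing path: {path}")
    last = tokens[-1]
    if isinstance(parent, dict):
        parent[last] = value
    elif isinstance(parent, list) and last == "-":
        parent.append(value)  # "-" means "append to the end of the array"
    elif isinstance(parent, list) and last.isdigit() and int(last) <= len(parent):
        parent.insert(int(last), value)
    else:
        raise KeyError(f"doc is missing path: {path}")
    return doc


def try_add(doc, path, value):
    """Return (patched_doc, None) on success or (None, error_message) on failure."""
    try:
        return apply_add(doc, path, value), None
    except KeyError as err:
        return None, err.args[0]


# Assumed stand-in for the default config: network is empty, no diskSelector.
default = {"machine": {"install": {"disk": "/dev/sda"}, "network": {}}}
bond = {"interface": "bond0", "addresses": ["10.0.0.105"]}

# Both fail: the parent of the final path segment does not exist yet.
_, append_err = try_add(default, "/machine/network/interfaces/-", bond)
_, wwid_err = try_add(default, "/machine/install/diskSelector/wwid", "abc")

# Both succeed: the patch creates the missing parent itself.
with_list, _ = try_add(default, "/machine/network/interfaces", [bond])
with_selector, _ = try_add(default, "/machine/install/diskSelector", {"wwid": "abc"})

# Once the array exists, appending with "-" works.
with_two, _ = try_add(with_list, "/machine/network/interfaces/-", {"interface": "eth0"})

print(append_err)
print(wwid_err)
```

Under these assumptions, the patch has to target the deepest path that already exists in the base config and carry the rest of the structure in its `value`.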
I've been struggling to provision some servers as they would get stuck when trying to fetch their metadata.
It looks like this is because Sidero panics when attempting to serve said metadata. This happens for all servers. None of them can load the metadata correctly.
I even tried starting from scratch and see the same error.
https://github.com/siderolabs/sidero/blob/18116bcabe77f1e7aa8d33b4d10cf2ff4a7cd18b/app/sidero-controller-manager/internal/metadata/metadata_server.go#LL297