Closed Ognian closed 9 months ago
We don't seem to pass any default value for --node-name to k3s. As far as I can tell, the default k3s behavior is to use the hostname to construct a node name with some suffix. I'm not sure whether the suffix is skipped altogether if we set --node-name; we need to try that out.
I'm wondering if nodeID is important in this case and if we should keep it. Let's see what the --node-name flag does first and then decide.
What we do by default is: https://github.com/kairos-io/provider-kairos/blob/0d636d2b2cd424eee5961b6fc14c4842ef74073d/internal/role/p2p/worker.go#L71 which causes this. It is the default in order to avoid collisions if the same hostname is given to multiple nodes.
To override the default logic you can set replace_args: true.
@Ognian the default behavior targets less experienced users who shouldn't have to configure too many things. For the experienced ones, there is replace_args: true, which requires them to configure k3s manually.
We will document the use of that flag. Are you ok with this plan?
@jimmykarily yes, sounds reasonable
@jimmykarily after a power cut I was not able to get the worker node up and running in the cluster. There is a chance that this was due to a different node name being generated after it came up again; maybe something in the state got corrupted … #1227 was also the result of this power cut… Could you tell me how to configure a fixed node name?
@Ognian with this config:
#cloud-config
users:
- name: kairos
  passwd: kairos
  ssh_authorized_keys:
  - github:jimmykarily
k3s:
  enabled: true
  replace_args: true
  args:
  - --node-name=my-node
I got this node:
localhost:/home/kairos # kubectl get nodes
NAME      STATUS   ROLES                  AGE   VERSION
my-node   Ready    control-plane,master   53s   v1.21.14+k3s1
I used this image: https://github.com/kairos-io/provider-kairos/releases/download/v1.6.1/kairos-opensuse-leap-v1.6.1-k3sv1.21.14+k3s1.iso
Is this what you need? I'll see if I can put it somewhere in the docs as a hint.
@jimmykarily in my case the Raspberry Pi is the worker, not the master. So I have to use it with k3s_agent, not k3s. Does this work the same way?
replace_args: true suggests that I have to provide all args, not only the ones I would like to override. Which other ones do I have to provide? An agent needs the token and the server values; where do they come from?
--node-name is a flag passed to the agent: https://docs.k3s.io/cli/agent#node
From the same page:
Note that servers also run an agent, so all flags listed on this page are also valid for use on servers.
I think it will work similarly for all your nodes.
For the rest of the flags, what we do in code seems too complex to achieve manually. I've put the documentation for the "hardcoded" node name in the single-node setup for this reason. I don't think this manual solution works well in multi-node setups where the nodes have to discover each other etc. (thus having to figure out multiple k3s args).
If we want to support setting node names manually on multi-node clusters, I think we need to expose a special configuration option for that. cc @mudler
maybe we could add some templating sugar, as we have already for hostnames and other fields 🤔
Closed by the merged PR automatically. Re-opening until we decide how to move forward.
OK I finally understood what was really going on:
The problem is not that the node name is not set; the problem is that the k3s parameter --with-node-id is hard-coded in the kairos-agent code: https://github.com/kairos-io/provider-kairos/blob/95bc4b4c37253ecd4a50246064f4e06a027556c1/internal/role/p2p/worker.go#L72
--with-node-id changes the node name to the name AND a generated random hash.
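To illustrate the effect described above, here is a minimal Go sketch. It is not the k3s implementation: nodeName and randomID are hypothetical helpers, and the 8-hex-character suffix is only inferred from the rpi4node-ec2407ca name reported in this issue.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomID returns 8 random hex characters.
// Assumption: the suffix length is inferred from "rpi4node-ec2407ca".
func randomID() (string, error) {
	b := make([]byte, 4)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// nodeName mimics the observed behavior: with withNodeID set, the node
// name becomes "<hostname>-<id>"; otherwise it is just the hostname.
func nodeName(hostname, id string, withNodeID bool) string {
	if !withNodeID {
		return hostname
	}
	return hostname + "-" + id
}

func main() {
	id, err := randomID()
	if err != nil {
		fmt.Println("could not generate id:", err)
		return
	}
	fmt.Println(nodeName("rpi4node", id, true))  // hostname plus random suffix
	fmt.Println(nodeName("rpi4node", id, false)) // plain hostname
}
```

If the id is regenerated rather than reused, the same host ends up with a different node name, which is the scenario suspected after the power cut.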
We want the other needed k3s parameters to be set.
Actually the only thing which should be documented is the fact that it works exactly like I described and that it is intentional.
I don't understand the documentation: there is no need to set --with-node-id since it is ALWAYS set regardless of the replace_args parameter. And as stated in the k3s docs, it just appends a random id to the node name. Or are you planning to remove --with-node-id when replace_args is set to true?
When you specify k3s args, you are essentially replacing the default ones. In other words, if you want to specify one, you are switching to "manual" mode, which means you need to pass all the needed args, including --with-node-id, which was passed automatically otherwise.
I think that's what this line means as well: https://github.com/kairos-io/kairos-docs/pull/128/files#diff-4ae3693f5ecdf0f5f08084df28f4dde27bff0108c7ad93224c1e5662b87c2d55R66
@mauromorales am I correct?
Just to clarify: this line always appends --with-node-id regardless of any conditions, and this is what I found out by testing various combinations of the replace_env and replace_args flags...
As far as I can tell, this case would override them:
Ahh, OK my fault.
But I'm using P2P, so if I set replace_args: true then I would have to manually provide the P2P values.
From the code I tried to understand which parameters I have to set up manually (is the following correct?):
Setting the --flannel-iface=edgevpn0 parameter is clear, since the default is to use the VPN and therefore to set this.
To provide --node-ip %s I need to know the IP. If using KubeVip, this is the IP of the KubeVip interface, otherwise the IP of the first non-local interface. But how do I set a non-fixed IP (i.e. DHCP) in the config file?
@mudler @jimmykarily do you know which parameters would need to be set?
No, I'd have to follow the code and collect them all :( . Maybe Ettore's suggestion with the templating (see previous comments in this issue) is the best way to solve this. This needs to be coded, of course. As a workaround for now, a fast way to collect them all would be to let the agent spin up the k3s agent with the current config and then check the running process to see which flags were passed. Then change the config to pass the same args, with replace_args set to true, the node name set to a hardcoded value, and --with-node-id set to false.
Since we don't have templating for the node name, this means each node will need to be deployed with a different kairos config.
Kairos version:
NAME="kairos-opensuse-arm-rpi" VERSION="v1.3.2-k3sv1.25.4+k3s1" ID="kairos" ID_LIKE="kairos-opensuse-arm-rpi" VERSION_ID="v1.3.2-k3sv1.25.4+k3s1" PRETTY_NAME="kairos-opensuse-arm-rpi v1.3.2-k3sv1.25.4+k3s1" ANSI_COLOR="0;32" BUG_REPORT_URL="https://github.com/kairos-io/kairos/issues/new/choose" HOME_URL="https://github.com/kairos-io/provider-kairos" IMAGE_REPO="quay.io/kairos/kairos-opensuse-arm-rpi" IMAGE_LABEL="latest" GITHUB_REPO="kairos-io/provider-kairos" VARIANT="core" FLAVOR="opensuse"
CPU architecture, OS, and Version:
Linux rpi4node 5.14.21-150400.24.33-default #1 SMP PREEMPT_DYNAMIC Fri Nov 4 13:55:06 UTC 2022 (76cfe60) aarch64 aarch64 aarch64 GNU/Linux
Describe the bug
The k3s node name should be the same as the host name.
host name is: rpi4node
node name is: rpi4node-ec2407ca