openshift / machine-config-operator

machine-specific machineconfigs #1720

Open cgwalters opened 4 years ago

cgwalters commented 4 years ago

Ironically today MachineConfig objects target a pool - not a specific machine.

Managing machine-specific configuration today

Follow up to this comment. Today, if you want to manage per-machine configuration such as a statically set hostname, the best way to do this is to take the worker.ign file generated by openshift-installer and create copies, e.g. worker-foo.ign and worker-bar.ign, then modify them to include configuration specific to each machine, and provide that Ignition to the node.
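As a rough illustration of deriving per-machine copies from the pointer config, here is a minimal Python sketch. It assumes an Ignition spec 3.x pointer config; the `pointer` dict is a stand-in for the contents of worker.ign (the real file also carries the cluster CA), the `api-int.example.com` URL is a placeholder, and `derive_machine_config` is a hypothetical helper name.

```python
import copy
import json
from urllib.parse import quote


def derive_machine_config(pointer: dict, hostname: str) -> dict:
    """Return a copy of the pointer config that also writes /etc/hostname."""
    config = copy.deepcopy(pointer)  # leave the original pointer config untouched
    files = config.setdefault("storage", {}).setdefault("files", [])
    files.append({
        "path": "/etc/hostname",
        "mode": 0o644,
        "overwrite": True,
        # Ignition accepts inline file contents as an RFC 2397 data URL.
        "contents": {"source": "data:," + quote(hostname + "\n")},
    })
    return config


# Stand-in for worker.ign (assumption: spec 3.2.0, placeholder MCS URL).
pointer = {
    "ignition": {
        "version": "3.2.0",
        "config": {
            "merge": [{"source": "https://api-int.example.com:22623/config/worker"}]
        },
    }
}

for name in ("worker-foo", "worker-bar"):
    # In practice each result would be saved as e.g. worker-foo.ign.
    print(json.dumps(derive_machine_config(pointer, name), indent=2))
```

The same pattern would apply to any other per-machine file; only the appended `storage.files` entry changes.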

For the "provide that Ignition to the node" phase, if you're using MachineSets via MachineAPI, then one would need to edit the machineset object to point to a new user data secret. Note that this will eventually conflict with having the MCO manage userdata, though that enhancement was reverted. When we get there, perhaps we can teach the MCO to retain any additional config it finds in the machineset?
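A minimal sketch of what "point to a new user data secret" could look like, in Python. The assumptions here are mine, not from this thread: that the secret lives in the `openshift-machine-api` namespace with `userData` and `disableTemplating` keys (mirroring the stock worker-user-data secret), and that the provider spec exposes `userDataSecret.name`. Both function names are hypothetical, and the exact field layout may differ per platform.

```python
import base64
import json


def user_data_secret(name: str, ignition: dict) -> dict:
    """Build a Secret manifest carrying an Ignition config as user data."""
    encoded = base64.b64encode(json.dumps(ignition).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": "openshift-machine-api"},
        "type": "Opaque",
        "data": {
            "userData": encoded,
            # Assumption: mirrors the key found in the stock user data secret.
            "disableTemplating": base64.b64encode(b"true").decode(),
        },
    }


def machineset_patch(secret_name: str) -> dict:
    """JSON merge patch that points a MachineSet at the new secret."""
    return {
        "spec": {"template": {"spec": {"providerSpec": {"value": {
            "userDataSecret": {"name": secret_name},
        }}}}}
    }


print(json.dumps(machineset_patch("worker-foo-user-data")))
```

The printed patch could then be applied with something like `oc patch machineset <name> -n openshift-machine-api --type merge -p '<patch>'`. Only machines created after the change would pick up the new user data.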

The OpenShift 4.8 documentation will describe how to use https://github.com/coreos/butane, a high-level tool for managing Ignition, but it isn't yet ergonomic to "backconvert" that pointer Ignition to butane and then output a new Ignition config.

Using the Live ISO

Additionally, one can use the Live ISO which can be programmed to have its own Ignition configuration that e.g. pulls a container which performs hardware inspection, and then dynamically generates a configuration which is passed to coreos-installer.

Background information

If we had a way to provide "machine specific machineconfigs" then the admins could provide those MCs as additional manifests and it'd all Just Work.

But...this gets into the "node identity" problem. We'd need to define a way for the MCS to identify the node it's serving the config to.

Perhaps the simplest thing is to assume anyone who wants this is statically assigning hostnames, and we change Ignition to include a header with the node's hostname.

(I guess we could do something like a reverse lookup of the requester's IP address too)

One messy aspect of this is that we can't include these configs in the main rendered-$pool configs, which means the MCS needs to generate the per-node config internally. (I guess we could create a separate rendered-node-$x config too?)
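To make the lookup concrete, here is a purely hypothetical Python sketch of how a config server might choose between a rendered-$pool config and a rendered-node-$x config. Nothing here reflects actual MCS code: the `X-Node-Hostname` header does not exist in Ignition today, and `select_config` and the config names are invented for illustration.

```python
from typing import Mapping

# Hypothetical header name; Ignition does not send this today.
HOSTNAME_HEADER = "X-Node-Hostname"


def select_config(
    headers: Mapping[str, str],
    pool: str,
    rendered_pool: Mapping[str, str],
    rendered_node: Mapping[str, str],
) -> str:
    """Return the name of the rendered config to serve for this request.

    Prefer a node-specific rendered config when the request identifies
    a hostname we know about; otherwise fall back to the pool config.
    """
    hostname = headers.get(HOSTNAME_HEADER)
    if hostname and hostname in rendered_node:
        return rendered_node[hostname]
    return rendered_pool[pool]


rendered_pool = {"worker": "rendered-worker-abc123"}
rendered_node = {"worker-foo": "rendered-node-worker-foo"}

print(select_config({HOSTNAME_HEADER: "worker-foo"}, "worker", rendered_pool, rendered_node))
print(select_config({}, "worker", rendered_pool, rendered_node))
```

Falling back to the pool config keeps machines without a node-specific config working exactly as they do today.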

michaelgugino commented 4 years ago

Perhaps the simplest thing is to assume anyone who wants this is statically assigning hostnames, and we change Ignition to include a header with the node's hostname.

Definitely we should do this with or without machine-specific machineconfigs.

(I guess we could do something like a reverse lookup of the requester's IP address too)

This is not guaranteed to work in all environments, and I think we discovered while troubleshooting the 4.4 release that the IPs that show up in the MCS during first boot are VIPs, not instance IPs.

Including the hostname and IP on every request to the MCS would greatly aid in distinguishing a machine that failed to boot or never requested Ignition from one that got its Ignition file but then failed to join the cluster.
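A small sketch of the triage this would enable, assuming (hypothetically) that the MCS logged the requesting hostname for each Ignition request. The function name and the three status labels are made up for illustration.

```python
from typing import Dict, Iterable, Set


def classify_machines(
    expected: Iterable[str],
    requested_ignition: Set[str],
    joined: Set[str],
) -> Dict[str, str]:
    """Classify each expected machine by how far provisioning got."""
    status = {}
    for host in expected:
        if host in joined:
            status[host] = "joined"
        elif host in requested_ignition:
            # Ignition was served, so the failure is after first boot.
            status[host] = "got Ignition, did not join"
        else:
            # No request seen: the machine likely never booted or never reached the MCS.
            status[host] = "never requested Ignition"
    return status


print(classify_machines(
    ["worker-a", "worker-b", "worker-c"],
    requested_ignition={"worker-a", "worker-b"},
    joined={"worker-a"},
))
```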

cgwalters commented 4 years ago

One way to do this today: for each machine that needs a custom config, manually set things up to pass it a custom pointer config derived from the stock one.

For example in a PXE install scenario, take the Ignition output from openshift-install create ignition-configs and edit that, rather than trying to customize the full Ignition config returned from the MCS.

openshift-bot commented 3 years ago

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

dhellmann commented 3 years ago

/remove-lifecycle stale

dhellmann commented 3 years ago

/lifecycle frozen

cgwalters commented 3 years ago

Perhaps one simple approach here is:

However, we could simplify things significantly if we said the MCD wouldn't be aware of this - we wouldn't support "day 2" reconfiguration for machine-specific configs. IOW if e.g. you want to change the static IP address for a node, you fully reprovision it.

Given that I think most of this per-node configuration wouldn't change often, that could be a useful balance.

larsks commented 2 years ago

@cgwalters what's the state of this issue? I need to configure bonding on the primary interface on a small openshift (4.10) cluster, and the interface names on the nodes aren't identical...so I need a couple of different machineconfig resources, applying to different subsets of nodes.

I was hoping I could create node-specific pools, but if I try something like...

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: ctl-0
spec:
  machineConfigSelector:
    matchLabels:
      kubernetes.io/hostname: ctl-0
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: ctl-0
  paused: false

...then MCO refuses to apply because:

W0623 23:44:20.353529       1 node_controller.go:827] can't get pool for node "ctl-0": node ctl-0 has both master role and custom role ctl-0

I guess the workaround is to bundle the desired configurations into a shell script that does something like...

#!/bin/bash

if [[ $HOSTNAME = ctl-0 ]]; then
  cp /etc/files/config-for-host0 /etc/actual/path/config
elif [[ $HOSTNAME = ctl-1 ]]; then
  cp /etc/files/config-for-host1 /etc/actual/path/config
fi

Etc.

Is there a better way?

cgwalters commented 2 years ago

@larsks See the top comment https://github.com/openshift/machine-config-operator/issues/1720#issue-614885832 (which I reworked to clarify with the best solution today)

(Networking specifically also touches on nmstate, which is another thing)

larsks commented 2 years ago

(Networking specifically also touches on nmstate, which is another thing)

Right, and for this I would love to be able to use nmstate (in particular because it's easy to implement host-specific configs), but that explicitly can't be used to configure the primary host interface. The recommendation is to use machineconfigs :).

larsks commented 2 years ago

Reading through the top comment...

For the "provide that Ignition to the node" phase, if you're using MachineSets via MachineAPI, then one would need to edit the machineset object to point to a new user data secret...

For a cluster installed using e.g. the assisted installer, there is no MachineSet that covers the controllers. For a three-node cluster:

$ oc get -A machineset
NAMESPACE               NAME                            DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   nerc-ocp-infra-5xv2p-worker-0   0         0                             28d

I also see the implication there that you're describing an install-time procedure, rather than something that could be a post-install configuration task.

My reading of that comment is that "today, it is generally not possible to apply machine-specific machineconfig resources". It sounds like I'm going to need to go with my hacky shell script (although even that plan was complicated by the fact that I can't use the directories section of an ignition config to create directories, so I need to find pre-existing directories in /etc into which I can copy files in a non-crazy fashion...)

cgwalters commented 2 years ago

Today, nothing stops you from writing persistent files into /etc outside of MachineConfigs. So, one approach today is to make the change directly live, via ssh or a privileged container. We're unlikely to make a change that would break that anytime soon without an opt-in. But then, in order to ensure your system is reprovisionable, it'd be good to aim to also make the change in the Ignition config provided to each node.

For a cluster installed using e.g. the assisted installer, there is no MachineSet that covers the controllers.

Yeah, though I think the assisted installer can and should expose a way to customize the Ignition provided to each node...I thought it does something like this internally.

jlebon commented 2 years ago

This came up yet again internally. Would another hackaround for this be to add a URL to an Ignition config in ignition.config.merge[] (in a MachineConfig of course) which can provide a different config based on IP or MAC? (Obviously then requires setting up that secondary service.)
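A minimal sketch of such a secondary service in Python, keyed on client IP only. Everything here is an assumption for illustration: the IP table, the port, and the function names are invented, and as noted earlier in the thread the address a server sees may be a VIP rather than the instance IP, so this would only work where the node's real address is visible to the service.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Served to unknown clients: a valid config that changes nothing.
EMPTY_CONFIG = {"ignition": {"version": "3.2.0"}}

# Hypothetical per-node configs keyed by client IP.
CONFIGS_BY_IP = {
    "192.0.2.10": {
        "ignition": {"version": "3.2.0"},
        "storage": {"files": [{
            "path": "/etc/hostname",
            "mode": 0o644,
            "overwrite": True,
            "contents": {"source": "data:,ctl-0%0A"},
        }]},
    },
}


def config_for_client(ip: str) -> dict:
    """Return the node-specific config for this IP, or an empty config."""
    return CONFIGS_BY_IP.get(ip, EMPTY_CONFIG)


class IgnitionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(config_for_client(self.client_address[0])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the service until interrupted."""
    HTTPServer(("", port), IgnitionHandler).serve_forever()
```

Calling `serve()` starts the service; the URL placed in `ignition.config.merge[]` would then point at it. Returning an empty config for unknown clients means nodes without a per-node entry still provision normally.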