Open cgwalters opened 5 years ago
Though, we'd need new terminology like "bootRHCOS"...
Red Hat Pivot OS? All this OS does is pivot to an OSTree pulled from a container image. It is not certified for any other purpose.
if we did this, we could streamline down the bootimages
That would be very helpful for OKD-on-FCOS; however, it would require a few changes to the RHCOS oscontainer build process.
@vrutkovs @cgwalters this is implemented by Vadim's commit here, right? https://github.com/openshift/installer/pull/2548/commits/c7da4745829deae4cfeb87cfb885a3e6259f36a3
Should we try to get this into master soon? (i.e. before tackling spec3 for OCP)
Pulling the machine-os-content on the bootstrap host and pivoting adds an 800MB download to the critical part of bootstrapping, and also causes a reboot of the bootstrap host.
The bootstrap host doesn't need as close a tie to the OpenShift binaries as a control-plane host does. We only use the kubelet to run static pods and podman to run some pods.
So I don't get why there is a requirement for a pivot on the bootstrap host?
Without a pivot, the bootstrap node would use kubelet/crio/machine-config-daemon from the original AMI. That means fixes to these components would not be applied during the bootstrap phase, which might be critical for some deployments.
bootstrap node would use kubelet/crio/machine-config-daemon...
Is there an MCD baked into RHCOS? I'd be surprised if we ran one on the bootstrap machine. We certainly extract machine-config components from the target release image and run them, but the only bootimage exposure in that is podman/kubelet/crio (and I have no cost/benefit opinion of pivoting for those ;).
So I don't get why there is a requirement for a pivot on the bootstrap host?
First, this helps OKD which will use FCOS, which won't include a kubelet by default (at least, not right now).
Second, it does help avoid "bootimage drift" issues with the installer as noted also in the initial comment.
Further, the initial comment links to https://github.com/openshift/enhancements/pull/78#discussion_r337137313 - so perhaps to disintermediate we can summon @smarterclayton
Is there an MCD baked into RHCOS?
Yes.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten /remove-lifecycle stale
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
/reopen
@LorbusChris: Reopened this issue.
/remove-lifecycle rotten
I'd love to see this happen. It would get us closer to boot images being "basic boot and pivot to expected content" in both the installer (bootstrap node) and the hosts themselves (which is the case today).
With some nontrivial but also not extremely difficult work, we could change the OS update stack to support an "update and restart all of userspace, but not the kernel" semantic which would shave some time off this.
Pulling the machine-os-content on the bootstrap host and pivoting adds an 800MB download to the critical part of bootstrapping
One thing also: it can't be that hard to teach the bootstrap host how to serve the images it pulled to the control plane, so if we did that it would reduce the 3 separate pulls of m-o-c from the upstream registry to one. (And similar for other images.)
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
See this thread, specifically this comment.
TL;DR - today the installer launches a bootimage, which is usually the pinned RHCOS version (for IPI installs), but can be different - see this issue which is about making it easier for people to find the correct bootimage.
See also https://github.com/openshift/installer/pull/2532
Filing this issue to track changing the installer to do a pivot on the bootstrap node.
I think the architecture would look like a new bootstrap-pivot.service between release-image.service and bootkube.service - we'd run the MCD on the bootstrap host, telling it to pivot to the target machine-os-content.
In fact...if we did this, we could streamline down the bootimages - e.g. no reason to ship kubelet/cri-o in the bootimages. (Though, we'd need new terminology like "bootRHCOS" to distinguish from "normal" RHCOS in machine-os-content or so?) Anyways, not required for this change.
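To make the proposed ordering concrete, here is a rough sketch that renders what such a unit might look like. Only the three unit names come from the proposal above; the description text, the `ExecStart` script path, and the function name are hypothetical placeholders, and nothing here reflects how the installer actually generates its units.

```python
# Hypothetical sketch: render a systemd unit for the proposed
# bootstrap-pivot.service. Only the unit names (release-image.service,
# bootkube.service) come from the proposal; everything else is assumed.


def render_bootstrap_pivot_unit(
    exec_start: str = "/usr/local/bin/bootstrap-pivot.sh",  # assumed path
) -> str:
    """Return unit-file text that orders the pivot between the two services."""
    lines = [
        "[Unit]",
        "Description=Pivot bootstrap node to the target machine-os-content",
        "# Start only once the release image is available.",
        "Wants=release-image.service",
        "After=release-image.service",
        "# Hold bootkube back until the pivot has finished.",
        "Before=bootkube.service",
        "",
        "[Service]",
        "Type=oneshot",
        "RemainAfterExit=yes",
        f"ExecStart={exec_start}",
        "",
        "[Install]",
        "RequiredBy=bootkube.service",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_bootstrap_pivot_unit(), end="")
```

Since the pivot reboots the host, whatever `ExecStart` runs would presumably need to detect that the node is already on the target content and exit cleanly on the second boot, so that bootkube.service can proceed.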