Open travier opened 11 months ago
I think we may have discussed this in the past and we landed on not increasing it because any reasonable limit we choose will not be enough for some users. Users are encouraged to chain with other external Ignition configs and resources. I guess in this case they're hitting a network bootstrapping issue?
In this case (agent-based installer) there's nowhere external to chain to. It's not so much a network bootstrapping issue as these manifests are only used to set up the SDN overlay once the cluster is up. It's that we have to get all of the configuration files for the cluster provided by the user into the live ISO that can then be carried into a disconnected environment.
Maybe we should add a mechanism to embed arbitrarily sized files at the end of the LiveISO so that we do not rely on a fixed sized blob.
Maybe we should add a mechanism to embed arbitrarily sized files at the end of the LiveISO so that we do not rely on a fixed sized blob.
That would be helpful in general, and it would allow users to apply their own customizations.
It's that we have to get all of the configuration files for the cluster provided by the user into the live ISO that can then be carried into a disconnected environment.
Can you clarify what these config files are? Looking at the linked Jira ticket, I only found installer manifests and didn't quite follow why those need to be part of the live ISO. In this disconnected environment, is e.g. one node selected as "the installer node" that the other nodes connect to via the agent?
Can you clarify what these config files are? Looking at the linked Jira ticket, I only found installer manifests and didn't quite follow why those need to be part of the live ISO.
The agent-based installer supports the OpenShift installer cluster customization, so that the user can specify a number of additional manifests that will be included during the initial installation (day1). Since the agent-based installer produces an ISO, those extra manifests need to be included in it. Sean may provide more details about the specific case, but in general this approach could be used for day1 customizations.
In this disconnected environment, is e.g. one node selected as "the installer node" that the other nodes connect to via the agent?
Yes. In both connected/disconnected environments, the rendezvous node is the ephemeral orchestrator node that will manage the cluster installation.
cc @seanmerrow
Hmm OK, so we're trying to fit possibly numerous cluster object definitions into the live ISO. ISTM that bumping the pre-allocated space to 1M would be more of a stopgap solution, would you agree?
Would it make sense to consume the manifests as a container image instead? Then the installer (or more likely, the code that orchestrates it) could pull it down and unpack it. Even in a disconnected install, the nodes must have access to an image registry containing the mirrored release payload images and the user's own workload images, right?
I don't think consuming the manifests from a container will work, at least from the agent-based installer's point of view.
In such a workflow, the user prepares a (single) live ISO by running the openshift-install agent create image command, and the ISO will contain all the necessary elements (in particular, a set of specialized services) to orchestrate the installation when booted (including any extra manifests specified by the user). Note that the ISO could be prepared in an environment completely different from the one where it will be applied.
The mechanism suggested by @travier, embedding arbitrarily sized files at the end of the LiveISO, looks to me like a better fit for this use case.
IMHO, what @travier suggested is the best solution among those we have discussed. Bumping up the pre-allocated size would not be a long-term solution. For example, we initially discussed increasing it from 256K to 1M because the Calico manifests were 512K, but 1M wouldn't be enough either: the Juniper CN2 CNI is already over 1.25M when we tried it with the agent-based installer.
DEBUG trying iso9660 with physical block size 0
ERROR failed to write asset (Agent Installer ISO) to disk: cannot generate ISO image due to configuration errors
FATAL failed to fetch Agent Installer ISO: failed to generate asset "Agent Installer ISO": failed to create overwrite reader for ignition: content length (1312018) exceeds embed area size (262144)
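To make the numbers in the error above concrete, here is a minimal sketch that compares the reported content length against the current 256K embed area and the 1M value that was discussed. The function name `fits_embed_area` is made up for illustration; it only compares raw byte counts and does not claim to reproduce how the installer or coreos-installer actually measures the embedded config.

```python
# Byte counts taken from the error message and the discussion above.
CURRENT_EMBED_AREA = 256 * 1024    # 262144, the "embed area size" in the error
PROPOSED_EMBED_AREA = 1024 * 1024  # 1M, the stopgap size that was discussed
CN2_CONTENT_LENGTH = 1312018       # the "content length" in the error


def fits_embed_area(content_length: int, area_size: int) -> bool:
    """Hypothetical helper: True if the content fits in an area of area_size bytes."""
    return content_length <= area_size


if __name__ == "__main__":
    for label, area in (("256K", CURRENT_EMBED_AREA), ("1M", PROPOSED_EMBED_AREA)):
        verdict = "fits" if fits_embed_area(CN2_CONTENT_LENGTH, area) else "does not fit"
        print(f"{CN2_CONTENT_LENGTH} bytes {verdict} in a {label} ({area} byte) area")
```

Under these assumptions the config fits in neither area, which is the point being made: any fixed limit can be exceeded by the next set of manifests.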
We discussed this during the community meeting today:
12:52:22 dustymabe | #agreed We consider our ISO to already be a
| fragile piece of our architecture and would
| prefer to limit changes to it. We will try
| to meet with the Assisted Installer
| (OpenShift) team to understand the use
| cases more to see if there are alternative
| solutions to this problem.
@travier has agreed to organize this meeting.
Ultimately I think the flow for nontrivial ISO things should be the same as layering: build a bootable container image, and then pass it to a tool like osbuild which makes a custom ISO from it.
Another limitation of the agent-based installer is that it is part of the OpenShift installer: a single statically-linked binary with ideally no dependencies, which runs on any flavour of Linux and also on macOS. Vendoring in something like skopeo would be painful. Depending on external Python tools like osbuild is a non-starter.
Is there an update on whether a solution has been proposed or decided? Thank you.
In such workflow, the user prepares a (single) live ISO by running the openshift-install agent create image command, and the ISO will contain all the necessary elements (in particular, a set of specialized services) to orchestrate the installation when booted (including the extra manifests eventually specified by the user). Note that the ISO could be prepared into an environment completely different from the one where it will be applied.
One low-tech solution here is to have openshift-install agent create image take a --remote-ignition 'http://...' switch which tells the code to embed in the ISO an Ignition config that fetches from the given URL. It then also spits out the Ignition config that the user must host at that URL.
The installer could detect the condition when the Ignition config is too large and give an error message that suggests using --remote-ignition.
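As a rough illustration of that proposal, the sketch below builds a small "pointer" Ignition config that merges a remote config, and falls back to it only when the full config exceeds the embed area. The `--remote-ignition` switch is only a suggestion in this thread and does not exist today; the function names, the 262144 byte limit, the example URL, and the choice of spec version 3.2.0 are all assumptions made for the example. The `ignition.config.merge` field with a `source` URL is part of the Ignition v3 config spec.

```python
import json
from typing import Optional

EMBED_AREA_SIZE = 262144  # assumed limit, taken from the error message in this thread


def make_pointer_config(url: str) -> bytes:
    """Build a minimal Ignition config that merges the config hosted at url."""
    config = {
        "ignition": {
            "version": "3.2.0",  # assumed spec version for the sketch
            "config": {"merge": [{"source": url}]},
        }
    }
    return json.dumps(config).encode("utf-8")


def choose_embedded_config(full_config: bytes, remote_url: Optional[str]) -> bytes:
    """Return the bytes to embed: the full config if it fits, else a pointer."""
    if len(full_config) <= EMBED_AREA_SIZE:
        return full_config
    if remote_url is None:
        # Mirrors the suggested error message; the flag itself is hypothetical.
        raise ValueError(
            f"Ignition config ({len(full_config)} bytes) exceeds the embed area "
            f"({EMBED_AREA_SIZE} bytes); consider hosting it and using --remote-ignition"
        )
    return make_pointer_config(remote_url)
```

In this sketch the caller would still have to write the full config out separately so that the user can host it at the given URL, as described above.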
This is analogous to coreos-installer iso extract minimal-iso which takes a --rootfs-url URL and an --output-rootfs PATH; it takes out the rootfs from the ISO and writes it to PATH and adds a coreos.live.rootfs_url karg to the minimal ISO pointing at URL. It's the user's responsibility to have the given rootfs hosted at that URL.
Describe the bug
The current size is about 256KB (https://github.com/coreos/coreos-assembler/blob/main/src/cmd-buildextend-live#L113C16-L113C16) and some use cases require more (see https://issues.redhat.com/browse/OCPBUGS-20177).
Should we increase the size of the embedded Ignition config area, or should we suggest that they use something else?
Reproduction steps
Embed a "bigger" file (1MB) in the LiveISO.
Expected behavior
It works for 1MB Ignition configs.
Actual behavior
It fails for 1MB Ignition configs.
System details
LiveISO
Butane or Ignition config
No response
Additional information
No response