WIP: trying to consolidate okd and ocp builds into a single dockerfile

openshift-ci[bot] commented 6 months ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Prashanth684 Once this PR has been reviewed and has the lgtm label, please assign marsik for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files: - **[OWNERS](https://github.com/openshift/cluster-node-tuning-operator/blob/master/OWNERS)** Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment

Prashanth684 commented 6 months ago

/test okd-scos-images

jmencak commented 6 months ago

Thank you for your PR. I understand this is WiP, but when "consolidating" this, one thing to keep in mind should be that access to external repos (e.g. git clone by make update-tuned-submodule) will not work for NTO builds in locked down environments (ART pipeline).

openshift-ci[bot] commented 6 months ago

@Prashanth684: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).

Prashanth684 commented 6 months ago

Thank you for your PR. I understand this is WiP, but when "consolidating" this, one thing to keep in mind should be that access to external repos (e.g. git clone by make update-tuned-submodule) will not work for NTO builds in locked down environments (ART pipeline).

Thanks @jmencak for looking over this. The context for this change is https://issues.redhat.com/browse/OKD-210, where we are looking to build all OKD content on top of centos stream. As part of that change, ART had raised this PR which adds OKD configs based on centos stream. The source of truth for these configs is the ocp-build-data repo. In that repo, the dockerfile for NTO is specified as Dockerfile.rhel9, which is why we thought it would be better to consolidate, so that the autogeneration works for future releases. That said - I might need to understand a bit more about what is different for OKD. IIUC, is it:

patch the plugin bootloader file
different tuned settings for rhcos vs centos? Based on this, do you see any issues if we consolidate? Also is the submodule update currently being used, i.e is it built with a different submodule today for origin vs ocp?

jmencak commented 6 months ago

Thanks @jmencak for looking over this. The context for this change is https://issues.redhat.com/browse/OKD-210, where we are looking to build all OKD content on top of centos stream. As part of that change, ART had raised this PR which adds OKD configs based on centos stream. The source of truth for these configs is the ocp-build-data repo. In that repo, the dockerfile for NTO is specified as Dockerfile.rhel9, which is why we thought it would be better to consolidate, so that the autogeneration works for future releases.

Thank you for additional context @Prashanth684 . The Dockerfiles in this repository do need some cleanup. At a minimum, in my view:

Dockerfile.rhel8 should be safeily removed now
We should stop using Dockerfile.rhel<number> and just go for Dockerfile.rhel. RHEL versioning should be done using by git inside the content of the file. I understand that there are repos that reference Dockerfile.rhel9. For that, we could just symlink Dockerfile.rhel9 -> Dockerfile.rhel and after the PR merges go through the repos and s/Dockerfile.rhel9/Dockerfile.rhel/. This is the pattern many core operators use. We could also use something else, such as s/Dockerfile.rhe9/Dockerfile.ocp/.

That said - I might need to understand a bit more about what is different for OKD. IIUC, is it:

patch the plugin bootloader file

The patch should go. There shouldn't be any patches to TuneD in this repo. This is a "heritage" before we started using TuneD Fast Datapath (FDP) RPM package delivery. In my view, the Dockerfile gives a chance to developers to build the project completely together with custom TuneD changes and build all the TuneD rpm packages. RHEL/CentOS stream does not ship all of them! TuneD FDP releases, on the other hand, do. As you probably cannot use TuneD FDP packages in CentOS stream, you have to build them. The botom line, feel free to remove that patch in this PR. If not, we will remove it in the future.

different tuned settings for rhcos vs centos?

We don't do any of those at the moment to my knowledge. If different tuning is needed, this could be handled at TuneD profile level adjusting their content.

Based on this, do you see any issues if we consolidate?

The proposed change complicates the Dockerfile.rhel9 a lot and makes it hard to read. Therefore, I'm not in favour of this change at the moment. I'm not sure I fully understand the motivation, i.e. why OKD cannot just use the uptream Dockerfile.

Also is the submodule update currently being used, i.e is it built with a different submodule today for origin vs ocp?

Do you mean the [submodule "assets/tuned/tuned"]? If so, OCP does not use the submodule at all. I'm not familiar with origin. Origin uses it if it builds NTO via using Dockerfile.

jupierce commented 6 months ago

@jmencak -

The proposed change complicates the Dockerfile.rhel9 a lot and makes it hard to read. Therefore, I'm not in favour of this change at the moment. I'm not sure I fully understand the motivation, i.e. why OKD cannot just use the uptream Dockerfile.

In order to support the entire organization building OKD artifacts, the OKD team must rely on automation to accomplish most of the work. The automation being designed is based on OCP ART metadata (e.g. https://github.com/openshift-eng/ocp-build-data/blob/openshift-4.16/images/cluster-node-tuning-operator.yml#L23-L26 ). We can use this metadata to generate, and keep in sync, the hundreds of configurations necessary to produce OKD with the lowest possible impact to OCP teams.

The cluster-node-tuning operator appears to be one of the few where this process will not work. Dockerfile contains 3 FROM stages, while Dockerfile.rhel9 (and ART metadata) defines two. We lack the necessary metadata to programmatically create valid OKD build configurations.

The simplest way to address this and support the OKD team is to unify the Dockerfiles. Once that is done, the same automation that supports the rest of the organization's OKD artifacts will keep your Dockerfile.rhel9 (or future unified variant) building and promoting compliant OKD images.

The next simplest approach is to ensure that the number of layers (and intention of those layers) is consistent between Dockerfile and Dockerfile.rhel9. This will permit the organizational automation to work, but exposes us to drift in the future (e.g. will the team remember to update one when they update the other).

The final option is to exclude NTO from the automation. This transfers the full responsibility of building OKD images and keeping them compliant to the NTO team (e.g. keeping up with centos based golang builders when they are available).

As I'm sure you can appreciate, anything other than full automation creates a tax for someone and risk for OKD. I hope that helps explain the motivation.

jmencak commented 6 months ago

Thank you for additional context Justin.

The next simplest approach is to ensure that the number of layers (and intention of those layers) is consistent between Dockerfile and Dockerfile.rhel9

This is where again, I'm a bit lost. Is this about the difference in the number stages (FROM) or layers (RUN) or both?

/cc @yanirq /cc @MarSik /cc @ffromani to share their views. My current concerns with this change:

Big Dockerfile which is hard to read.
Increased number of layers for both OCP and OKD images and therefore bigger images (how much needs testing/verification).

jupierce commented 6 months ago

FROM statements:

fwiw, the number of stages in a Dockerfile doesn't impact the final size of the image -- only the content of the final stage matters.

If it is the content in the Dockerfile, extracting this to a simple shell script (e.g. ADD base_os_prep.sh, RUN base_os_prep.sh) could encapsulate the differences more clearly.

Prashanth684 commented 6 months ago

The patch should go. There shouldn't be any patches to TuneD in this repo. This is a "heritage" before we started using TuneD Fast Datapath (FDP) RPM package delivery. In my view, the Dockerfile gives a chance to developers to build the project completely together with custom TuneD changes and build all the TuneD rpm packages. RHEL/CentOS stream does not ship all of them! TuneD FDP releases, on the other hand, do. As you probably cannot use TuneD FDP packages in CentOS stream, you have to build them. The botom line, feel free to remove that patch in this PR. If not, we will remove it in the future.

Based on this comment @jmencak , it looks like the patch can be removed in Dockerfile, but the FROM statement still has to stay to build the tuned rpms for centos? (because FDP is not available for centos)

jmencak commented 6 months ago

FROM statements:

* https://github.com/openshift/cluster-node-tuning-operator/blob/5a9355ec2c462d83812184c83deb209aa5762801/Dockerfile.rhel9 has 2

* https://github.com/openshift/cluster-node-tuning-operator/blob/5a9355ec2c462d83812184c83deb209aa5762801/Dockerfile has 3

So the number of stages (FROM) causes the problem for the automation, not (RUN) layers. Thank you for confirming this.

fwiw, the number of stages in a Dockerfile doesn't impact the final size of the image -- only the content of the final stage matters.

Yes, if you build the packages you need during a build stage, you'll have a smaller final image. That's why we're doing the 3 stage build. Every extra RUN (layer) is usually bad news for the resulting image size. And there are a few of them in this PR. This PR also doesn't clean the dependencies needed for the build, so you'll end up with a huge CS9 image.

This is a quick experiment I did:

Size of NTO images prior this PR:

473MB: cs9
639MB: rhel9

Size of NTO images with this PR:

1.36GB: cs9-new 
659MB: rhel9-new

So a huge increase in size for the CS9 NTO image, not so big for RHEL9, but there is still some.

I believe we need to limit the number of RUN statements in the dockerfile. podman history and podman inspect should help. Also, at least removing the build deps for CS9 build could help reduce the resulting images size. I'm actually out sick (yesterday/today), so please bear with me if I don't respond quickly enough.

jmencak commented 6 months ago

The patch should go. There shouldn't be any patches to TuneD in this repo. This is a "heritage" before we started using TuneD Fast Datapath (FDP) RPM package delivery. In my view, the Dockerfile gives a chance to developers to build the project completely together with custom TuneD changes and build all the TuneD rpm packages. RHEL/CentOS stream does not ship all of them! TuneD FDP releases, on the other hand, do. As you probably cannot use TuneD FDP packages in CentOS stream, you have to build them. The botom line, feel free to remove that patch in this PR. If not, we will remove it in the future.

Based on this comment @jmencak , it looks like the patch can be removed in Dockerfile, but the FROM statement still has to stay to build the tuned rpms for centos? (because FDP is not available for centos)

Yes. Actually, I wonder if the way forward is to make the Dockerfile.rhel9 a 3 stage one. That would help reducing the size of the NTO OKD image and might even make the Containerfile more readable?

jupierce commented 6 months ago

@jmencak - https://github.com/openshift/cluster-node-tuning-operator/pull/1058 gets the OKD image size down to 495MB. dnf build-dep tuned.spec -y installs whatever it wants, so dnf --setopt=protected_packages= history -y undo 0 is used to reverse it once the RPM build is complete. This alternative does away with most of them, but regarding 'RUN', CI (and therefore OKD builds) will always squash.

openshift / cluster-node-tuning-operator

WIP: trying to consolidate okd and ocp builds into a single dockerfile #1051