openshift-metal3 / dev-scripts

Scripts to automate development/test setup for openshift integration with https://github.com/metal3-io/
Apache License 2.0
93 stars 185 forks source link

CORS-2969: Don't default to OpenShiftSDN after deprecation #1622

Closed zaneb closed 7 months ago

zaneb commented 8 months ago

OpenShiftSDN is being deprecated in 4.15, so switch the default to OVNKubernetes in the IPv4-only case (OVNk is already required for IPv6/dual-stack) to avoid installer failures.

This follows up on 8537fdd1e01fa07caa7f09ed03c6beafdacb9378, which sets OpenShiftSDN in the manifests (but not the install-config) iff it is explicitly requested.

zaneb commented 8 months ago

/cc @danwinship @bfournie

danwinship commented 8 months ago

"Fixes #1613"

But https://github.com/openshift/release/pull/46286 and https://github.com/openshift/release/pull/46338 should probably merge before this, to ensure that no CI jobs unexpectedly switch network types when the default changes.

zaneb commented 8 months ago

@danwinship Fixed the control flow, hopefully this actually works for the agent as well.

But https://github.com/openshift/release/pull/46286 and https://github.com/openshift/release/pull/46338 should probably merge before this, to ensure that no CI jobs unexpectedly switch network types when the default changes.

Can't we merge this and abandon all of https://github.com/openshift/release/pull/46563, https://github.com/openshift/release/pull/46286, and https://github.com/openshift/release/pull/46338?

From my reading of the commit messages, there's a bunch of jobs that don't have -sdn- in the title where we don't actually care what network type they have. With this change they'll carry on as they always have on existing release branches, but switch to ovnk on new ones. It seems to me that that's what we want, and it avoids renaming everything in the world (multiple times).

danwinship commented 8 months ago

In some cases it definitely makes sense to convert the SDN job to OVN-K (eg, the "virtualmedia" jobs, https://github.com/openshift/release/pull/46338).

But there are some places where there are two jobs that are completely identical except that one of them is sdn and one is ovn, so the right fix is to drop the sdn job. (I'm not sure if that case applies to any dev-scripts-based jobs though.)

The trickier case is where there are two jobs that are nearly identical, like e2e-metal-ipi-sdn (-ipv4) vs e2e-metal-ipi-ovn-ipv6, where it's not clear if we still care about testing both IPv4 and IPv6 if we're not also testing both network plugins. And that seemed like a decision that was better to leave up to the individual repo owners to figure out.

and it avoids renaming everything in the world (multiple times).

https://github.com/openshift/release/pull/46563 and https://github.com/openshift/release/pull/46286 don't rename any jobs; they just change job configs to make it clearer when a job ends up using sdn.

(https://github.com/openshift/release/pull/46338 does rename jobs, but that's because it's actually flipping jobs from sdn to ovn.)

danwinship commented 8 months ago

And that seemed like a decision that was better to leave up to the individual repo owners to figure out.

(of course, you are "the individual repo owners" for some of these jobs, so if you want to figure that now, go ahead...)

rccrdpccl commented 8 months ago

/test e2e-agent-compact-ipv4

bfournie commented 8 months ago

/approve

openshift-ci[bot] commented 8 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bfournie

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift-metal3/dev-scripts/blob/master/OWNERS)~~ [bfournie] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
bfournie commented 8 months ago

/cc @andfasano

andfasano commented 8 months ago

/hold

Currently there are two separate issues (but somehow related) requiring to be addressed:

  1. As previously mentioned by @bfournie, the openshift_version function is not defined in network.sh and so it's not found
  2. The current dev-scripts CI configuration was still pointing at 4.14, thus making pass the current builds. I'll open a new PR to fix this point
andfasano commented 8 months ago

Requires https://github.com/openshift/release/pull/47487

danwinship commented 8 months ago

But openshift/release#46286 and openshift/release#46338 should probably merge before this, to ensure that no CI jobs unexpectedly switch network types when the default changes.

done

"Fixes #1613"

(as in, "you should add that to the initial comment so it closes that issue when this merges")

andfasano commented 8 months ago

/test e2e-agent-compact-ipv4

andfasano commented 7 months ago

/hold cancel

Now the agent job is failing as expected

zaneb commented 7 months ago

The dependencies here are much nastier than I realised, and I'm almost ready to give up on this change. This whole pile of bash is completely unmaintainable :cry:

andfasano commented 7 months ago

/lgtm