operator-framework / java-operator-sdk

Java SDK for building Kubernetes Operators
https://javaoperatorsdk.io/
Apache License 2.0
800 stars 215 forks source link

operator-framework-junit-5 version 4.4.0 doesn't work with fabric8 version 6.8.0 #1978

Closed bachmanity1 closed 1 year ago

bachmanity1 commented 1 year ago

Bug Report

I'm using josdk version 4.4.0 with fabric8 version 6.8.0 overall everything works as expected except for e2e tests. Running e2e tests gives this error:

image

I think this is related to the fact that josdk v4.4.0 internally depends on fabric8 v6.7.2. I'm not sure why this problem happens because my project has a direct dependency on fabric8 and I thought all transitive dependencies will use version specified by the direct dependency.

Here is relevant part of the dependency tree:

image.

@csviri I think to reproduce this problem you should edit pom file of example operators to use fabric8 v6.8.0 and josdk v.4.4.0.

The problem is resolved when I run mvn install -Dmaven.test.skip=true using branch fabric-6.8 and use version 4.4.1-SNAPSHOT in my project. So, maybe I just should wait for the release of next version.

What did you do?

What did you expect to see?

What did you see instead? Under which circumstances?

Environment

Kubernetes cluster type:

$ Mention java-operator-sdk version from pom.xml file

$ java -version

$ kubectl version

Possible Solution

Additional context

csviri commented 1 year ago

The error happens in the Junit extension, the thing is that it not happens for the internal ITs at least when the client is upgraded: https://github.com/operator-framework/java-operator-sdk/pull/1975

If you could create/opensource a sample to reproduce that would be a big help

bachmanity1 commented 1 year ago

If you could create/opensource a sample to reproduce that would be a big help

I think to reproduce this error you could just use any of the sample operators under sample-operators directory, just make sure that pom file uses josdk version 4.4.0 and fabric8 version 6.8.0 and then try to run the e2e test.

csviri commented 1 year ago

As you can see this build passes, that uses those version: https://github.com/operator-framework/java-operator-sdk/pull/1975/files All the samples have end to end tests using that junit extension, and it does not fail at all.

bachmanity1 commented 1 year ago

As you can see this build passes, that uses those version: https://github.com/operator-framework/java-operator-sdk/pull/1975/files All the samples have end to end tests using that junit extension, and it does not fail at all.

yes, I saw that but I don't think that's what I'm trying to tell you. In the e2e tests mentioned by you, josdk has an internal dependency on f8 version 6.8.0 but to reproduce this bug it should have internal dependency on versioin 6.7.2 and sample operator should have direct dependency on version 6.8.0.

csviri commented 1 year ago

Yes, but that we don't support, so out of the box what is supported is the fabric8 client version comes with the framework. We will release new version with v6.8.0 soonish (will see based also on Quarkus side).

If you provide a minimalist example, I can still take a quick look to help with that..

bachmanity1 commented 1 year ago

@csviri I've added a simple example that reproduces this error. https://github.com/bachmanity1/josdk-test

csviri commented 1 year ago

I added a PR:

changing usage of builder from:


     new NamespaceBuilder()
               .withNewMetadata().withName(namespace).endMetadata()
               .build()

to:


new NamespaceBuilder().withMetadata(
                    new ObjectMetaBuilder()
                            .withName(namespace)
                            .build())
                .build())

That seems to be fixing this problem.

tbh I still not fully understand why this happens, and certainly not happens in the tests.

@manusa @shawkins might be interested.

manusa commented 1 year ago

If I'm understand correctly this might happen when mixing different Fabric8 Kubernetes Client versions (more specifically 6.7.2 and 6.8.0) but it's not reproducible when the Operator is directly using 6.8.0, right?

The problem might be explained through the changes in sundrio and the removal/refactor of the Fluent interface. So probably the nested buildables are incompatible.

If it does happen when the client is only 6.8.0, then we might have a big problem. However, I haven't seen this behavior reproduced elsewhere and it does seem to be a simple and common use case.

csviri commented 1 year ago

If I'm understand correctly this might happen when mixing different Fabric8 Kubernetes Client versions (more specifically 6.7.2 and 6.8.0) but it's not reproducible when the Operator is directly using 6.8.0, right?

no, or at least no in our samples

shawkins commented 1 year ago

If it does happen when the client is only 6.8.0, then we might have a big problem.

I can confirm it does not.

So probably the nested buildables are incompatible.

It's hard wrap my head around what would be happening here - all of the relevant classes would be coming from a single jar kubernetes-model-core, that would have the model objects, the sundrio package, and the generated annotation classes.

manusa commented 1 year ago

It's hard wrap my head around what would be happening here - all of the relevant classes would be coming from a single jar kubernetes-model-core, that would have the model objects, the sundrio package, and the generated annotation classes.

I only trust maven-enforcer-plugin for that :)

Although it's not exactly the same, I do see similar problems when mixing buildables created from different sundrio versions in this PR: https://github.com/dekorateio/dekorate/pull/1233 (check description bullet point 1).

shawkins commented 1 year ago

@csviri are you sure about committing this fix? It seems like if this were happening for the namespace builder it could happen all of the place with mixing versions.

csviri commented 1 year ago

@shawkins the thing is that on all other places I use this style of metadata creation, so I think it is not a sin to merge this :)

csviri commented 1 year ago

But as a I mentioned above, we don't support this scenario explicitly.

tomdw commented 1 year ago

We will release new version with v6.8.0 soonish (will see based also on Quarkus side).

@csviri when is a version with this upgrade planned? will also the kubernetes-webhooks-framework-core be upgraded then?

csviri commented 1 year ago

Hi @tomdw plan to do both next week.

csviri commented 1 year ago

@tomdw regarding the JOSDK and QOSDK there are issues with fabric8 v6.8.0 this might be delayed, on webhook side, I will take a look soon.

tomdw commented 1 year ago

@csviri thanks for the update, hope it gets fixed soon

csviri commented 1 year ago

our best men are working on it :) currently looking into admission controller, but might stay on f8 v6.7.* until that time.

csviri commented 1 year ago

@tomdw FYI: https://github.com/operator-framework/josdk-webhooks/releases/tag/v1.1.2

tomdw commented 1 year ago

@csviri thank for the web hook update, hopefully the operator sdk upgrade itself gets resolved too

tomdw commented 1 year ago

@csviri this issue is closed but the pom still shows that version 6.8.0 is not yet used in the latest releases, see https://github.com/operator-framework/java-operator-sdk/blob/a6a57afd0079f4b64fed42292ef1298e3d7d5187/pom.xml#L47

should it be reopened?

csviri commented 1 year ago

Hi, will, be in next release. V4.5

pmig commented 1 year ago

@csviri it seems that the fabric8 version upgrade to 6.8 didn't make it into v4.5 any plans on inlcuding it in a bugfix version as 6.9.1 is already released?

csviri commented 1 year ago

Hi yes, it should come soon.