kubernetes-client / java

Official Java client library for kubernetes
http://kubernetes.io/
Apache License 2.0

Kubectl.apply fails while GenericKubernetesApi works for ClusterRole #3518

Open agustinventura opened 3 days ago

agustinventura commented 3 days ago

Describe the bug
When creating a ClusterRole, Kubectl.apply returns a 404 error, while GenericKubernetesApi creates it successfully.

Client Version
18.0.0 - 21.0.0-legacy

Kubernetes Version
1.27 (OpenShift 4.14.16)

Java Version
Java 21

To Reproduce
Given a ClusterRole defined in a clusterrole.yaml file:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: test-controller-admin
rules:
- apiGroups:
  - '*'
  resources:
  - '*'
  verbs:
  - '*'
- nonResourceURLs:
  - '*'
  verbs:
  - '*'

And a kubeconfig for an OpenShift 4.14.16 cluster in a file named kubeconfig, load both files and try to apply the ClusterRole:

// Reads the kubeconfig and the manifest from the classpath, then applies the manifest.
public void createObjectsInCluster() throws URISyntaxException, IOException, KubectlException {
  String kubeconfig = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("kubeconfig").toURI())));
  String ensManifest = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("clusterrole.yaml").toURI())));
  createObjects(kubeconfig, ensManifest);
}

public void createObjects(String kubeconfig, String manifest) throws IOException, KubectlException {
  KubeConfig kc = KubeConfig.loadKubeConfig(new StringReader(kubeconfig));
  ApiClient apiClient = Config.fromConfig(kc);
  apiClient.setVerifyingSsl(false); // skips TLS certificate verification

  // Apply every document found in the manifest.
  List<Object> objects = Yaml.loadAll(manifest);
  for (Object object : objects) {
    final KubernetesObject kubernetesObject = (KubernetesObject) object;
    @SuppressWarnings("unchecked")
    Class<KubernetesObject> objectClass = (Class<KubernetesObject>) kubernetesObject.getClass();
    Kubectl.apply(objectClass).apiClient(apiClient).resource(kubernetesObject).execute();
  }
}

This will return the following error:

io.kubernetes.client.extended.kubectl.exception.KubectlException: io.kubernetes.client.openapi.ApiException: class V1Status {
    apiVersion: v1
    code: 404
    details: class V1StatusDetails {
        causes: null
        group: authorization.openshift.io
        kind: clusterroles
        name: test-controller-admin
        retryAfterSeconds: null
        uid: null
    }
    kind: Status
    message: clusterroles.authorization.openshift.io "test-controller-admin" not found
    metadata: class V1ListMeta {
        _continue: null
        remainingItemCount: null
        resourceVersion: null
        selfLink: null
    }
    reason: NotFound
    status: Failure
}

However, if ClusterRole is created using GenericKubernetesApi it succeeds:

// Same loader as above; only createObjects differs.
public void createObjectsInCluster() throws URISyntaxException, IOException, KubectlException {
  String kubeconfig = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("kubeconfig").toURI())));
  String ensManifest = new String(Files.readAllBytes(Paths.get(ClassLoader.getSystemResource("clusterrole.yaml").toURI())));
  createObjects(kubeconfig, ensManifest);
}

public void createObjects(String kubeconfig, String manifest) throws IOException {
  KubeConfig kc = KubeConfig.loadKubeConfig(new StringReader(kubeconfig));
  ApiClient apiClient = Config.fromConfig(kc);
  apiClient.setVerifyingSsl(false); // skips TLS certificate verification

  List<Object> objects = Yaml.loadAll(manifest);
  for (Object object : objects) {
    final V1ClusterRole clusterRole = (V1ClusterRole) object;
    // The group, version and plural resource name are passed explicitly here.
    GenericKubernetesApi<V1ClusterRole, V1ClusterRoleList> clusterRoleClient =
        new GenericKubernetesApi<>(V1ClusterRole.class, V1ClusterRoleList.class, "rbac.authorization.k8s.io", "v1", "clusterroles",
            apiClient);
    KubernetesApiResponse<V1ClusterRole> createClusterRoleResponse = clusterRoleClient.create(clusterRole);
    if (createClusterRoleResponse.getStatus() != null && !createClusterRoleResponse.isSuccess()) {
      // HTTP 409 (Conflict) means the object already exists; anything else is a real error.
      if (createClusterRoleResponse.getHttpStatusCode() != 409) {
        log.error("Error creating k8s object {}: {}", clusterRole, createClusterRoleResponse.getStatus());
      } else {
        log.info("k8s object {} already exists", clusterRole);
      }
    }
  }
}

Applying clusterrole.yaml with the kubectl CLI succeeds too:

 kubectl --kubeconfig kubeconfig apply -f clusterrole.yaml

Expected behavior
Kubectl.apply should create the ClusterRole.

Additional context
This looks like some weird interaction with OpenShift: we found it when upgrading from 4.12.25 to 4.14.16, and we don't have this kind of issue on other platforms such as EKS. The error reports the group authorization.openshift.io instead of rbac.authorization.k8s.io, which points to some OpenShift-specific logic. It is strange, however, that the kubectl CLI and GenericKubernetesApi both work and only Kubectl.apply from the Java client fails. We also tried registering the group from the error message with ModelMapper:

ModelMapper.addModelMap("authorization.openshift.io", "v1", "ClusterRole", "clusterroles", false, V1ClusterRole.class);

But it returns the same error. It also fails for RoleBinding objects, but not for Role or ClusterRoleBinding ones.

brendandburns commented 3 days ago

You're getting a 404 on the apply. Does this object exist currently in the cluster? Or are you trying to create it for the first time?

My first guess is that Kubectl.apply doesn't handle object creation correctly; there's probably special-purpose code in the kubectl CLI to handle that case.

agustinventura commented 3 days ago

Hi Brendan, this is the first time I'm creating it.

brendandburns commented 3 days ago

Also, I see clusterroles.authorization.openshift.io in the 404 error message, but from your YAML it looks like you are trying to create an rbac.authorization.k8s.io/v1 ClusterRole. Is it possible there is a typo in your YAML somewhere?

brendandburns commented 3 days ago

"But returns the same error. It fails too for RoleBinding objects but not for Role or ClusterRoleBinding ones."

When you say that it works for Role, does it successfully create new Role resources that didn't previously exist?

brendandburns commented 3 days ago

The apply code just calls server-side apply, so we may need to special-case a 404 response there.

brendandburns commented 3 days ago

(sorry for lots of little questions :)

Can you try kubectl apply --server-side ... and see if it works?

agustinventura commented 3 days ago

No problem, Brendan. I've already tried a lot of variations. There's no typo in the YAML: I've tried creating a bunch of objects together in one YAML file, independent objects in separate YAML files, and even building the objects in Java instead of reading them from YAML. The result has always been the 404 error, with rbac.authorization.k8s.io replaced by clusterroles.authorization.openshift.io. I can create previously nonexistent Role and ClusterRoleBinding resources. I always test against a newly provisioned cluster, since that is our current use case. I noticed last week that Apply uses --server-side, so I tested that flag and it works: it creates the ClusterRole. I didn't mention it because I didn't think it was relevant; sorry for any inconvenience.

brendandburns commented 1 day ago

OK, thanks for the details. This is odd, but I think I may have figured it out. When we get the generic API in KubectlApply, we supply only the type, not the list type, and then we guess at the list type via the class loader:

https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/kubectl/Kubectl.java#L247

My guess is that the class for the OpenShift ClusterRole is being discovered before the class for the standard ClusterRole, and that is confusing things.

If you are willing, it would be super useful if you could recompile the library with some logging around that code location to verify that's what's going on.

If not, I can try to add some defensive code around that location that would fix the problem.

agustinventura commented 1 day ago

Hi Brendan, I'm happy to help any way I can. I added the logging, but found that execution does not go through the getGenericApi method at line 246. It goes through the one at line 269, with the following parameters:

apiTypeClass = class io.kubernetes.client.openapi.models.V1ClusterRole
apiListTypeClass = interface io.kubernetes.client.common.KubernetesListObject

The resolved groupVersionResource seems to be the problem, containing:

resource = clusterroles
group = authorization.openshift.io
version = v1

As far as I can see, the kvMap of ModelMapper's classesByGVR has two entries for ClusterRole, one keyed by rbac.authorization.k8s.io and one by authorization.openshift.io. When vkMap gets built, only the authorization.openshift.io entry is left. The same happens with RoleBinding. It also happens with ClusterRoleBinding and Role, but there we are lucky enough that rbac.authorization.k8s.io is the one that ends up in vkMap.
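
The collision described above can be reproduced in isolation. What follows is a minimal sketch, not ModelMapper's actual implementation: it assumes a forward map keyed by a "group/version/resource" string and an inverse map keyed by class, and it uses a hypothetical FakeClusterRole class in place of V1ClusterRole. The sketch uses LinkedHashMap, so the last registered key always wins. With an unordered map the surviving key would depend on iteration order, which would be consistent with Role and ClusterRoleBinding happening to resolve to the right group.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InverseMapCollision {

    // Hypothetical stand-in for V1ClusterRole; not a class from the client library.
    static class FakeClusterRole {}

    // Builds the value-to-key view of a forward map.
    // When two keys share a value, the later entry overwrites the earlier one.
    static <K, V> Map<V, K> invert(Map<K, V> forward) {
        Map<V, K> inverse = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : forward.entrySet()) {
            inverse.put(entry.getValue(), entry.getKey());
        }
        return inverse;
    }

    public static void main(String[] args) {
        // Forward map: "group/version/resource" -> model class (assumed layout).
        Map<String, Class<?>> byGvr = new LinkedHashMap<>();
        byGvr.put("rbac.authorization.k8s.io/v1/clusterroles", FakeClusterRole.class);
        byGvr.put("authorization.openshift.io/v1/clusterroles", FakeClusterRole.class);

        Map<Class<?>, String> byClass = invert(byGvr);

        System.out.println(byGvr.size());                       // prints 2
        System.out.println(byClass.size());                     // prints 1
        System.out.println(byClass.get(FakeClusterRole.class)); // prints authorization.openshift.io/v1/clusterroles
    }
}
```

If the real maps behave like this, registering one more group for the same class cannot help a class-based lookup, since a class-keyed map can point to only one group at a time.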

Any thoughts on how I can address this? Thanks a lot for your help.