operator-framework / java-operator-sdk

Java SDK for building Kubernetes Operators
https://javaoperatorsdk.io/
Apache License 2.0
801 stars 215 forks source link

Orphan secondary resources are left when using annotation based owner reference #2259

Open AleksanderBrzozowski opened 8 months ago

AleksanderBrzozowski commented 8 months ago

Bug Report

What did you do?

We have a primary resource, let's call it MyCustomResource. The reconciler for this resource creates and maintains a ConfigMap based on the MyCustomResource. The ConfigMap is created in a different namespace than MyCustomResource, thus we cannot use the ownerReference mechanism to ensure that the resource is deleted when the primary resource is deleted. To do so, we use JOSDK annotations (primary-name and primary-namespace).

The issue is that sometimes when the primary resource is deleted very quickly - right after it was created, the secondary resource is not created:

2024-03-04 10:12:17,918 ERROR [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-117) Error during event processing ExecutionScope{ resource id: ResourceID{name='test', namespace='default'}, version: 7364248}: java.lang.NullPointerException: Cannot invoke "io.fabric8.kubernetes.api.model.HasMetadata.addFinalizer(String)" because "r" is null
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.lambda$updateCustomResourceWithFinalizer$2(ReconciliationDispatcher.java:331)
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.conflictRetryingUpdate(ReconciliationDispatcher.java:353)
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.updateCustomResourceWithFinalizer(ReconciliationDispatcher.java:330)
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:116)
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:90)
    at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:64)
    at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:450)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)

So it seems that the secondary resource is created, even though the primary resource has no finalizer added.

This issue was reproduced with a sample repo. There is a repeated test that fails after several test loops - it is non deterministic.

What did you expect to see?

There are no Orphan secondary resources left, when the Primary is deleted.

What did you see instead? Under which circumstances?

There is an Orphan secondary resource left, that will be never deleted.

Environment

Kubernetes cluster type:

rancher desktop, version: 1.25

$ Mention java-operator-sdk version from pom.xml file

4.8.0

$ java -version

Java Temurin 17.0.6

$ kubectl version

kubectl version doesn't seem to be related, under the hood JOSDK is using fabric8 kubernetes client with version 6.10.0

Possible Solution

Make sure that the finalizer is applied first before creating secondary resources

csviri commented 8 months ago

Hi @AleksanderBrzozowski , thx for the issue,

Make sure that the finalizer is applied first before creating secondary resources

Actually, this should be the case. Will take a look.

csviri commented 8 months ago

This is the code part:

https://github.com/operator-framework/java-operator-sdk/blob/6d24455a4ee181cb1994cc4a89d4cf08fcdcc34c/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/event/ReconciliationDispatcher.java#L104-L125

AleksanderBrzozowski commented 8 months ago

Yeah, I saw this code part. But it seems that in some cases it doesn't work correctly. Have you tried running the test from my sample repo? πŸ™‚

csviri commented 8 months ago

reproduce it, seems that the cleanup is not called when it fails (at least in my case), while the finalizer is added.

If you log the resource in the reconciler the finalizer is always there.

csviri commented 8 months ago

Notes: it theoretically can happen that the ConfigMap is created but on cleanup is not in the cache yet, so it will not be deleted (this is a limitation and issue with this method - when resources are in different NS), but dependent resource mitigate / eliminate this issue, so in this : https://javaoperatorsdk.io/docs/dependent-resources#caching-and-event-handling-in-kubernetesdependentresource

So event this should not be the case.

AleksanderBrzozowski commented 8 months ago

Not sure that I follow πŸ˜›

So I guess that you were able to reproduce the issue by running a test?

Are you saying that we can somehow fix this issue by modifying the code? The ConfigMapDependentResource already extends KubernetesDependentResource class πŸ™‚

csviri commented 8 months ago

Are you saying that we can somehow fix this issue by modifying the code? The ConfigMapDependentResource already extends KubernetesDependentResource class πŸ™‚

No, not saying that :) still investigating, will probably take a while

AleksanderBrzozowski commented 8 months ago

No need to rush. We don't have this problem in production, we only discovered it when running integration tests, and we were able to solve it πŸ™‚

csviri commented 8 months ago

interesting is that I created a reproducer now in pure Java and not using Quarkus and not able to reproduce there: https://github.com/csviri/race-condition-reproducer

csviri commented 8 months ago

The problem seems to be that there is a Delete event received at some point. Which should not be there, this could be event from the previous test case. This si received while the reconciliation is running. But since the finalizer is not removed by the test reconciler for sure not sure how this can happen. (Maybe from previous test runs? - on the first check this does not seems to be the case)

csviri commented 8 months ago

The problem seems to be with either k8s API or informer:

2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-3) Event received for resource: ResourceID{name='test', namespace='default'} version: 3156 uuid: cd21c403-f3e4-4830-8efe-a597f77155c1 action: DELETED
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-3) Event Old resource: null,
 new resource: CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[mycustomresources.example.com/finalizer], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3156, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null}
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Received event: ResourceEvent{action=DELETED, associated resource id=ResourceID{name='test', namespace='default'}}

As can be seen here we receive a delete event for a resource that still has a finalizer.

cc @shawkins @manusa @metacosm

csviri commented 8 months ago

Full log of a failing test case:

2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-3) Event received for resource: ResourceID{name='test', namespace='default'} version: 3154 uuid: cd21c403-f3e4-4830-8efe-a597f77155c1 action: ADDED
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-3) Event Old resource: null,
 new resource: CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3154, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Received event: ResourceEvent{action=ADDED, associated resource id=ResourceID{name='test', namespace='default'}}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Marking event received for: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (-1660409789-pool-11-thread-3) Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (-1660409789-pool-11-thread-3) Resource found in cache: true for id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Executing events for custom resource. Scope: ExecutionScope{resource: null}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (pool-14-thread-6) Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (pool-14-thread-6) Resource found in cache: true for id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.ReconciliationDispatcher] (ReconcilerExecutor-mycustomresourcereconciler-139) Handling dispatch for resource test
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.ReconciliationDispatcher] (ReconcilerExecutor-mycustomresourcereconciler-139) Adding finalizer for resource: cd21c403-f3e4-4830-8efe-a597f77155c1 version: 3154
2024-03-04 14:32:33,831 DEBUG [io.jav.ope.pro.eve.ReconciliationDispatcher] (ReconcilerExecutor-mycustomresourcereconciler-139) Conflict retying update for: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,832 DEBUG [io.jav.ope.pro.eve.ReconciliationDispatcher] (ReconcilerExecutor-mycustomresourcereconciler-139) Trying to replace resource ResourceID{name='test', namespace='default'}, version: 3154
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-4) Event received for resource: ResourceID{name='test', namespace='default'} version: 3155 uuid: cd21c403-f3e4-4830-8efe-a597f77155c1 action: UPDATED
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-139) Event processing finished. Scope: ExecutionScope{ resource id: ResourceID{name='test', namespace='default'}, version: 3154}, PostExecutionControl: PostExecutionControl{onlyFinalizerHandled=false, updatedCustomResource=CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[mycustomresources.example.com/finalizer], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3155, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null}, runtimeException=null}
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-139) Cleanup for successful execution for resource: test
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (ReconcilerExecutor-mycustomresourcereconciler-139) Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-4) Event Old resource: CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3154, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null},
 new resource: CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[mycustomresources.example.com/finalizer], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3155, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null}
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (ReconcilerExecutor-mycustomresourcereconciler-139) Resource found in cache: true for id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-139) Rescheduling event for max reconciliation interval for resource: ResourceID{name='test', namespace='default'} : with delay: 36000000
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-4) Received event: ResourceEvent{action=UPDATED, associated resource id=ResourceID{name='test', namespace='default'}}
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-4) Marking event received for: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,840 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (-1660409789-pool-11-thread-4) Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (-1660409789-pool-11-thread-4) Resource found in cache: true for id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-4) Executing events for custom resource. Scope: ExecutionScope{resource: null}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (pool-14-thread-7) Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (pool-14-thread-7) Resource found in cache: true for id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.ReconciliationDispatcher] (ReconcilerExecutor-mycustomresourcereconciler-141) Handling dispatch for resource test
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.ReconciliationDispatcher] (ReconcilerExecutor-mycustomresourcereconciler-141) Reconciling resource test with version: 3155 with execution scope: ExecutionScope{ resource id: ResourceID{name='test', namespace='default'}, version: 3155}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.dep.wor.WorkflowReconcileExecutor] (ReconcilerExecutor-mycustomresourcereconciler-141) Submitting for reconcile: DependentResourceNode{ConfigMapDependentResource@6bff2a08} primaryID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.dep.wor.WorkflowReconcileExecutor] (ReconcilerExecutor-mycustomresourcereconciler-141) Submitted to reconcile: DependentResourceNode{ConfigMapDependentResource@6bff2a08} primaryID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.dep.wor.WorkflowReconcileExecutor] (pool-15-thread-8) Reconciling for primary: ResourceID{name='test', namespace='default'} node: DependentResourceNode{ConfigMapDependentResource@6bff2a08} 
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.eve.sou.inf.InformerEventSource] (pool-15-thread-8) Using PrimaryToSecondaryIndex to find secondary resources for primary: ResourceID{name='test', namespace='default'}. Found secondary ids: [] 
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.dep.AbstractDependentResource] (pool-15-thread-8) Creating 'test' ConfigMap for primary ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,841 DEBUG [io.jav.ope.pro.dep.kub.KubernetesDependentResource] (pool-15-thread-8) Creating target resource with type: class io.fabric8.kubernetes.api.model.ConfigMap, with id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-3) Event received for resource: ResourceID{name='test', namespace='default'} version: 3156 uuid: cd21c403-f3e4-4830-8efe-a597f77155c1 action: DELETED
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.sou.con.ControllerResourceEventSource] (-1660409789-pool-11-thread-3) Event Old resource: null,
 new resource: CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[mycustomresources.example.com/finalizer], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3156, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null}
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Received event: ResourceEvent{action=DELETED, associated resource id=ResourceID{name='test', namespace='default'}}
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Marking delete event received for: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,843 DEBUG [io.jav.ope.pro.eve.EventProcessor] (-1660409789-pool-11-thread-3) Cleaning up for delete event for: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,844 DEBUG [io.jav.ope.pro.eve.sou.inf.InformerEventSource] (-1660409789-pool-11-thread-4) On add event received for resource id: ResourceID{name='test', namespace='default'} type: ConfigMap version: 3157
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.eve.sou.inf.InformerEventSource] (-1660409789-pool-11-thread-4) Skipping event propagation for ADD, since was a result of a reconcile action. Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (pool-15-thread-8) Resource not found in temporary cache reading it from informer cache, for Resource ID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.eve.sou.inf.ManagedInformerEventSource] (pool-15-thread-8) Resource found in cache: true for id: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.dep.wor.WorkflowReconcileExecutor] (pool-15-thread-8) Setting already reconciled for: DependentResourceNode{ConfigMapDependentResource@6bff2a08} primaryID: ResourceID{name='test', namespace='default'}
2024-03-04 14:32:33,845 INFO  [MyCustomResourceReconciler] (ReconcilerExecutor-mycustomresourcereconciler-141) Reconciled. CustomResource{kind='MyCustomResource', apiVersion='example.com/v1alpha1', metadata=ObjectMeta(annotations={}, creationTimestamp=2024-03-04T13:32:33Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[mycustomresources.example.com/finalizer], generateName=null, generation=1, labels={}, managedFields=[], name=test, namespace=default, ownerReferences=[], resourceVersion=3155, selfLink=null, uid=cd21c403-f3e4-4830-8efe-a597f77155c1, additionalProperties={}), spec=null, status=null}
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-141) Event processing finished. Scope: ExecutionScope{ resource id: ResourceID{name='test', namespace='default'}, version: 3155}, PostExecutionControl: PostExecutionControl{onlyFinalizerHandled=false, updatedCustomResource=null, runtimeException=null}
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-141) Cleanup for successful execution for resource: test
2024-03-04 14:32:33,845 DEBUG [io.jav.ope.pro.eve.EventProcessor] (ReconcilerExecutor-mycustomresourcereconciler-141) Rescheduling 
csviri commented 8 months ago

Why the test case fails:

  1. we receive an update event after the finalizer added to the resource
  2. the reconciliation process starts
  3. meanwhile a Delete event received (for a resource with a finalizer 😱)
  4. Delete events has special meaning, it cleans up all kind of caches. So the cleaner part not even executed.
shawkins commented 8 months ago

meanwhile a Delete event received (for a resource with a finalizer 😱)

Just to make sure, are there any labels or other filters that the informer is based off of? If that's the case you will see a delete event when any of that no longer matches, not just when the resource no longer exists.

csviri commented 8 months ago

No there are no labels involved. This seems to be a race condition with the API server. If I add in a test waiting while the finalizer is actually added and just then call the delete, it does not fail:

  client.resource(resource).create()

  await().untilAsserted{
    val res = client.resources(MyCustomResource::class.java)
              .inNamespace("default")
              .withName("test")
              .get();
      assertEquals(res.metadata.finalizers.size,1)
  }

  Thread.sleep(5)

  // delete the primary, and wait until it is deleted
  client.resources(MyCustomResource::class.java)
      .inNamespace("default")
      .withName("test")
      .withTimeout(5, TimeUnit.SECONDS)
      .delete()

@AleksanderBrzozowski

csviri commented 8 months ago

Btw AFAIK Thread.sleep(5) maybe yields but does not wait 5ms - in general sleep under 15ms does not wait.

csviri commented 8 months ago

asked also for clarification on k8s slack: https://kubernetes.slack.com/archives/C0EG7JC6T/p1709561696968339