redhat-cop / resource-locker-operator

Apache License 2.0
30 stars 14 forks source link

Resources patched even after deletion #23

Closed BostjanBozic closed 4 years ago

BostjanBozic commented 4 years ago

Hello,

what I saw is that resources, that were set to be patched, are patched even after related ResourceLocker object is deleted.

Example:

Below is operator's log after ResourceLocker has been deleted and node label adjusted:

{"level":"info","ts":1595334244.161712,"logger":"resourcelocker-controller","msg":"Reconciling ResourceLocker","Request.Namespace":"test","Request.Name":"test-patch"}
{"level":"info","ts":1595334396.5861855,"logger":"resourcelocker-controller","msg":"Reconciling ResourceLocker","Request.Namespace":"test","Request.Name":"test-patch"}
{"level":"info","ts":1595334396.6151645,"logger":"resourcelocker-controller","msg":"Reconciling ResourceLocker","Request.Namespace":"test","Request.Name":"test-patch"}
{"level":"info","ts":1595334414.0586982,"logger":"resourcelocker-controller","msg":"Reconciling ResourceLocker","Request.Namespace":"test","Request.Name":"test-patch"}
{"level":"info","ts":1595334414.0847456,"logger":"resourcelocker-controller","msg":"Reconciling ResourceLocker","Request.Namespace":"test","Request.Name":"test-patch"}

Only way that I was able to clear the "cache" was to delete operator pod.

Additional information:

raffaelespazzoli commented 4 years ago

yes, there is no way to undo a patch at the moment. This is explained in the docs. Conceptually I am not sure it will ever be possible to undo a patch...what do you replace the patch with ...

BostjanBozic commented 4 years ago

Would it be possible just to keep the object it was before deleting ResourceLocker? As mentioned, issue is that operator keeps patching the target resource, eventhough ResourceLocker is deleted, at least until operator pod is restarted.

raffaelespazzoli commented 4 years ago

Is see now, I misunderstood your issue. I believe what you see if a manifestation of a regression that was introduced in the latest release and it is fixed now in master. Would you be able to test building the operator from master?

BostjanBozic commented 4 years ago

I've build an image from master and tested it. I have deployed it using manifests in deploy/ and it is running without issues.

But for some reason it is not performing any patching. It responds that it started patching controller, but nothing is done there. Maybe I am missing something, but this is the example of label patching that I am trying to implement (this works on operator v0.1.2):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: resource-locker-nodes
  namespace: resource-locker-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: resource-locker-nodes
rules:
- apiGroups:
  - "*"
  resources:
  - nodes
  verbs:
  - list
  - get
  - watch
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: resource-locker-nodes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: resource-locker-nodes
subjects:
- kind: ServiceAccount
  name: resource-locker-nodes
  namespace: resource-locker-operator
---
apiVersion: redhatcop.redhat.io/v1alpha1
kind: ResourceLocker
metadata:
  name: test-patch
  namespace: resource-locker-operator
spec:
  serviceAccountRef:
    name: resource-locker-nodes
  patches:
  - targetObjectRef:
      apiVersion: v1
      kind: Node
      name: worker-0
    patchTemplate: |
      metadata:
        labels:
          foo: bar
    patchType: application/strategic-merge-patch+json
    id: node-label

From logs, controller seems to be created, but no additional information:

{"level":"info","ts":1595405911.929446,"logger":"resourcelocker-controller","msg":"Reconciling ResourceLocker","Request.Namespace":"resource-locker-operator","Request.Name":"test-patch"}
{"level":"info","ts":1595405912.8287868,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"controller_patchlocker_node-label","source":"kind source: /v1, Kind=Node"}
{"level":"info","ts":1595405912.8288398,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"controller_patchlocker_node-label"}
{"level":"info","ts":1595405912.8288434,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"controller_patchlocker_node-label","worker count":1}

In the end, as mentioned, patching does not take place, node itself does not have label foo: bar.

raffaelespazzoli commented 4 years ago

ok, I'd like to test what you are doing. can you share the ResourceLocker resource you use, plus any preparation you need to reproduce the issue?

BostjanBozic commented 4 years ago

Hello,

sure, below are steps to reproduce:

This gets operator deployed, and then for trying to patch node with new label, I use following:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: resource-locker-nodes
  namespace: resource-locker-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: resource-locker-nodes
rules:
- apiGroups:
  - "*"
  resources:
  - nodes
  verbs:
  - list
  - get
  - watch
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: resource-locker-nodes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: resource-locker-nodes
subjects:
- kind: ServiceAccount
  name: resource-locker-nodes
  namespace: resource-locker-operator
---
apiVersion: redhatcop.redhat.io/v1alpha1
kind: ResourceLocker
metadata:
  name: test-patch
  namespace: resource-locker-operator
spec:
  serviceAccountRef:
    name: resource-locker-nodes
  patches:
  - targetObjectRef:
      apiVersion: v1
      kind: Node
      name: worker-0
    patchTemplate: |
      metadata:
        labels:
          foo: bar
    patchType: application/strategic-merge-patch+json
    id: node-label
raffaelespazzoli commented 4 years ago

I'm trying to reproduce the issue, but I don't see the node being patched in the first place... Can you confirm that the patch works for you? Also do you work for red hat? Is there a way to get on a meeting?

On Wed, Jul 22, 2020 at 7:08 AM Bostjan Bozic notifications@github.com wrote:

Hello,

sure, below are steps to reproduce:

  • build new image and push it to repository (make all, make docker-build, make docker-push)
  • adjust deploy/operator.yaml by adjusting image parameter (event issue that is describe in #21 https://github.com/redhat-cop/resource-locker-operator/pull/21 did not occur, even if ClusterRole was not adjusted)
  • `kubectl deploy -f deploy/crds/redhatcop.redhat.io_resourcelockers_crd.yaml
  • `kubectl deploy -f deploy/

This gets operator deployed, and then for trying to patch node with new label, I use following:

apiVersion: v1 kind: ServiceAccount metadata: name: resource-locker-nodes namespace: resource-locker-operator

apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: resource-locker-nodes rules:

  • apiGroups:
    • "*" resources:
    • nodes verbs:
    • list
    • get
    • watch
    • patch

      apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: resource-locker-nodes roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: resource-locker-nodes subjects:

  • kind: ServiceAccount name: resource-locker-nodes namespace: resource-locker-operator

    apiVersion: redhatcop.redhat.io/v1alpha1 kind: ResourceLocker metadata: name: test-patch namespace: resource-locker-operator spec: serviceAccountRef: name: resource-locker-nodes patches:

    • targetObjectRef: apiVersion: v1 kind: Node name: worker-0 patchTemplate: | metadata: labels: foo: bar patchType: application/strategic-merge-patch+json id: node-label

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/redhat-cop/resource-locker-operator/issues/23#issuecomment-662392352, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPERXHESJENY73JTH7APWLR43CDHANCNFSM4PDRPEAA .

-- ciao/bye Raffaele

raffaelespazzoli commented 4 years ago

I verified the issue you are pointing out with a different object (node does not work for me)... I'm going to create a patch for the issue about deleting a resourcelocker object then take a close look at the node use case.

On Wed, Jul 22, 2020 at 7:34 AM raffaele spazzoli < raffaele.spazzoli@gmail.com> wrote:

I'm trying to reproduce the issue, but I don't see the node being patched in the first place... Can you confirm that the patch works for you? Also do you work for red hat? Is there a way to get on a meeting?

On Wed, Jul 22, 2020 at 7:08 AM Bostjan Bozic notifications@github.com wrote:

Hello,

sure, below are steps to reproduce:

  • build new image and push it to repository (make all, make docker-build, make docker-push)
  • adjust deploy/operator.yaml by adjusting image parameter (event issue that is describe in #21 https://github.com/redhat-cop/resource-locker-operator/pull/21 did not occur, even if ClusterRole was not adjusted)
  • `kubectl deploy -f deploy/crds/redhatcop.redhat.io_resourcelockers_crd.yaml
  • `kubectl deploy -f deploy/

This gets operator deployed, and then for trying to patch node with new label, I use following:

apiVersion: v1 kind: ServiceAccount metadata: name: resource-locker-nodes namespace: resource-locker-operator

apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: resource-locker-nodes rules:

  • apiGroups:
    • "*" resources:
    • nodes verbs:
    • list
    • get
    • watch
    • patch

      apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: resource-locker-nodes roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: resource-locker-nodes subjects:

  • kind: ServiceAccount name: resource-locker-nodes namespace: resource-locker-operator

    apiVersion: redhatcop.redhat.io/v1alpha1 kind: ResourceLocker metadata: name: test-patch namespace: resource-locker-operator spec: serviceAccountRef: name: resource-locker-nodes patches:

    • targetObjectRef: apiVersion: v1 kind: Node name: worker-0 patchTemplate: | metadata: labels: foo: bar patchType: application/strategic-merge-patch+json id: node-label

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/redhat-cop/resource-locker-operator/issues/23#issuecomment-662392352, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPERXHESJENY73JTH7APWLR43CDHANCNFSM4PDRPEAA .

-- ciao/bye Raffaele

-- ciao/bye Raffaele

BostjanBozic commented 4 years ago

This is the case - patching does not work even in the first place :) I must admit I did not try it with any other object, as I presumed issue would persist with any object.

No, I am not with RedHat, but we can set up a meeting, just let me know via what and when.

raffaelespazzoli commented 4 years ago

I fixed the issue about the resource being patched after the resource locker is deleted: https://github.com/redhat-cop/resource-locker-operator/pull/24

will take a look at the node issue.

On Wed, Jul 22, 2020 at 7:46 AM Bostjan Bozic notifications@github.com wrote:

This is the case - patching does not work even in the first place :) I must admit I did not try it with any other object, as I presumed issue would persist with any object.

No, I am not with RedHat, but we can set up a meeting, just let me know via what and when.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/redhat-cop/resource-locker-operator/issues/23#issuecomment-662407340, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPERXA7YWPEXCBGQYH3CQLR43GSZANCNFSM4PDRPEAA .

-- ciao/bye Raffaele

BostjanBozic commented 4 years ago

I can confirm fix works (I tested it with serviceAccount). Thank you very much!

Is there any plan to release new version of operator in community operators as well? That way this fix can be accessible straight via OpenShift Marketplace.

raffaelespazzoli commented 4 years ago

I'm working on fixing the issue with the node. this is an edge case, for now the workaround is to put something else in the array of patches with a resource that is not cluster-scoped.

On Wed, Jul 22, 2020 at 8:03 AM raffaele spazzoli < raffaele.spazzoli@gmail.com> wrote:

I fixed the issue about the resource being patched after the resource locker is deleted: https://github.com/redhat-cop/resource-locker-operator/pull/24

will take a look at the node issue.

On Wed, Jul 22, 2020 at 7:46 AM Bostjan Bozic notifications@github.com wrote:

This is the case - patching does not work even in the first place :) I must admit I did not try it with any other object, as I presumed issue would persist with any object.

No, I am not with RedHat, but we can set up a meeting, just let me know via what and when.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/redhat-cop/resource-locker-operator/issues/23#issuecomment-662407340, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPERXA7YWPEXCBGQYH3CQLR43GSZANCNFSM4PDRPEAA .

-- ciao/bye Raffaele

-- ciao/bye Raffaele

raffaelespazzoli commented 4 years ago

we are working to make a new release of this operator in operator hub, no ETA at this moment.

On Wed, Jul 22, 2020 at 8:20 AM Bostjan Bozic notifications@github.com wrote:

I can confirm fix works (I tested it with serviceAccount). Thank you very much!

Is there any plan to release new version of operator in community operators as well? That way this fix can be accessible straight via OpenShift Marketplace.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/redhat-cop/resource-locker-operator/issues/23#issuecomment-662421044, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABPERXCLX64D5WS7MNTBS7TR43KRTANCNFSM4PDRPEAA .

-- ciao/bye Raffaele

BostjanBozic commented 4 years ago

Sounds great, thank you for information.