rancher / fleet

Deploy workloads from Git to large fleets of Kubernetes clusters
https://fleet.rancher.io/
Apache License 2.0

Drift correction not working #2436

Open lindhe opened 4 months ago

lindhe commented 4 months ago

Is there an existing issue for this?

Current Behavior

If an object is changed, Fleet detects the diff but does nothing to converge to a healthy state.

Expected Behavior

When spec.correctDrift.enabled=true, I expect Fleet to try to apply the desired state again as soon as there is a diff.

Steps To Reproduce

  1. Have Rancher v2.8.1 installed.
  2. In Rancher, click "Continuous Delivery" and "Git Repos" and select the "fleet-local" workspace.
  3. Add a GitRepo that applies some resource. Make sure to check "Enable Self-Healing" to set spec.correctDrift.enabled=true in the bundle.
  4. Wait for the GitRepo to sync and become healthy, with the new resource created and in state "Ready".
  5. Edit the resource using kubectl edit, e.g. delete a label.
  6. Observe new state "Modified" for the resource:

    Screenshot 2024-05-16 171752
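
For anyone reproducing this without the Rancher UI, here is a minimal sketch that builds an equivalent GitRepo manifest. It assumes the "Enable Self-Healing" checkbox corresponds to spec.correctDrift.enabled on the GitRepo, as step 3 suggests. The resource name, repository URL and path are placeholders, and `gitrepo_manifest` is only an illustrative helper, not part of Fleet.

```python
import json


def gitrepo_manifest(name, repo_url, paths, namespace="fleet-local"):
    """Build a Fleet GitRepo manifest with drift correction switched on.

    The name, repo_url and paths values are supplied by the caller; the
    ones used below are placeholders, not values from this issue.
    """
    return {
        "apiVersion": "fleet.cattle.io/v1alpha1",
        "kind": "GitRepo",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "repo": repo_url,
            "paths": paths,
            # Assumed equivalent of checking "Enable Self-Healing" in the UI.
            "correctDrift": {"enabled": True},
        },
    }


# Placeholder values for illustration only.
manifest = gitrepo_manifest(
    name="drift-test",
    repo_url="https://example.com/your/repo.git",
    paths=["manifests"],
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The printed JSON can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON as well as YAML.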

Environment

- Architecture: amd64
- Fleet Version: The one that's bundled with Rancher v2.8.1. 
- Cluster:
  - Provider: RKE2
  - Options: 3 nodes upstream cluster
  - Kubernetes Version: 1.27.9

Logs

No response

Anything else?

It looks like https://github.com/rancher/fleet/pull/1594 tried to implement drift correction, but it's clearly not working.

manno commented 2 months ago

Probably related to https://github.com/rancher/fleet/issues/2551

jhoblitt commented 3 weeks ago

I'm seeing the same behavior with Rancher 2.8.5 / Fleet 0.9.8.

image

jhoblitt commented 3 weeks ago

I've reproduced the problem with Rancher 2.9.1 / Fleet 0.10.1 as well:

image

The fleet-agent logs on the cluster show the same sequence of messages repeated over and over again, e.g.:

{"level":"info","ts":"2024-09-04T22:15:40Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"c6f34409-8c39-4bfa-bb86-ff79d3028f46"}
{"level":"info","ts":"2024-09-04T22:15:41Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1033","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:42Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:42Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d"}
{"level":"info","ts":"2024-09-04T22:15:43Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","reconcileID":"c1678532-73f5-4929-9985-e50db5603133","deploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","release":"rook-ceph/rook-ceph-cluster:2","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055"}
{"level":"info","ts":"2024-09-04T22:15:43Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1034","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:44Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:44Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da"}
{"level":"info","ts":"2024-09-04T22:15:45Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","reconcileID":"494aa1f3-8e09-4790-ac4a-7acc6bc34b7f","deploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","release":"rook-ceph/rook-ceph-cluster:2","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055"}
{"level":"info","ts":"2024-09-04T22:15:45Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1035","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:46Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:46Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48"}
{"level":"info","ts":"2024-09-04T22:15:47Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1036","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:48Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:48Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b"}
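
In the excerpt above, the rook-ceph-conf bundle cycles through "Drift correction: rollback", "Deployed bundle" and "Status not ready" roughly every two seconds. The reported error stays the same while the release revision climbs from 1033 to 1036. A minimal Python sketch for summarizing such a log stream per bundle follows. The field names (`name`, `msg`, `release`, `error`) are taken from the lines above; the function name `summarize` and the trimmed `SAMPLE` lines are only illustrative.

```python
import json
from collections import Counter, defaultdict


def summarize(lines):
    """Group fleet-agent JSON log lines by BundleDeployment name.

    For each bundle, collect message counts, the Helm release revisions
    seen in "release" fields, and the distinct "error" strings.
    """
    summary = defaultdict(
        lambda: {"messages": Counter(), "revisions": [], "errors": set()}
    )
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        bundle = summary[entry.get("name", "<unknown>")]
        bundle["messages"][entry.get("msg", "")] += 1
        # "release" looks like "<namespace>/<name>:<revision>" in the logs.
        _, _, revision = entry.get("release", "").rpartition(":")
        if revision.isdigit():
            bundle["revisions"].append(int(revision))
        if "error" in entry:
            bundle["errors"].add(entry["error"])
    return dict(summary)


# Four of the log lines above, trimmed to the fields used by summarize().
SAMPLE = [
    r'{"ts":"2024-09-04T22:15:41Z","msg":"Deployed bundle","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","release":"rook-ceph/rook-ceph-conf:1033"}',
    r'{"ts":"2024-09-04T22:15:42Z","msg":"Status not ready","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}',
    r'{"ts":"2024-09-04T22:15:42Z","msg":"Drift correction: rollback","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf"}',
    r'{"ts":"2024-09-04T22:15:43Z","msg":"Deployed bundle","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","release":"rook-ceph/rook-ceph-conf:1034"}',
]

result = summarize(SAMPLE)
```

A steadily increasing revision list together with a single unchanging error would suggest that each rollback is deployed but the same diff is detected again afterwards, which matches what the full excerpt shows for the cephnfs resource.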