Closed by lindhe 1 week ago
Probably related to https://github.com/rancher/fleet/issues/2551
I'm seeing the same behavior with Rancher 2.8.5 / Fleet 0.9.8.
I've reproduced the problem with Rancher 2.9.1 / Fleet 0.10.1 as well.
The fleet-agent logs on the cluster show the same messages repeated over and over again, e.g.:
{"level":"info","ts":"2024-09-04T22:15:40Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"c6f34409-8c39-4bfa-bb86-ff79d3028f46"}
{"level":"info","ts":"2024-09-04T22:15:41Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1033","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:42Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:42Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"ae91a7c9-9d4e-4ed4-988a-eeb61c5d015d"}
{"level":"info","ts":"2024-09-04T22:15:43Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","reconcileID":"c1678532-73f5-4929-9985-e50db5603133","deploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","release":"rook-ceph/rook-ceph-cluster:2","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055"}
{"level":"info","ts":"2024-09-04T22:15:43Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1034","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:44Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:44Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"1407fd06-6c0a-4e3a-98a8-baf4bff370da"}
{"level":"info","ts":"2024-09-04T22:15:45Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-cluster","reconcileID":"494aa1f3-8e09-4790-ac4a-7acc6bc34b7f","deploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055","release":"rook-ceph/rook-ceph-cluster:2","appliedDeploymentID":"s-1c47fdd307de7cd53771b1bcf05e2d5bf014de495953153ac73876102439a:096621fc89c2b75c31148d0b150ad30027e7035383a85288a66c67d0980fb055"}
{"level":"info","ts":"2024-09-04T22:15:45Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1035","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:46Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:46Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"3f0b30ce-a868-48e0-95aa-1ce063f78d48"}
{"level":"info","ts":"2024-09-04T22:15:47Z","logger":"bundledeployment.DeployBundle","msg":"Deployed bundle","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b","deploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d","release":"rook-ceph/rook-ceph-conf:1036","appliedDeploymentID":"s-f4e58f4e8d63737718ef1c935b3bcd8054daea4ce155c1be7f814b4738481:704080e856144689404f7488b30ba700c635fea680a58494dd909c76b226262d"}
{"level":"info","ts":"2024-09-04T22:15:48Z","logger":"bundledeployment.UpdateStatus","msg":"Status not ready","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b","error":"cephnfs.ceph.rook.io rook-ceph/auxtel modified {\"spec\":{\"server\":{\"resources\":{\"limits\":{\"cpu\":\"3\"}}}}}"}
{"level":"info","ts":"2024-09-04T22:15:48Z","logger":"bundledeployment.RemoveExternalChanges","msg":"Drift correction: rollback","controller":"bundledeployment","controllerGroup":"fleet.cattle.io","controllerKind":"BundleDeployment","BundleDeployment":{"name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb"},"namespace":"cluster-it-5628-ruka-plus-plus-ruka-67e02fc747cb","name":"ruka-fleet-s-dev-c-ruka-rook-ceph-conf","reconcileID":"9cd5479d-4754-4c7b-909f-3acde128815b"}
I have tried, and failed, to reproduce this against the current main by deleting a label on a config map. This needs further investigation.
Could you share an example of a workload (GitRepo), or a known manifest or chart, which triggers this failure?
In any case, #2917 should reduce the noise compared to logs shared above.
Cleaning up the backlog.
Current Behavior
If an object is changed, Fleet detects the diff but does nothing to converge to a healthy state.
Expected Behavior
When spec.correctDrift.enabled=true, I expect Fleet to try to apply changes as soon as there is a diff.
Steps To Reproduce
1. Set spec.correctDrift.enabled=true in the bundle.
2. Change a deployed resource with kubectl edit, e.g. delete a label or something.
3. Observe the new state "Modified" for the resource.
Environment
Logs
No response
Anything else?
It looks like https://github.com/rancher/fleet/pull/1594 tried to implement drift correction, but it's clearly not working.