Open weyfonk opened 1 month ago
(copied from #2609)
When failing to correct drift on a resource (e.g. a modified ports array on a service), Fleet would leave a GitRepo in Modified state, with no error on the corresponding bundle deployment status.
GitRepo status
force: true on the GitRepo resolves the error by deleting and recreating the Helm release for the bundle deployment, hence recreating the service in this case.
See reproduction steps above, in the issue description.
GitRepo with drift correction enabled (but not set to force mode) pointing to rancher/fleet-test-data's multiple-paths
GitRepo and bundle deployment
GitRepo drift correction mode to true
GitRepo and bundle deployment status error disappear, once the service had been recreated.
GitRepo to set its correctDrift.force option to true eventually updates the bundle status, in that the bundle will no longer appear as modified.
GitRepo status is reflected in the Rancher UI
N/A
I am still observing this in Rancher v2.9-2d10b66bb2e1e5fc7568591ba41648002cf29b20-head with fleet:104.1.0+up0.10.4-rc.1, following the reproduction steps on the original ticket.
@weyfonk, am I perhaps missing something, or has this fix not yet been propagated into the above-mentioned Fleet version?
Thanks @mmartin24 for raising this.
I am still able to reproduce issues with updating service ports on Rancher v2.9.3-alpha4 with Fleet v0.10.4-rc.1.
After a few seconds, although the bundle deployment containing the service appears as modified, the corresponding bundle sees its status updated to Ready, as if the bundle deployment were ready too. This in turn is reflected in the GitRepo owning that bundle.
This happens because the resources fields are cleared from the bundle deployment's status. Why that happens is still unclear, although this code is a prime suspect.
Confirmed: this line calls a DryRun on a Wrangler apply.Apply, which returns an empty set of objects.
In turn, that set is used to populate resources in the bundle deployment status, which explains why those resources don't appear in the status from that point onwards.
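To make the mechanism described above concrete, here is a minimal, self-contained Go sketch. It does not use Fleet or Wrangler code: `Resource`, `Status`, `dryRunApply`, `overwriteResources` and `keepResourcesIfEmpty` are all hypothetical names invented for illustration. The guard shown is only one conceivable mitigation, not necessarily one of the options considered in this issue.

```go
package main

import "fmt"

// Resource is a hypothetical, simplified stand-in for an entry in a
// bundle deployment's status resources list.
type Resource struct {
	Kind string
	Name string
}

// Status is a hypothetical, simplified bundle deployment status.
type Status struct {
	NonModified bool
	Resources   []Resource
}

// dryRunApply mimics a dry-run apply that reports no objects.
// It is illustrative only and does not call Wrangler.
func dryRunApply() []Resource {
	return nil
}

// overwriteResources reproduces the symptom: whatever the dry run
// returned replaces the previous list, even when it is empty.
func overwriteResources(s Status, planned []Resource) Status {
	s.Resources = planned
	return s
}

// keepResourcesIfEmpty is one possible guard: an empty dry-run result
// leaves the previously recorded resources untouched.
func keepResourcesIfEmpty(s Status, planned []Resource) Status {
	if len(planned) > 0 {
		s.Resources = planned
	}
	return s
}

func main() {
	before := Status{
		NonModified: false,
		Resources:   []Resource{{Kind: "Service", Name: "example-svc"}},
	}

	cleared := overwriteResources(before, dryRunApply())
	kept := keepResourcesIfEmpty(before, dryRunApply())

	fmt.Println("overwrite:", len(cleared.Resources), "resource(s)")
	fmt.Println("guarded:", len(kept.Resources), "resource(s)")
}
```

With the unconditional overwrite, the status ends up with no resources, which matches the symptom reported here. With the guard, the previously recorded service entry survives an empty dry-run result.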
Fixing this would require either:
nonModified field is set to true. I thought that this might be caused by a conflict between the Fleet controller and agent over bundle deployment status updates, but this has been fixed in v0.10 as well.
Tested in v2.9-1d1065cd5bf09c23834720420e1153712fa43439-head with fleet:104.1.1+up0.10.5-rc.2 and still erroring. Same issue as in https://github.com/rancher/fleet/issues/2609#issuecomment-2454409377.
Setting back to backlog
This is a backport of #2609 to v0.10.