GoogleCloudPlatform / metacontroller

Lightweight Kubernetes controllers as a service
https://metacontroller.app/
Apache License 2.0

Have a separate resync-after-error period flag #195

Open glasser opened 4 years ago

glasser commented 4 years ago

As described in #194, when rolling out a new controller in a legacy cluster, you may temporarily see errors like

failed to sync service-per-pod "apps/v1:StatefulSet:staging:kafka": can't reconcile children for StatefulSet staging/kafka: [services "kafka-1" already exists, services "kafka-2" already exists, services "kafka-3" already exists, services "kafka-0" already exists]

which you can fix by deleting the existing resources (or by doing the hacky reparenting described in #194).

However, if you fix this by deleting the existing resources, you probably want the next resync to happen soon, so that the deleted resources are recreated quickly. But you don't necessarily want to make all resyncs happen that fast.

Thus, it would be helpful if you could configure metacontroller with a shorter resync period after an error reconciling children than after a successful sync.

AmitKumarDas commented 4 years ago

@glasser I have commented in #194. Let's see if that helps and solves this issue as well.