Closed · jeffvance closed this issue 5 years ago
It eventually recovers and the k8s resources are created. There seems to be a lot of overhead creating and then deleting AWS and k8s resources in this code path...
That's true. If a reconcile iteration can't succeed, it cleans up any resources that were created, and I still think that's a necessary step. There are also smaller internal loops that retry k8s resource operations a few times in quick succession, in case the error turns out not to be fatal.
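For reference, a minimal sketch of what such a quick-retry helper could look like, assuming a controller-runtime client; the names and timing are illustrative, not the provisioner's actual code:

```go
package provisioner

import (
	"context"
	"fmt"
	"time"

	"sigs.k8s.io/controller-runtime/pkg/client"
)

// createWithRetry attempts a Create a few times in quick succession, giving
// transient API-server errors a chance to clear before the whole reconcile
// iteration fails and cleanup is triggered. A real implementation would also
// inspect the error to bail out early on clearly fatal conditions.
func createWithRetry(ctx context.Context, c client.Client, obj client.Object, attempts int) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = c.Create(ctx, obj); err == nil {
			return nil
		}
		time.Sleep(100 * time.Millisecond) // short pause between quick retries
	}
	return fmt.Errorf("create failed after %d attempts: %w", attempts, err)
}
```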
What you're seeing is Reconcile() being retried by the controller with exponential backoff. Conceptually, do we want to do as k8s does and continuously retry until the actual state of the world == the desired state of the world? If not, then we need a way to differentiate errors returned by provisioners that doesn't rely on parsing error strings. Maybe typed errors?
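A hedged sketch of what typed errors could look like, so the controller can branch on error type with errors.Is/errors.As instead of matching message text; the names here (ErrTransient, BucketExistsError) are hypothetical, not an existing API:

```go
package provisioner

import (
	"errors"
	"fmt"
)

// ErrTransient marks failures that are worth retrying with backoff.
var ErrTransient = errors.New("transient provisioning error")

// BucketExistsError is a typed, terminal failure that should not be retried.
type BucketExistsError struct {
	Name string
}

func (e *BucketExistsError) Error() string {
	return fmt.Sprintf("bucket %q already exists", e.Name)
}

// Callers can then branch on error type instead of message text:
//
//	if errors.Is(err, provisioner.ErrTransient) {
//	    return reconcile.Result{}, err // controller requeues with backoff
//	}
//	var bee *provisioner.BucketExistsError
//	if errors.As(err, &bee) {
//	    return reconcile.Result{}, nil // terminal; do not requeue
//	}
```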
@screeley44 @jeffvance Has the bug resurfaced or can we close this?
Have not seen this for a while...
E0416 14:30:26.636238 27193 resourcehandlers.go:285] "msg"="possibly intermittent, retrying" "error"="Operation cannot be fulfilled on objectbucketclaims.objectbucket.io \"screeley-provb-3\": the object has been modified; please apply your changes to the latest version and try again" "request"="s3-provisioner/screeley-provb-3"
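That message is the API server's optimistic-concurrency conflict. A common way to absorb it is client-go's retry.RetryOnConflict, re-reading the latest object before each update attempt. The sketch below assumes a controller-runtime client and the objectbucket.io/v1alpha1 GVK, and is not necessarily how this provisioner handles it:

```go
package provisioner

import (
	"context"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/util/retry"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

var obcGVK = schema.GroupVersionKind{
	Group:   "objectbucket.io",
	Version: "v1alpha1", // assumed version for the ObjectBucketClaim CRD
	Kind:    "ObjectBucketClaim",
}

// setClaimPhase re-fetches the claim on every attempt so the update is applied
// against the latest resourceVersion, which is exactly what the
// "object has been modified" error is asking for.
func setClaimPhase(ctx context.Context, c client.Client, key types.NamespacedName, phase string) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		obc := &unstructured.Unstructured{}
		obc.SetGroupVersionKind(obcGVK)
		if err := c.Get(ctx, key, obc); err != nil {
			return err
		}
		if err := unstructured.SetNestedField(obc.Object, phase, "status", "phase"); err != nil {
			return err
		}
		return c.Update(ctx, obc)
	})
}
```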
Full log: