crossplane / upjet

A code generation framework and runtime for Crossplane providers
Apache License 2.0
271 stars 78 forks source link

Allow users to configure timeouts on a per-resource basis #421

Open mbbush opened 4 days ago

mbbush commented 4 days ago

What problem are you facing?

In https://github.com/crossplane-contrib/provider-upjet-aws/issues/1346, a crossplane user reported timeout errors restoring a large RDS database from a snapshot, which takes more than an hour. In this particular case (restoring from a db snapshot) creating the resource requires much longer than simply creating a new, empty, resource, so the default timeout from the terraform provider (40 minutes) is insufficient.

How could Upjet help solve your problem?

While upjet already allows provider authors to set timeouts in the resource config, and defaults to the terraform provider's timeouts if those are unset, it would be helpful to allow a third layer, which could be set by the user on a per-resource basis, to indicate "This resource is different than a typical one of its Kind, and needs a different timeout". It would also allow for fast self-service workarounds when users report timeout issues, without requiring them to wait for a code change to the provider.

I think an annotation seems like a natural place to specify this, as it seems like the sort of thing that would almost always be omitted, but useful to be able to specify when needed.

As an initial proposal, I'd suggest

upjet.crossplane.io/create-timeout
upjet.crossplane.io/read-timeout
upjet.crossplane.io/update-timeout
upjet.crossplane.io/delete-timeout

as annotation keys, with any string that can be parsed to a golang duration as the value.

@blakeromano suggested at the SIG-Upjet meeting that he might prefer a spec field similar to managementPolicies. I'll let him speak for himself here if he wants to describe that more fully.

blakeromano commented 4 days ago

My thought in terms of implementation was to have something like:

---
apiVersion: whatever/v1beta1
...
spec:
  timeouts:
    create: ""
    read: ""
    update: ""
    delete: ""

I think either annotations or defining it in the spec is fine I just think it depends on how long term we see the need for this ability in upjet. I think if we are long term wanting this functionality then we should plan for it in the spec.