RamenDR / ramen

Apache License 2.0

Improve envtest execution time #1370

Closed ShyamsundarR closed 3 months ago

ShyamsundarR commented 3 months ago

Envtests eat up time in their Consistently and Eventually clauses

The timeout for these clauses is set to 10s (in most cases), because we want to ensure that a few reconciles (or at least more than one) complete within a Consistently clause. Similarly, for an Eventually clause we want more than one reconcile to complete, to ensure the desired state is reached.

NOTE: Eventually takes the full time only on failures; otherwise it exits with success as soon as a reconcile has run and the conditions are met. Consistently, on the other hand, always delays successful tests by the full duration.

The timeout is set to accommodate the exponential backoff on retries, so that even if a reconcile is backed off by more than a few seconds there is still enough time for a reconcile to run and for the test to pass as needed.

This commit fixes the delay by ensuring that reconciles do not back off exponentially, for all reconcilers, and that we can set the rate limiter for reconciles to a more standard tick.

As the delay is static (reconcile every 100ms), the timeouts for the Eventually and Consistently clauses can be reduced (in this commit, to 1 second).

This improves the overall run time of envtests from about 210s to about 110s (the numbers will differ based on the machine used to run the tests).

ShyamsundarR commented 3 months ago

Executed about 50 runs locally; the failures seen are the usual ones:

1289

(The failure below has 2 forms, a GET failure or an UPDATE failure, and needs to be retried)

VolSync_Handler Ensure PVC from ReplicationDestination When ReplicationDestination exists with snapshot latestImage When the latest image volume snapshot exists When pvc to be restored has already been created [It] ensure PVC should not fail
.../ramen/controllers/volsync/vshandler_test.go:1329
...
  [FAILED] Expected success, but got an error:
      <*errors.StatusError | 0xc00139e820>: 
      Operation cannot be fulfilled on persistentvolumeclaims "testpvc1": the object has been modified; please apply your changes to the latest version and try again
      {
          ErrStatus: {
              TypeMeta: {Kind: "", APIVersion: ""},
              ListMeta: {
                  SelfLink: "",
                  ResourceVersion: "",
                  Continue: "",
                  RemainingItemCount: nil,
              },
              Status: "Failure",
              Message: "Operation cannot be fulfilled on persistentvolumeclaims \"testpvc1\": the object has been modified; please apply your changes to the latest version and try again",
              Reason: "Conflict",
              Details: {
                  Name: "testpvc1",
                  Group: "",
                  Kind: "persistentvolumeclaims",
                  UID: "",
                  Causes: nil,
                  RetryAfterSeconds: 0,
              },
              Code: 409,
          },
      }
  In [It] at: .../ramen/controllers/volsync/vshandler_test.go:1332 @ 05/03/24 11:57:16.394