aptible / dashboard.aptible.com

DEPRECATED - Ember.js dashboard for the Aptible PaaS
https://dashboard.aptible.com
MIT License

reloadUntilOperationStatusChanged: Backoff slower #773

Closed krallin closed 7 years ago

krallin commented 7 years ago

Our backoff is a little fast in reloadUntilOperationStatusChanged, and usually results in quickly having to wait several minutes for the next reload to occur.

During my testing (https://github.com/aptible/aptible-integration/pull/109), this turned out to be pretty disruptive. When provisioning a DB, we'd often end up having to wait for several minutes after the DB was provisioned before it showed up as provisioned.

There are two changes to our polling interval here:

For a 10-minute timeout, here is the new polling "schedule" (these are the durations, in seconds, that we wait between each poll, rounded to the nearest integer; the actual implementation is a little more precise):

[4, 4, 5, 6, 8, 9, 11, 14, 17, 20, 24, 29, 35, 42, 51, 61, 73, 88, 106]

For comparison, here was the old one (but: see the caveat below, in practice it'd probably wait once more after 1024 seconds):

[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
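For reference, both lists can be reproduced with a plain geometric backoff. This is a sketch rather than the code in this patch: `backoffSchedule` is a hypothetical helper, and the parameters (4-second start with a factor of 1.2 for the new schedule, 1-second start with a factor of 2 for the old one) are inferred from the numbers above, not taken from the implementation. Truncating to whole seconds happens to match the listed values.

```javascript
// Hypothetical helper: wait durations (in seconds) for a geometric backoff,
// truncated to whole seconds for display.
function backoffSchedule(initial, factor, count) {
  const waits = [];
  let wait = initial;
  for (let i = 0; i < count; i++) {
    waits.push(Math.floor(wait));
    wait *= factor;
  }
  return waits;
}

// Parameters below are assumptions inferred from the lists in this PR.
const newSchedule = backoffSchedule(4, 1.2, 19);
const oldSchedule = backoffSchedule(1, 2, 10);

const total = (waits) => waits.reduce((sum, w) => sum + w, 0);

console.log(newSchedule.join(", "), "| total:", total(newSchedule)); // total: 607
console.log(oldSchedule.join(", "), "| total:", total(oldSchedule)); // total: 1023
```

The new schedule fits 19 polls into roughly 10 minutes, whereas the old one spends most of its time in the last two or three waits.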

Unfortunately, it looks like the current implementation tied the retry interval and the overall timeout together, so simply using a smaller backoff factor would have had the (presumably undesirable) side effect of causing reloadUntilOperationStatusChanged to wait a lot longer overall.

So, this patch uses another approach for reloadUntilOperationStatusChanged, which unties the retry interval and the timeout. This does mean that we'll time out once the timeout set when calling reloadUntilOperationStatusChanged is met. I'm assuming this is desirable, but let me know if not.
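To illustrate what "untied" means here, a minimal sketch of a polling loop where the backoff only controls the spacing between polls and a separate deadline decides when to give up. `pollUntil` and its options are hypothetical names for illustration; this is not the actual reloadUntilOperationStatusChanged code, and the defaults are assumptions.

```javascript
// Hypothetical helper: poll `check` until it returns true or `timeoutMs` elapses.
// `now` and `sleep` are injectable so the loop can be exercised with a fake clock.
async function pollUntil(check, options = {}) {
  const {
    timeoutMs = 10 * 60 * 1000, // assumed default: 10 minutes
    initialIntervalMs = 4000,   // assumed default: 4 seconds
    factor = 1.2,               // assumed default backoff factor
    now = () => Date.now(),
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  const deadline = now() + timeoutMs;
  let interval = initialIntervalMs;

  while (true) {
    if (await check()) {
      return true; // status changed
    }
    const remaining = deadline - now();
    if (remaining <= 0) {
      return false; // timed out, regardless of where the backoff had got to
    }
    // Never sleep past the deadline, so the timeout is honored exactly.
    await sleep(Math.min(interval, remaining));
    interval *= factor;
  }
}
```

With this shape, changing `factor` changes how often we poll but not how long we are willing to wait in total.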

This also fixes the call we were making to reloadUntilOperationStatusChanged when scaling a service: it turns out we were passing the operation when a timeout was expected. This surprisingly didn't break anything, but most likely resulted in the timeout being ignored altogether.
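One plausible way an object passed as a timeout ends up silently ignored in JavaScript is numeric coercion: the object becomes NaN, and every comparison against NaN is false. The sketch below only illustrates that mechanism; `hasTimedOut`, `assertValidTimeout`, and the `operation` stand-in are hypothetical and not taken from the dashboard code.

```javascript
// Hypothetical timeout check, for illustration only.
function hasTimedOut(elapsedMs, timeoutMs) {
  return elapsedMs >= timeoutMs;
}

// Stand-in for an operation record mistakenly passed as the timeout.
const operation = { id: 1, status: "queued" };

const withNumber = hasTimedOut(999999, 60000);     // true
const withObject = hasTimedOut(999999, operation); // false: object coerces to NaN

// A defensive variant that rejects non-numeric timeouts up front.
function assertValidTimeout(timeoutMs) {
  if (typeof timeoutMs !== "number" || !Number.isFinite(timeoutMs) || timeoutMs < 0) {
    throw new TypeError("timeout must be a non-negative number, got " + typeof timeoutMs);
  }
  return timeoutMs;
}

console.log(withNumber, withObject);
```

A check like `assertValidTimeout` at the top of the polling function would turn this kind of argument mix-up into an immediate error instead of a timeout that never fires.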

I've tested this manually for now, since I'm merely changing the internals of this function, and I know we have tests for its observed behavior (off the top of my head, I believe we have some in the ACME specs).


cc @sandersonet @gib @fancyremarker @blakepettersson

krallin commented 7 years ago

had to adjust the ACME specs a little bit; confirms we do have indirect test coverage on this

blakepettersson commented 7 years ago

👍