Open friism opened 1 year ago
I'm excited to see this on the roadmap!
I'm especially interested in gradual rollout and canary deploys. This sort of thing has been on my Heroku wishlist for a long time.
I'd also like to suggest considering an "adaptive preboot", for example using Rails' recent addition: https://github.com/rails/rails/pull/46936
If an app had a standard endpoint that could return 200 OK when everything is booted, we could make the zero downtime deploy via preboot much quicker, instead of waiting for a static 3 minutes. This could also be leveraged for auto-rollback, as in, don't switch over to the new code unless the health/heartbeat/up endpoint responds 200 OK.
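To make the "adaptive preboot" idea concrete, here is a rough Python sketch of a deploy step that polls a health endpoint and proceeds as soon as it answers 200 OK, with the 3-minute wait kept only as an upper bound. The function names (`wait_until_healthy`, `http_probe`), the polling interval, and the `/up` path in the usage note are assumptions for illustration, not anything Heroku provides today.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 5.0) -> Callable[[], int]:
    """Build a probe that GETs `url` and returns the HTTP status code."""
    def probe() -> int:
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            # 4xx/5xx responses still carry a status code we can inspect.
            return err.code
    return probe


def wait_until_healthy(
    probe: Callable[[], int],
    max_wait_s: float = 180.0,   # the current static preboot wait becomes a ceiling
    interval_s: float = 2.0,     # hypothetical polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as probe() yields 200, False once max_wait_s has elapsed."""
    deadline = clock() + max_wait_s
    while True:
        try:
            if probe() == 200:
                return True
        except OSError:
            # Connection refused, timeout, DNS failure: the app is not up yet.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A deploy step could then call something like `wait_until_healthy(http_probe("https://new-dyno.example/up"))` and only switch traffic over when it returns `True`; a `False` result would be the trigger for the auto-rollback case.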
Also worth mentioning: I'd like to see an option added to rollback that bypasses the preboot delay for emergency use.
Thanks!
re: Gradual Rollout.
I'd be happy just to see preboot get the boot, and instead see Common Runtime have a rolling restart like Private Spaces does. A cherry on top would be the ability to configure the percentage of the roll. It's hard-coded to 25% on Dogwood, IIRC, but in a large enough formation it'd be nice to tune that down even further.
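As a small illustration of what a configurable roll percentage would mean, the sketch below splits a formation into restart batches. The `rolling_batches` helper, the dyno names, and the choice to round the batch size down (with a floor of one dyno) are assumptions; how the platform actually sizes its batches is not stated in this thread.

```python
import math
from typing import List, Sequence


def rolling_batches(dynos: Sequence[str], percent: float) -> List[List[str]]:
    """Split `dynos` into consecutive batches of roughly `percent` of the formation."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Assumption: round down, but always restart at least one dyno per batch.
    size = max(1, math.floor(len(dynos) * percent / 100))
    return [list(dynos[i:i + size]) for i in range(0, len(dynos), size)]


# Hypothetical 40-dyno web formation.
formation = [f"web.{n}" for n in range(1, 41)]
print(len(rolling_batches(formation, 25)))  # batches of 10 dynos
print(len(rolling_batches(formation, 5)))   # batches of 2 dynos
```

With 40 dynos, a 25% roll takes 10 dynos out at a time, while a 5% roll takes only 2, which is the kind of tuning a large formation might want.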
I'd be excited to start with a limited version of this: a healthcheck endpoint + auto-rollback. There are cases where we have pushed code changes that caused our Rails application to fail to boot. A simple GET to a healthcheck endpoint would have returned a 500. I would love it if Heroku made such a request to our new dynos during preboot, then halted the rest of the deploy if it couldn't get a 200 response.
For canary deployments, having something gradual that would be controlled by the error rate of each release would be great.
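One way to picture an error-rate-controlled canary is a small decision function that a rollout controller would call periodically. Everything here is a hypothetical sketch: the `ReleaseStats` type, the step size, the tolerance, and the minimum sample size are illustrative assumptions, not an existing Heroku feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseStats:
    """Request and error counts observed for one release over some window."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def next_canary_percent(
    current_percent: int,
    canary: ReleaseStats,
    baseline: ReleaseStats,
    step: int = 10,            # assumed traffic increment per evaluation
    tolerance: float = 0.01,   # assumed allowed error-rate excess over baseline
    min_requests: int = 100,   # assumed sample size needed before deciding
) -> int:
    """Return the traffic percentage the new release should receive next."""
    if canary.requests < min_requests:
        return current_percent  # not enough data yet: hold
    if canary.error_rate > baseline.error_rate + tolerance:
        return 0  # new release is noticeably worse: roll back
    return min(100, current_percent + step)  # healthy: widen the rollout
```

The controller would keep calling this until it returns 100 (rollout complete) or 0 (rollback), so the pace of the release is driven by observed errors rather than by a fixed timer.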
Required Terms
What service(s) is this request for?
runtime
Tell us about what you're trying to solve. What challenges are you facing?
We should improve how code changes (releases) are rolled out with Heroku. We should consider adding:
$PORT. This is too simplistic since the app may not actually be ready to serve traffic by then.
For non-web dynos, we should also establish a healthcheck convention and support rolling deploys (currently non-web dynos don't support any form of gradual rollout).