hashicorp / terraform-aws-consul-ecs

Consul Service Mesh on AWS ECS (Elastic Container Service)
https://www.consul.io/docs/ecs
Mozilla Public License 2.0

Improve bootstrap time of v0.7.x mesh-task #257

Closed v-rosa closed 9 months ago

v-rosa commented 10 months ago

I've noticed that the Consul sidecars in v0.7.x take longer to bootstrap than I expected: roughly 1m40s-2m00s, even with very simple services that need only ~10 seconds to bootstrap themselves.

I've started experimenting with the health check configurations of consul-dataplane and consul-ecs-control-plane.

See https://github.com/v-rosa/terraform-aws-consul-ecs/commit/e196aadb510291c12be9aa9fbba3047e57db562d

I've managed to reduce the Consul bootstrap time to ~50-60 seconds.
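To illustrate why health check timing can dominate bootstrap time, here is a minimal sketch in Python. It assumes Docker-style health checks (the first check fires one `interval` after the container starts, then every `interval` after that) and assumes each container in the task waits for the previous one to be reported healthy. The three-container chain and all numbers are hypothetical; they are not the values from the commit above, and the function names are made up for this example.

```python
import math


def first_healthy_at(start: float, ready_after: float, interval: float) -> float:
    """Time at which a container is first reported healthy.

    Assumption: the first check fires `interval` seconds after the container
    starts and then repeats every `interval` seconds; a check passes once the
    process has been up for at least `ready_after` seconds. The duration of
    the check itself is ignored.
    """
    checks_needed = max(1, math.ceil(ready_after / interval))
    return start + checks_needed * interval


def chain_bootstrap(containers) -> float:
    """Total time for containers that each start only after the previous one is healthy.

    `containers` is a list of (ready_after, interval) pairs, in start order.
    """
    t = 0.0
    for ready_after, interval in containers:
        t = first_healthy_at(t, ready_after, interval)
    return t


# Hypothetical chain: control plane -> dataplane -> application.
slow = [(5, 30), (5, 30), (10, 30)]  # 30-second check interval everywhere
fast = [(5, 5), (5, 5), (10, 5)]     # 5-second check interval everywhere

print(chain_bootstrap(slow))  # 90.0
print(chain_bootstrap(fast))  # 20.0
```

Under these assumptions, a container that is ready after a few seconds is still not reported healthy until its first check fires, and the delay adds up across dependent containers. Shorter intervals shrink that wait at the cost of more frequent checks.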

Do you think these health check values are adequate? This was tested with a service that has only 1 upstream and 1 downstream. Does the number of configured upstreams/downstreams heavily affect the bootstrap time?

I also understand that v0.8.0 is in the works and will change the sidecar structure. Do you think it's valuable to hotfix v0.7.x? Personally, at my company we don't know when we'll be able to jump to v0.8.0, and meanwhile these bootstrap times are rather uncomfortable.

Ganeshrockz commented 10 months ago

👋 @v-rosa The limits you have added should be sufficient most of the time.

> Does the number of configured upstreams/downstreams heavily affect the bootstrap time?

I don't think it should, but I will double-check this.

> Do you think it's valuable to hotfix v0.7.x?

I will bring this up with my team and get back to you with a decision.

> Personally, at my company we don't know when we'll be able to jump to v0.8.0, and meanwhile these bootstrap times are rather uncomfortable.

Any specific reason for the hesitation to jump to 0.8.0?

v-rosa commented 10 months ago

Hey @Ganeshrockz, thanks for the feedback!

Regarding this:

> Any specific reason for the hesitation to jump to 0.8.0?

It's more about the time available to test this new release and roll it out to production versus the business need to start using api-gateway.

Ganeshrockz commented 10 months ago

@v-rosa We generally do patch releases at an interval of 5 to 6 weeks. The last one happened before the year-end holidays. We'll make sure to include this fix in the upcoming patch release.

Ganeshrockz commented 9 months ago

@v-rosa https://github.com/hashicorp/terraform-aws-consul-ecs/pull/267 should fix this and will be released early next week.

Ganeshrockz commented 9 months ago

This should be fixed in v0.7.2. Closing this issue.