Deployment times out after 600 seconds, despite setting a higher healthcheck timeout

mwolf-net commented 2 years ago

Hi,

We are setting the parameter healthcheck.timeout=1200 in our boxfuse configuration file, and we can see in the output that it has indeed picked up the parameter: healthcheck.timeout -> 1200.

But after only 10 minutes we still get the following message:

08:55:22.784 WARNING: Run failed: Time out: 1 / 2 instances of myusername/example_ALB_name:5050 failed to come up within 600 seconds => check the instance logs and ensure the healthcheck configuration (healthcheck.port, healthcheck.path, healthcheck.timeout) is correct

It appears to be using its default timeout instead of the configured one. Is that the case?
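For anyone who wants to check for the same symptom in their own logs, here is a minimal sketch that pulls the timeout out of the warning text and compares it with the configured value. The function names are hypothetical and are not part of the CloudCaptain client; the pattern is based only on the wording of the message quoted above.

```python
import re
from typing import Optional

# Pattern taken from the wording of the warning quoted in this issue.
TIMEOUT_PATTERN = re.compile(r"failed to come up within (\d+) seconds")


def reported_timeout_seconds(log_line: str) -> Optional[int]:
    """Return the timeout (in seconds) named in the warning, or None if absent."""
    match = TIMEOUT_PATTERN.search(log_line)
    return int(match.group(1)) if match else None


def timeout_was_applied(configured_seconds: int, log_line: str) -> bool:
    """True only if the warning names the same timeout that was configured."""
    return reported_timeout_seconds(log_line) == configured_seconds
```

With healthcheck.timeout=1200 and the message above, `timeout_was_applied(1200, line)` would return False, which matches what we are seeing.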

mwolf-net commented 2 years ago

This is with CloudCaptain client v.1.35.2.1525

axelfontaine commented 2 years ago

You are right. It appears we only used this parameter for the ALB instance reachability check, but not for the provisioning of the Auto-Scaling Group itself. This has now been fixed.

Please confirm this is now working for you.
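As an illustration of this class of bug (not the actual CloudCaptain code), here is a minimal sketch in which a deployment has two wait phases and the configured timeout reaches only one of them. All names, the simulated clock, and the assumed default of 600 seconds are hypothetical.

```python
from typing import Callable, Optional

# Assumed default, inferred from the "within 600 seconds" warning in this issue.
DEFAULT_TIMEOUT_SECONDS = 600


def wait_for(is_ready: Callable[[int], bool], timeout_seconds: int) -> bool:
    """Poll once per simulated second; True if ready before the timeout expires."""
    for elapsed in range(timeout_seconds):
        if is_ready(elapsed):
            return True
    return False


def deploy(configured_timeout: Optional[int], asg_ready_after: int, fixed: bool) -> bool:
    """Simulate a deployment whose Auto-Scaling Group is ready after a given delay."""
    timeout = configured_timeout or DEFAULT_TIMEOUT_SECONDS
    # Buggy variant: the provisioning phase ignores the configured value.
    asg_timeout = timeout if fixed else DEFAULT_TIMEOUT_SECONDS
    if not wait_for(lambda t: t >= asg_ready_after, asg_timeout):
        return False  # timed out while provisioning the group
    # The reachability phase honors the configured value in both variants.
    return wait_for(lambda t: True, timeout)
```

In this sketch, `deploy(1200, asg_ready_after=900, fixed=False)` times out during provisioning even though 1200 seconds were configured, while the same call with `fixed=True` succeeds.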

mwolf-net commented 2 years ago

Hi Axel, thanks for the quick response!

It appears that provisioning the autoscaling group took so long in the first place because of a temporary issue on Amazon's side, so we can't reproduce this on demand. But since you confirm that the issue was in the code and that you fixed it, I trust the timeout will be applied correctly if it happens again. We'll keep an eye on it!

cloudcaptainsh / cloudcaptain #271