cloudfoundry / diego-release

BOSH Release for Diego
Apache License 2.0

[Envoy] Envoy proxy healthchecks #922

Open klapkov opened 3 months ago

klapkov commented 3 months ago

Envoy proxy healthchecks

Summary

In the past we have observed cases where an application is running but does not accept any connections. When we looked into it, the app healthcheck was passing and the Envoy proxy was running as well, but no requests were reaching the app. This leads to this loop:

This is why we started looking into ways to healthcheck the proxy itself. The best option we currently see is to modify the app healthcheck so that it also checks the proxy. Currently it uses only the app port. We can add a parallel check that does the same through the proxy port. The proxy will then forward the request to the app and we will receive a response. This of course means twice as many healthcheck requests to the app, but this should not have any significant impact.

This extra check could be gated behind a flag in the executor, so that it is used only when needed.

Please let me know what you think about this. I believe the topic has been discussed in the past, and maybe someone could give some context on why it was never implemented.

Diego repo

https://github.com/cloudfoundry/executor
https://github.com/cloudfoundry/healthcheck