envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.86k stars 4.78k forks source link

More descriptive load balancing failure messages #14912

Open snowp opened 3 years ago

snowp commented 3 years ago

Currently we assume that if a load balancer implementation returns nullptr that this means no healthy upstream. For load balancers like the subset lb, it could also mean that no host existed that satisfied the requirement (i.e. no subset found). It would be helpful to provide some mechanism to surface the failure reason, which could be included in some way in the local reply.

This would in turn allow creating local reply mappings based on it, making it possible to have Envoy respond with errors such as no subset found for x-subset-header: <value> or similar.

The simplest way of doing this would probably be to update the chooseHost signature to also return an optional error string, allowing the lb to include additional information when relevant.

ggreenway commented 3 years ago

That sounds useful. It may be better to make it an enum instead of a string, in case we want to do something like add an access-log flag for some values.

mpuncel commented 3 years ago

This would be really useful. Today "no healthy upstream" can mean a variety of things:

1) All the upstreams were failing health checks (if panic mode is disabled) 2) There were no upstreams at all (e.g. service discovery issue) 3) subset selector did not match any upstreams

There might be more I'm not thinking of, but if we can distinguish those cases it would save a lot of debugging time!

snowp commented 3 years ago

The main rationale for a string was better supporting custom clusters/load balancers that might want to enrich the information, but perhaps it would be better with an enum capturing the main failure reasons and potentially exposing mutable filter state/metadata to the lb down the line if we actually have a use case for extensions wanting to record more free form failure reasons.