grpc / grpc-go

The Go language implementation of gRPC. HTTP/2 based RPC
https://grpc.io
Apache License 2.0

If a priority contains multiple localities with pick_first, load is reported incorrectly #7339

Open dfawley opened 1 week ago

dfawley commented 1 week ago

This should be resolved once dual stack is fully implemented (#6472). Until then, a partial implementation should be possible without requiring much throwaway work.

That should be all we need to correct this issue.

dfawley commented 1 week ago

@apolcyn noticed that if we simply changed the locality ID to use the first address instead of the last, we'd most likely improve the accuracy (since the first address is more likely to be connected to than the last):

http://google3/third_party/golang/grpc/xds/internal/balancer/clusterimpl/clusterimpl.go;l=364;rcl=634425291

neilw4 commented 5 days ago

Please do not change the locality ID to always use the first address. Reported load needs to be 100% accurate, especially in cases where we're failing over to different backends.

dfawley commented 4 days ago

The change to use the first address was suggested only as a short-term workaround that would take a few minutes to implement while we await the full fix.

dfawley commented 3 days ago

Another idea for a short-term fix that will actually get us to 100% correctness but not take as long to implement (hopefully):

  1. Add an unexported field to balancer.SubConnState that, when the ConnectivityState is READY, contains the connected address.
  2. Add accessor function vars for this field to the internal package; the balancer package assigns them to local functions that read and write the field.
  3. Set this in the channel around here, indirectly.
  4. Read this in the StateListener (here) and call updateLocalityID() based on the connected address's locality.
  5. Add a test to verify.

This works better than adding an unexported method to the SubConn, because SubConns are wrapped at several layers and every wrapper would have to propagate the method.
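A rough, single-file sketch of the steps above, for illustration only. Every name in it (`SubConnState`, `connectedAddress`, `ConnectedAddress`, `SetConnectedAddress`, `loadReporter`, the `Locality` field) is a hypothetical stand-in, not the actual grpc-go API. In the real code the accessor vars would live in the internal package and be assigned by the balancer package, and the locality would come from the address's attributes rather than a plain string field.

```go
package main

import "fmt"

// ConnectivityState is a simplified stand-in for connectivity.State.
type ConnectivityState int

const (
	Idle ConnectivityState = iota
	Connecting
	Ready
)

// Address is a simplified stand-in for resolver.Address.
// Locality is an assumption made for this sketch only.
type Address struct {
	Addr     string
	Locality string
}

// SubConnState mimics balancer.SubConnState with the proposed unexported field (step 1).
type SubConnState struct {
	ConnectivityState ConnectivityState
	connectedAddress  Address // only meaningful when ConnectivityState is Ready
}

// Accessor function vars (step 2). In a multi-package layout these would be
// declared in internal and assigned by the package that owns SubConnState,
// which is the only package able to touch the unexported field.
var (
	ConnectedAddress    func(SubConnState) Address
	SetConnectedAddress func(*SubConnState, Address)
)

func init() {
	ConnectedAddress = func(s SubConnState) Address { return s.connectedAddress }
	SetConnectedAddress = func(s *SubConnState, a Address) { s.connectedAddress = a }
}

// loadReporter is a hypothetical stand-in for the state kept by the
// balancer that attributes load to a locality.
type loadReporter struct {
	localityID string
}

func (r *loadReporter) updateLocalityID(id string) { r.localityID = id }

// stateListener (step 4) updates the locality only when the SubConn is
// Ready, using the address that was actually connected to.
func (r *loadReporter) stateListener(s SubConnState) {
	if s.ConnectivityState != Ready {
		return
	}
	r.updateLocalityID(ConnectedAddress(s).Locality)
}

func main() {
	// One SubConn holding addresses from two localities, as with pick_first.
	addrs := []Address{
		{Addr: "10.0.0.1:443", Locality: "locality-a"},
		{Addr: "10.0.0.2:443", Locality: "locality-b"},
	}

	// Step 3: the channel records which address it connected to.
	// Here we pretend the first address failed and the second succeeded.
	state := SubConnState{ConnectivityState: Ready}
	SetConnectedAddress(&state, addrs[1])

	r := &loadReporter{}
	r.stateListener(state)
	fmt.Println(r.localityID) // prints "locality-b"
}
```

Because the field is unexported and reachable only through the function vars, wrappers around the SubConn do not need to forward anything: the state value carries the connected address through every layer unchanged.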