facebookincubator / katran

A high performance layer 4 load balancer
GNU General Public License v2.0
4.75k stars, 504 forks

QUIC support in Katran #186

Closed korsaris closed 1 year ago

korsaris commented 1 year ago

I have a question about QUIC support in Katran.

I see the F_QUIC_VIP flag, but as far as I can tell it only enables QUIC statistics for the VIP and forwards packets with known CIDs (those in server_id_map, which I assume is populated manually) to the corresponding reals.

But what about other QUIC connections (external to Katran) and their CIDs?

pdubovitsky commented 1 year ago

Katran would continue using consistent hash routing for unknown CIDs.

korsaris commented 1 year ago

So in this case only UDP-related info is taken into consideration, and QUIC info is ignored? What if the client's source port changes? Will it be treated as a new connection?

pdubovitsky commented 1 year ago

It would be treated as a new connection for unknown CIDs.

korsaris commented 1 year ago

Maybe I didn't express myself clearly. Let's assume we have a client that is trying to connect to a QUIC VIP behind Katran. On the first packet Katran doesn't have any info about the CID, so it chooses one of the reals. At this point I have several questions:

  1. Will Katran store this CID in any map, like the LRU used for ordinary TCP/UDP VIPs?
  2. Will Katran use this CID to balance subsequent packets (sticky to a real, as with the LRU)?
  3. What happens if the client's source port changes but the CID does not?
  4. If the chosen real becomes unavailable, how will the entry in the map be updated?

pdubovitsky commented 1 year ago

We are discussing "unknown" CIDs, meaning CIDs that are not listed in the server_id_map.

  1. Katran would not store the unknown CID in any map.
  2. Katran would not use this CID to balance other packets.
  3. Since consistent hashing is used to route packets with an unknown CID, a port change would most likely send the packet to a different backend. This works as designed.
  4. Since the unknown CID is not stored in any map, there is nothing to update.
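
To illustrate point 3, here is a minimal sketch of hash-based backend selection over the UDP flow tuple. It uses rendezvous hashing as a simple stand-in for the consistent hash; the function name `pick_real`, the backend names, and the addresses are all hypothetical and not taken from Katran's code.

```python
import hashlib

# Hypothetical backend names, for illustration only.
REALS = ["real-a", "real-b", "real-c", "real-d"]


def pick_real(src_ip, src_port, dst_ip, dst_port, proto="udp"):
    """Pick a backend by rendezvous hashing over the flow tuple.

    Only L3/L4 fields are hashed; nothing from the QUIC header is used,
    which mirrors the "unknown CID" case discussed above.
    """
    flow = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}"

    def score(real):
        digest = hashlib.sha256(f"{flow}|{real}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The backend with the highest score for this flow wins.
    return max(REALS, key=score)


# Same tuple, same backend, on any balancer running the same function.
before = pick_real("198.51.100.7", 40000, "203.0.113.10", 443)
# A new source port is a new tuple, so the choice may differ.
after = pick_real("198.51.100.7", 40001, "203.0.113.10", 443)
```

The selection is stateless: an identical tuple always maps to the same real, but nothing ties the old and new source ports together, so `before` and `after` are independent picks.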

If I understand correctly, you would like Katran to "learn" incoming CIDs and use them for subsequent routing decisions.

First of all, the connection ID that Katran uses is the destination connection ID. It should be used for consistent routing to the backend, but it is the backend that sets it during connection setup. The first packet sent to the server would not carry a backend-assigned CID, so it would be forwarded based on the consistent hash only.

Now, assume that Katran learned the CID and used it for subsequent routing decisions. There is no guarantee that subsequent packets would be delivered to the same Katran instance. Usually, a specific Katran instance is selected by ECMP/UCMP routing and the current network topology. To maintain consistent routing decisions based on the CID, the CID maps would have to be kept consistent across all Katran instances. So we would need a service that propagates updates and maintains a consistent map on each Katran instance.
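
The problem with per-instance learning can be sketched as follows. `LearningBalancer` is a hypothetical class that does what was asked for (it is not how Katran works), and the `fallback_real` argument stands in for whatever the consistent hash returns for the packet's current tuple.

```python
class LearningBalancer:
    """Hypothetical balancer that learns CID -> real on the fly."""

    def __init__(self, name):
        self.name = name
        self.learned = {}  # state local to this instance only

    def route(self, cid, fallback_real):
        # Remember the first decision for this CID and reuse it afterwards.
        return self.learned.setdefault(cid, fallback_real)


lb_a = LearningBalancer("katran-a")
lb_b = LearningBalancer("katran-b")
cid = bytes.fromhex("c0ffee01")

# First packet lands on instance A, which learns the CID.
first = lb_a.route(cid, fallback_real="real-1")

# The client's port changes; ECMP now delivers to instance B. Its table is
# empty, so it uses its own hash result for the new tuple.
second = lb_b.route(cid, fallback_real="real-3")

# Instance A would still have kept the flow sticky, but it never sees it.
sticky = lb_a.route(cid, fallback_real="real-3")
```

Here `first` and `sticky` agree while `second` does not, which is the inconsistency that a map-propagation service would have to prevent.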

The server_id_map is exactly such a map, updated consistently by an external service. The only difference is that Katran does not learn the CIDs itself. A service that ensures the CIDs are unique across all backends can generate the map for all load balancers, which is more efficient than learning CIDs on the fly.
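
A rough sketch of that lookup, with a fallback to hashing for unknown CIDs. The CID layout here (server ID in bytes 1 to 3 of the destination CID) and the names `server_id_from_cid` and `route_quic` are assumptions made for illustration; the real encoding depends on how the backends generate their CIDs.

```python
# Pushed identically to every balancer by the external service.
# IDs and backend names are hypothetical.
SERVER_ID_MAP = {0x000001: "real-a", 0x000002: "real-b"}


def server_id_from_cid(dcid: bytes):
    """Extract a server ID from the destination CID.

    Assumed layout for this sketch: bytes 1..3 hold the server ID.
    """
    if len(dcid) < 4:
        return None
    return int.from_bytes(dcid[1:4], "big")


def route_quic(dcid: bytes, hash_fallback):
    """Route by server ID when it is known, else by the supplied hash."""
    real = SERVER_ID_MAP.get(server_id_from_cid(dcid))
    if real is not None:
        return real
    # Unknown CID: nothing is stored, the tuple hash decides.
    return hash_fallback()


known = route_quic(bytes([0x40, 0x00, 0x00, 0x01, 0xAA]), lambda: "real-z")
unknown = route_quic(bytes([0x40, 0x00, 0x00, 0x09, 0xAA]), lambda: "real-z")
```

Because every balancer holds the same map and the server ID travels in the packet, the decision for a known CID does not depend on which instance receives the packet or on the client's source port.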

I hope it helps.