loxilb-io / kube-loxilb

Implementation of kubernetes service load-balancer spec for loxilb
Apache License 2.0

How to support M3UA point to point and SCTP multi-homing at the same time? #49

Closed: sadate closed this issue 12 months ago

sadate commented 1 year ago

According to RFC4666 section-1.3.2.4:

[1.3.2.4] Support for the Management of SCTP Associations between the SGP and ASPs
.......................................................................
  In order to avoid redundant SCTP associations between two M3UA peers, one side
  (client) SHOULD be designated to establish the SCTP association, or M3UA 
  configuration information maintained to detect redundant associations 
  (e.g., via knowledge of the expected local and remote SCTP endpoint addresses).
......................................................................

There can be only one association between two M3UA peers. This is the so-called point-to-point feature of M3UA, which derives from the traditional circuit-switched domain.

[Figure: Q1]
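The "configuration information" option in the RFC text quoted above can be pictured with a minimal sketch. It is not taken from any M3UA stack: `AssociationRegistry` and `try_add` are hypothetical names, the overlap rule is an assumed heuristic, and the addresses other than 10.10.10.1 and 192.168.1.1 are made up for illustration.

```python
class AssociationRegistry:
    """Hypothetical registry that rejects a second association between the same two peers."""

    def __init__(self):
        # Each entry is (local address set, remote address set) of an accepted association.
        self._known = []

    def try_add(self, local_addrs, remote_addrs):
        local, remote = frozenset(local_addrs), frozenset(remote_addrs)
        for known_local, known_remote in self._known:
            # Assumed heuristic: an overlap on both sides means the same pair of peers,
            # which also covers multi-homed endpoints that present a subset of addresses.
            if local & known_local and remote & known_remote:
                return False
        self._known.append((local, remote))
        return True


registry = AssociationRegistry()
first = registry.try_add({"10.10.10.1"}, {"192.168.1.1", "192.168.2.1"})
duplicate = registry.try_add({"10.10.10.1"}, {"192.168.2.1"})
other_peer = registry.try_add({"10.10.10.1"}, {"172.16.0.1"})
```

In this sketch the second call is refused because both address sets overlap with an existing association, while the third call is accepted because the remote peer is different.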

As far as we know, the default mode (0 - Normal NAT) is the only mode in kube-loxilb that meets this condition. This may be because Normal NAT mode does not perform SNAT in the incoming direction and does not perform DNAT in the outgoing direction. However, Normal NAT mode does not support SCTP multi-homing.
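To make the difference between the two modes concrete, here is a rough sketch of what the endpoint sees for a packet sent to the service VIP. It is a simplified model of the behaviour described in this thread, not loxilb code: `translate_inbound` is a hypothetical helper, and the addresses are the ones used in the example in this issue.

```python
SERVICE_VIP = "80.80.80.1"      # address the left-hand peer connects to
INSTANCE_VIP = "123.123.123.1"  # source address used by Full NAT
ENDPOINT = "192.168.1.1"        # backend endpoint on the right


def translate_inbound(src, dst, mode):
    """Return the (src, dst) pair the endpoint sees for a packet sent to the service VIP."""
    if dst != SERVICE_VIP:
        return (src, dst)  # not addressed to the service, left untouched
    if mode == "normal":
        # DNAT only: the endpoint still sees the real peer address as the source.
        return (src, ENDPOINT)
    if mode == "fullnat":
        # DNAT plus SNAT: the real peer address is hidden behind the instance VIP.
        return (INSTANCE_VIP, ENDPOINT)
    raise ValueError(f"unknown mode: {mode}")


seen_normal = translate_inbound("10.10.10.1", SERVICE_VIP, "normal")
seen_fullnat = translate_inbound("10.10.10.1", SERVICE_VIP, "fullnat")
```

With Normal NAT the endpoint learns 10.10.10.1 and, assuming it can route to that address, can later initiate an association to it directly. With Full NAT it only ever learns the instance VIP.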

[Figure: Q2]

Full NAT mode does support SCTP multi-homing. However, because the source IP address has been replaced by the instance VIP, when the endpoint on the right actively initiates a connection to the server on the left as a client, loxilb cannot know how to replace the instance VIP (such as 123.123.123.1) with the real server IP (such as 10.10.10.1). Therefore, loxilb's Full NAT mode cannot support M3UA's bidirectional point-to-point communication the way Normal NAT mode does.

[Figure: Q3]

It seems that we cannot have our cake and eat it too, so how do we get out of this predicament? My description may not be clear enough, but I hope you can understand it.

nik-netlox commented 1 year ago

Hi @sadate, are you having this issue in fullnat mode with uni-homing or in fullnat mode with multi-homing?

sadate commented 1 year ago

@nik-netlox
In Full NAT mode, if the left side of the picture above acts as the client and the right side acts as the server, everything works.

The problem arises when the roles are reversed, that is, when the left side acts as the server and the right side acts as the client.

Looking into the cause, I think it works as follows. When the client on the left (10.10.10.1) initiates a request to the server on the right (192.168.1.1), it sends the request to the Service VIP (80.80.80.1). Through SNAT, loxilb records the mapping between the client IP (10.10.10.1) and the instance VIP (123.123.123.1). When the server on the right (192.168.1.1) replies, loxilb can use that mapping to translate the instance VIP 123.123.123.1 back to the client's IP address 10.10.10.1.

When the roles are swapped and the client on the right (192.168.1.1) actively initiates a request to the server on the left, it sends the request to the instance VIP (123.123.123.1), but loxilb has no way to know which server on the left (10.10.10.1) the instance VIP maps to. Normal NAT mode does not have this problem because it does not perform DNAT on outbound traffic.

In other words, the load balancer (loxilb) has a direction. By default, the right side acts as the backend server, and loxilb has the IP information of all backend endpoints. But once the roles are reversed, loxilb has no way to learn any IP information about the server on the left.
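The asymmetry described above can be sketched as a tiny session table. This is only an illustration of the reasoning in this comment, not how loxilb is implemented: `FullNatSketch`, `from_left`, and `from_right` are hypothetical names, and ports, SCTP verification tags, and secondary addresses are deliberately left out.

```python
from dataclasses import dataclass

SERVICE_VIP = "80.80.80.1"
INSTANCE_VIP = "123.123.123.1"
ENDPOINT = "192.168.1.1"


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class FullNatSketch:
    """Hypothetical model: state is only created by traffic arriving at the service VIP."""

    def __init__(self):
        # (endpoint address, instance VIP) -> original left-side address
        self.sessions = {}

    def from_left(self, pkt):
        """Left peer to service VIP: DNAT to the endpoint, SNAT to the instance VIP."""
        if pkt.dst != SERVICE_VIP:
            return None
        self.sessions[(ENDPOINT, INSTANCE_VIP)] = pkt.src
        return Packet(src=INSTANCE_VIP, dst=ENDPOINT)

    def from_right(self, pkt):
        """Endpoint to instance VIP: translatable only if a session already exists."""
        left_addr = self.sessions.get((pkt.src, pkt.dst))
        if left_addr is None:
            return None  # no mapping, so no left-side address can be chosen
        return Packet(src=SERVICE_VIP, dst=left_addr)


lb = FullNatSketch()
# Right side initiates first: there is no state, so the packet cannot be translated.
reverse_first = lb.from_right(Packet(ENDPOINT, INSTANCE_VIP))
# Left side initiates: state is created and the reply path works.
forward = lb.from_left(Packet("10.10.10.1", SERVICE_VIP))
reply = lb.from_right(Packet(ENDPOINT, INSTANCE_VIP))
```

In this model `reverse_first` is `None` because nothing ever told the table which left-side address sits behind the instance VIP, which is the situation described for the reversed roles.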

Therefore, loxilb's Full NAT mode cannot support M3UA's bidirectional point-to-point communication the way Normal NAT mode does. Normal NAT mode probably works because it does not perform SNAT in the incoming direction and does not perform DNAT in the outgoing direction.

So our request is: can SCTP multi-homing be supported in Normal NAT mode?