
ERL-1112: DTLS socket unable to receive on Kubernetes node scale-up #4051

Closed: OTP-Maintainer closed this issue 3 years ago

OTP-Maintainer commented 4 years ago

Original reporter: JIRAUSER12907
Affected version: OTP-22.1
Component: ssl
Migrated from: https://bugs.erlang.org/browse/ERL-1112


This issue has been observed on OTP 22 (erts-10.5.6). The underlying platform is Azure Kubernetes Service (AKS) version 1.13.11 with VM nodes running Ubuntu 16.04.6 LTS.

A Docker container is started from the {{elixir:1.9.4-alpine}} image, running a simple Erlang DTLS server:

{code}
:ssl.start()
opts = [
  protocol: :dtls, active: true, mode: :binary,
  versions: [:'dtlsv1.2'], verify: :verify_none, fail_if_no_peer_cert: false,
  cacertfile: "/cert/ssl.crt", keyfile: "/cert/ssl.key", certfile: "/cert/ssl.crt"
]
{:ok, listen_socket} = :ssl.listen(49002, opts)
{:ok, hsocket} = :ssl.transport_accept(listen_socket, 10_000)
{:ok, socket} = :ssl.handshake(hsocket, 10_000)
{code}

The server is deployed to Kubernetes and exposed with a Kubernetes service as follows:
{code}
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: dtls-server
  name: dtls-server
spec:
  selector:
    matchLabels:
      app: dtls-server
  template:
    metadata:
      labels:
        app: dtls-server
    spec:
      containers:
      - image: elixir:1.9.4-alpine
        name: dtls-server
        command: ["top"]
---
apiVersion: v1
kind: Service
metadata:
  name: dtls-server
spec:
  type: LoadBalancer
  ports:
  - name: s-server-dtls
    port: 49001
    protocol: UDP
  - name: erlang-dtls
    port: 49002
    protocol: UDP
  selector:
    app: dtls-server
{code}

From my local machine (running Arch Linux with OpenSSL version 1.1.1d, 10 Sep 2019), I can connect to the server using {{openssl s_client}}, where {{ip}} is the external IP of the load balancer:
{code}
openssl s_client -dtls1_2 ip:49002
{code}

Using {{s_client}}, I am able to send and receive data.
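
For completeness, here is a minimal sketch (assumed; the exact verification steps are not shown in the report) of checking the exchange from the server side. With {{active: true}}, application data arrives as ordinary Erlang messages in the owning process, and {{:ssl.send/2}} sends in the other direction:
{code}
# Assumed verification snippet, run in the same session as the handshake above.
# With active: true, incoming application data arrives as {:ssl, socket, data} messages:
receive do
  {:ssl, ^socket, data} -> IO.inspect(data, label: "received from s_client")
after
  5_000 -> :timeout
end

# Sending in the other direction shows up on the s_client terminal:
:ok = :ssl.send(socket, "hello from the DTLS server\n")
{code}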

The AKS cluster is then scaled up, that is, an extra node is added to the cluster:
{code}
az aks nodepool scale -g rg --cluster-name cluster --name agentpool --node-count 2
{code}
After some time, {{kubectl get events}} shows the node being added:
{noformat}
15m         Normal    Starting                  Node         Starting kubelet.
15m         Normal    NodeHasSufficientPID      Node         Node aks-agentpool-90776725-vmss000001 status is now: NodeHasSufficientPID
15m         Normal    NodeAllocatableEnforced   Node         Updated Node Allocatable limit across pods
15m         Normal    NodeHasNoDiskPressure     Node         Node aks-agentpool-90776725-vmss000001 status is now: NodeHasNoDiskPressure
15m         Normal    NodeHasSufficientMemory   Node         Node aks-agentpool-90776725-vmss000001 status is now: NodeHasSufficientMemory
15m         Normal    NodeReady                 Node         Node aks-agentpool-90776725-vmss000001 status is now: NodeReady
15m         Normal    RegisteredNode            Node         Node aks-agentpool-90776725-vmss000001 event: Registered Node aks-agentpool-90776725-vmss000001 in Controller
15m         Normal    Starting                  Node         Starting kube-proxy.
14m         Normal    UpdatedLoadBalancer       Service      Updated load balancer with new hosts
{noformat}

At some point during this scale operation, the Erlang DTLS server can no longer receive any packets sent from the client.
{code}
iex(7)> {:ok, socket} = :ssl.handshake(hsocket, 20_000)
{:ok,
 {:sslsocket,
  {:gen_udp, {#PID<0.138.0>, {{{10, 240, 0, 5}, 41519}, #Port<0.6>}},
   :dtls_connection}, [#PID<0.140.0>]}}
iex(8)> flush
{:ssl,
 {:sslsocket,
  {:gen_udp, {#PID<0.138.0>, {{{10, 240, 0, 5}, 41519}, #Port<0.6>}},
   :dtls_connection}, [#PID<0.140.0>]}, "abc\n"}
:ok
iex(9)> flush
:ok
{code}

However, the server is still able to send messages to the client.

At the same time, an {{openssl s_server}} was run in the same container, and it remained able to send and receive throughout the scaling operation.
OTP-Maintainer commented 4 years ago

JIRAUSER12907 said:

After some more investigation, we got a little bit further. We took a packet capture on the VM where the Erlang DTLS and OpenSSL servers run, and we did some tracing on {{dtls_packet_demux}} and on the process and port of the DTLS socket. The issue doesn't reproduce every time and we have seen some variations of the problem, but in this particular scenario, the Erlang DTLS server eventually can neither send nor receive, while the OpenSSL {{s_server}} keeps working.
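
For reference, a trace of this kind can be produced roughly as follows (a sketch under assumptions; the exact commands used are not shown here, and {{demux_pid}} / {{udp_port}} are placeholders for the {{dtls_packet_demux}} process and the underlying {{gen_udp}} port of the listen socket):
{code}
# Assumed tracing setup using :dbg from runtime_tools; placeholder variables,
# not the commands actually used for the capture below.
:dbg.tracer()                                      # print trace messages to the shell
:dbg.p(demux_pid, [:m, :c, :timestamp])            # message and call tracing on the demux process
:dbg.p(udp_port, [:send, :receive, :timestamp])    # datagrams passing through the UDP port
:dbg.tpl(:dtls_packet_demux, :handle_info, 2, [])  # trace calls to dtls_packet_demux:handle_info/2
{code}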

It looks like Kubernetes changed the source IP of the client, which is visible in the capture (the port of the Erlang DTLS server is 49002). The source IP changed from {{10.240.0.5}} to {{10.244.0.1}}.

(Screenshot attachment 2019-12-12-153121_1748x779_scrot.png: packet capture showing the client's source IP change.)

In the trace, the packet can be seen coming in, but it is dropped in {{handle_datagram/3}} since the source IP and port cannot be found in the set of clients in the {{dtls_packet_demux}} state (a simplified illustration of this keying follows the trace).

{code}
13:25:27.891966    {:trace, #Port<0.6>, :send, {:udp, #Port<0.6>, {10, 244, 0, 1}, 55764, <<23, 254, 253, 0, 1, 0, 0, 0, 0, 0, 11, 0, 28, 55, 62, 95, 167, 248, 9, 18, 86, 218, 4, 44, 135, 148, 147, 109, 0, 76, 24, 57, 138, 65, 212, 70, 70, 191, 128, 95, 66>>}, #PID<0.135.0>}
13:25:27.896477    {:trace, #PID<0.135.0>, :receive, {:udp, #Port<0.6>, {10, 244, 0, 1}, 55764, <<23, 254, 253, 0, 1, 0, 0, 0, 0, 0, 11, 0, 28, 55, 62, 95, 167, 248, 9, 18, 86, 218, 4, 44, 135, 148, 147, 109, 0, 76, 24, 57, 138, 65, 212, 70, 70, 191, 128, 95, 66>>}}
13:25:27.896632    {:trace, #PID<0.135.0>, :call, {:dtls_packet_demux, :handle_info, [{:udp, #Port<0.6>, {10, 244, 0, 1}, 55764, <<23, 254, 253, 0, 1, 0, 0, 0, 0, 0, 11, 0, 28, 55, 62, 95, 167, 248, 9, 18, 86, 218, 4, 44, 135, 148, 147, 109, 0, 76, 24, 57, 138, 65, 212, 70, 70, ...>>}, {:state, 49002, #Port<0.6>, {:gen_udp, :udp, :udp_closed, :udp_error}, {:ssl_options, :dtls, [{254, 253}], :verify_none, {#Function<8.45162026/3 in :ssl.handle_verify_options/2>, []}, #Function<9.45162026/1 in :ssl.handle_verify_options/2>, false, false, :undefined, 1, "/cert/ssl.crt", :undefined, "/cert/ssl.key", :undefined, [], :undefined, "/cert/ssl.crt", :undefined, :undefined, :undefined, :undefined, :undefined, [<<192, 44>>, <<192, 48>>, <<192, 36>>, <<192, 40>>, <<192, 46>>, <<192, 50>>, <<192, 38>>, <<192, 42>>, <<0, 159>>, <<0, 163>>, <<0, 107>>, <<0, ...>>, <<...>>, ...], #Function<4.45162026/4 in :ssl.handle_reuse_session_option/3>, true, 268435456, true, true, :infinity, false, :undefined, :undefined, :undefined, :undefined, true, :undefined, ...}, {:socket_options, :binary, 0, 0, 0, false}, {1, {{{10, 240, 0, 5}, 55764}, {[#PID<0.141.0>], []}, nil, nil}}, {1, {{{10, 240, 0, 5}, 55764}, nil, nil}}, {1, {#PID<0.141.0>, {{10, 240, 0, 5}, 55764}, nil, nil}}, {[], []}, false, false}]}, {:gen_server, :try_dispatch, 4}}
{code}
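
To illustrate the effect, here is a simplified sketch (not the actual {{dtls_packet_demux}} implementation) of why the datagram is discarded: known connections are keyed on the source IP and port pair, so a datagram whose source address has been rewritten by kube-proxy finds no owner:
{code}
# Simplified illustration only, not the actual dtls_packet_demux code.
# Connections set up during the handshake are keyed on {source_ip, source_port}:
clients = %{{{10, 240, 0, 5}, 55764} => :connection_pid}

dispatch = fn ip, port ->
  case Map.fetch(clients, {ip, port}) do
    {:ok, pid} -> {:forward, pid}   # known client: hand the datagram to its connection
    :error -> :drop                 # unknown {ip, port}: the datagram is discarded
  end
end

dispatch.({10, 240, 0, 5}, 55764)   # => {:forward, :connection_pid}
dispatch.({10, 244, 0, 1}, 55764)   # => :drop (source IP rewritten to the node IP)
{code}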
OTP-Maintainer commented 4 years ago

ingela said:

Hmm ... so I guess there must be a way to detect that the client changed its source IP but should still be considered the same virtual connection?! Any insights are welcome; I am currently working mostly with TLS 1.3.
OTP-Maintainer commented 4 years ago

ingela said:

I was thinking there might be some kind of timing problem where the old connection is not quite closed when the client tries to start a new connection ... as far as I can remember, there is no way for the client to change its IP on an existing connection.
OTP-Maintainer commented 4 years ago

ingela said:

I just fixed the DTLS "listen socket" emulation in ERL-1118, and I am curious whether it could improve this issue too! See PR 2504

https://github.com/erlang/otp/pull/2504
OTP-Maintainer commented 4 years ago

JIRAUSER12907 said:

After some further investigation, I don't really think this should be fixed at the OTP level. It seems to be expected behavior of Kubernetes, which is not really compatible with DTLS.

From https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-loadbalancer :

bq. As of Kubernetes 1.5, packets sent to Services with Type=LoadBalancer are source NAT’d by default, because all schedulable Kubernetes nodes in the Ready state are eligible for loadbalanced traffic. So if packets arrive at a node without an endpoint, the system proxies it to a node with an endpoint, replacing the source IP on the packet with the IP of the node (as described in the previous section).
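
For reference, the same documentation page describes setting {{service.spec.externalTrafficPolicy}} to {{Local}}, which preserves the client source IP by sending load-balanced traffic only to nodes that host an endpoint. A sketch of how the Service above might be adjusted (not verified in this report):
{code}
# Sketch only (not verified here): preserve the client source IP by routing
# external traffic only to nodes with a local endpoint.
apiVersion: v1
kind: Service
metadata:
  name: dtls-server
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # keep the original client source IP
  ports:
  - name: erlang-dtls
    port: 49002
    protocol: UDP
  selector:
    app: dtls-server
{code}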

I doubt the fix to ERL-1118 will make a difference, but we appreciate the fix.
OTP-Maintainer commented 4 years ago

ingela said:

So this particular issue does not seem to be a problem with the OTP implementation, so I will close it for now. If you find a related problem, you are of course welcome to reopen this or create a new issue, whichever seems most appropriate.