containers / libkrun

A dynamic library providing Virtualization-based process isolation capabilities
Apache License 2.0
901 stars 74 forks source link

WARN devices::virtio::vsock::muxer] stream: unhandled op=3 #116

Closed blenessy closed 1 year ago

blenessy commented 1 year ago

This is regarding the same scenario as in #112 .

When running libkrun with warnings enabled you can see these lines frequently:

WARN devices::virtio::vsock::muxer] stream: unhandled op=3

When this warning is shown we are also leaking a Proxy object bound to a listening socket (see proof in subsequent comment)

op=3 is VSOCK_OP_RST. This means that the TcpProxy in tcp.rs does not get updated with VSOCK reset events.

Please advice - if trivial I'll gladly fix and test.

blenessy commented 1 year ago

This is actually not just cosmetical - we are leaking Proxy objects every time this symptom is found in log. Fortunately I found a real easy way to reproduce the problem without running avalanchego (scenarion of #112):

Please follow comments inline and note that it does not matter if the nc 127.0.0.1 1234 is executed in TEE Guest or TEE Host we will leak a socket in the TEE Guest:

# try connecting to localhost 1234 - nothing is listening so we get a TCP RST from the Kernel 
root@localhost:/# nc 127.0.0.1 1234
[2023-02-12T12:49:04Z WARN  devices::virtio::vsock::muxer_thread] deferring proxy removal: 15567375737655657068
[2023-02-12T12:49:04Z WARN  devices::virtio::vsock::muxer] deferring proxy removal: 15567375737655657068
root@localhost:/# [2023-02-12T12:49:09Z WARN  devices::virtio::vsock::reaper] removing proxy: 15567375737655657068

# after ~5 seconds the socket is correctly removed by libkrun reaper - and we have no more proxies.
[2023-02-12T12:49:09Z WARN  devices::virtio::vsock::reaper] remainig proxies: 0

# we open up a listening socket on 127.0.0.1 1234
root@localhost:/# nc -lp 1234

# CTRL+C is pressed and the listening socket/proxy is scheduled for close with the reaper
^C[2023-02-12T12:49:29Z WARN  devices::virtio::vsock::muxer] deferring proxy removal: 15567375746245591660

# BEFORE the reaper closes the listening socket above we try connection to 127.0.0.1 1234
root@localhost:/# nc 127.0.0.1 1234

# we see the symptom as TCP RST is not handled properly
[2023-02-12T12:49:32Z WARN  devices::virtio::vsock::muxer] stream: unhandled op=3

# reaper wants to remove listening proxy
[2023-02-12T12:49:35Z WARN  devices::virtio::vsock::reaper] removing proxy: 15567375746245591660

# reaper fails to remove the listening proxy as two proxies remain
[2023-02-12T12:49:35Z WARN  devices::virtio::vsock::reaper] remainig proxies: 2

# CTRL+C to interrupt: nc 127.0.0.1 1234 - reaper schedules removal of data proxy
^C[2023-02-12T12:49:56Z WARN  devices::virtio::vsock::muxer] deferring proxy removal: 15567375754835526252
[2023-02-12T12:49:56Z WARN  devices::virtio::vsock::muxer_thread] deferring proxy removal: 15567375754835526252
root@localhost:/# [2023-02-12T12:50:01Z WARN  devices::virtio::vsock::reaper] removing proxy: 15567375754835526252

# data proxy is removed - the listening proxy remains open
[2023-02-12T12:50:01Z WARN  devices::virtio::vsock::reaper] remainig proxies: 1
slp commented 1 year ago

Thanks for the issue and the producer. I don't remember why I decided to not implement OP_RST originally (perhaps wasn't technically possible to get a relevant RST before we implemented deferred removal?), but it should be there. Otherwise, reverse proxies (those created when receiving a connection request to an active listening socket) will leak if the actual endpoint (the userspace application) is no longer listening.

blenessy commented 1 year ago

The patch works nicely @slp . Thanks for fixing it.