refraction-networking / water

WebAssembly Transport Executables Runtime
Apache License 2.0

bug: TCP connection halting on async read on FIN(EOF) #8

Closed gaukas closed 10 months ago

gaukas commented 1 year ago

Note: This is believed to be a WASI-related problem. Until it is resolved, users should take extra care when dealing with it.

Problem

For reasons not yet understood, when the remote peer closes a TCP connection with a FIN packet, Go detects this and returns EOF on the next Read(), but the async read (by tokio) inside WebAssembly may not handle it correctly. In rare cases, the supposedly non-blocking async read blocks instead of returning EOF immediately. While it is blocked, no other async operation can make progress, which halts the whole system.

Additional Context

This is not a runtime-side bug, but it is non-trivial to fix or work around inside the WebAssembly environment. On the runtime side, we therefore work around it by using a unix socket as a relay between the network socket and the WebAssembly environment. When the runtime detects that the network socket is closed, it closes the unix socket, which immediately unblocks the async read in WebAssembly.

Solutions

There is not yet a clear fix for this problem.

We introduced the connhaltbug build tag in #7, which patches this issue with the workaround mentioned above. A formal patch will require further investigation.

gaukas commented 10 months ago

Great news: with the pure Go implementation, this problem is gone, as we no longer need the Rust FFI or a duplicated file descriptor for the network socket(s).