vi / websocat

Command-line client for WebSockets, like netcat (or curl) for ws:// with advanced socat-like functions
MIT License

last packet not forwarded #129

Open hevmaad opened 2 years ago

hevmaad commented 2 years ago

I'm trying to create a WebSocket tunnel for a binary request-response TCP protocol, using websocat version 1.8.0. On the client side, websocat listens on localhost and connects to the remote WebSocket endpoint:

$ websocat_amd64-linux-static -E --binary tcp-l:127.0.0.1:2345 ws://10.2.138.210:3333/

Sometimes the last packet of a request does not seem to be sent to the destination through the WebSocket. The client blocks waiting for the response. When I kill the client, the last packet is sent.

You can reproduce the issue with the following setup on two Linux machines. On the first machine, the receiver (IP 10.2.138.210), run a service that consumes a request, replies, and consumes another request.

$ cat /dev/urandom | head -c 1000000 > datablock
$ mkfifo /tmp/f
$ while true; do cat /tmp/f | (head -c 400000 >/dev/null; head -c 4000 datablock; head -c 400000 >/dev/null) \
| ./websocat_amd64-linux-static --oneshot -E --binary ws-l:0.0.0.0:3333 asyncstdio: > /tmp/f; echo OK; done

On the second machine, the sender, launch websocat to listen on localhost.

$ ./websocat_amd64-linux-static -E --binary tcp-l:127.0.0.1:2345 ws://10.2.138.210:3333/

On another terminal run the client to follow the request-response protocol

$ cat /dev/urandom | head -c 1000000 > datablock
$ mkfifo /tmp/g
$ cat /tmp/g | (head -c 400000 datablock; head -c 4000 >/dev/null; head -c 400000 datablock) \
| nc localhost 2345 > /tmp/g

With this setup, the client blocks 50% of the time. When you kill the client the service outputs another OK. The problem is not evident using only localhost connections.

vi commented 2 years ago

Haven't checked the snippets yet, but does anything change if you remove -E and/or add -n?

hevmaad commented 2 years ago

Removing -E produces the warning "serving multiple clients without --exit-on-eof (-E) or with -U option is prone to socket leak in this websocat version",

but it still blocks, with or without -n. The same holds when adding -n to the original options.

vi commented 2 years ago

Tried running (but not yet understanding) the snippets locally, but got no OK at all.

Note that the loop { websocat ws-l:... ... } scheme is inherently unreliable and may miss connections that arrive at unfortunate moments, while no listener is running.

Can the scheme be rewritten using exec: or cmd: instead (assuming the real task is more complicated than a sample stated in the question)?
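As a rough sketch of what such a rewrite could look like (untested against the reported setup): the per-connection logic moves into a small handler script, and a single long-lived websocat listener spawns it for each client through cmd:. The file name handler.sh, the handle_request function, the serve argument, and the DATABLOCK variable are hypothetical; the byte counts are taken from the reproduction in the report.

```shell
#!/bin/sh
# handler.sh (hypothetical name): one request-response exchange over stdin/stdout.

# Consume a request, send a reply, consume another request.
handle_request() {
    head -c 400000 >/dev/null                # first request
    head -c 4000 "${DATABLOCK:-datablock}"   # reply
    head -c 400000 >/dev/null                # second request
}

# Only serve when invoked as "sh handler.sh serve", so the function can also
# be sourced and exercised on its own.
case "${1:-}" in
    serve) handle_request ;;
esac

# Assumed receiver invocation, replacing the while-true loop:
#   ./websocat_amd64-linux-static -E --binary ws-l:0.0.0.0:3333 cmd:'sh ./handler.sh serve'
```

With this layout the listener stays up between clients, so there is no window in which a connection can arrive while nothing is listening.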

vi commented 2 years ago

Found a mistake in my adaptation of the snippets; now it runs locally and I get OK every time.

Tried looping it and got at least 1000 OKs.

Which nc flavour do you use? Traditional or openbsd? What if you use websocat -b asyncstdio: tcp:127.0.0.1:2345 instead of nc?
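A hang is easier to tell apart from a slow run if each iteration has a time limit. The following is a minimal sketch of such a loop, assuming GNU coreutils timeout is available; run_until_hang and client.sh (a script holding the client pipeline from the report) are hypothetical names.

```shell
#!/bin/sh
# run_until_hang MAX TIMEOUT_SECS CMD...
# Repeat CMD up to MAX times. Stop at the first run that fails or exceeds
# TIMEOUT_SECS, then print how many runs completed successfully.
run_until_hang() {
    max=$1
    limit=$2
    shift 2
    ok=0
    while [ "$ok" -lt "$max" ]; do
        # timeout exits non-zero if CMD fails or is killed at the limit
        if timeout "$limit" "$@"; then
            ok=$((ok + 1))
        else
            break
        fi
    done
    echo "$ok"
}

# Example (client.sh is hypothetical): up to 1000 runs, 30 seconds each
#   run_until_hang 1000 30 sh ./client.sh
```

A printed count below the requested maximum means one iteration failed or hit the time limit, which is the point at which to inspect the tunnel.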

hevmaad commented 2 years ago

> Which nc flavour do you use?

I use nc with openbsd flavour, version 1.187-1ubuntu0.1 on "Linux Mint 19.2".

> What if you use websocat -b asyncstdio: tcp:127.0.0.1:2345 instead of nc?

Even after using websocat the outcome is the same.

I want to stress that the problem only shows up when using two different machines: on my localhost setup a 1000-iteration loop ran without problems.

vi commented 2 years ago

Maybe it would also be reproducible on one system if you use network namespaces + veth + netem to simulate a nonideal network.

vi commented 2 years ago

Using this network quality:

$ tc qdisc add dev veth1 root netem delay 50ms 20ms loss 3% slot distribution pareto 10ms 100ms bytes 5000

I often get deadlocks, although sometimes it slowly gets through. Sometimes it succeeds multiple times in a row.
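For anyone wanting to repeat this on a single machine, a setup along the following lines could provide the veth1 interface that the netem command applies to. This is a sketch under assumptions: the namespace names ws-recv and ws-send, the 10.99.0.0/24 addresses, the run/DRY_RUN wrapper, and the choice of putting veth1 on the sender side are all hypothetical; only the netem parameters are taken from the comment above. It prints the commands by default; applying them with DRY_RUN=0 needs root.

```shell
#!/bin/sh
set -eu

# Print each command unless DRY_RUN=0 is set (applying requires root).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_lossy_link() {
    # Two namespaces stand in for the receiver and sender machines.
    run ip netns add ws-recv
    run ip netns add ws-send
    # A veth pair links them; one end goes into each namespace.
    run ip link add veth0 type veth peer name veth1
    run ip link set veth0 netns ws-recv
    run ip link set veth1 netns ws-send
    run ip netns exec ws-recv ip addr add 10.99.0.1/24 dev veth0
    run ip netns exec ws-send ip addr add 10.99.0.2/24 dev veth1
    run ip netns exec ws-recv ip link set veth0 up
    run ip netns exec ws-send ip link set veth1 up
    # netem parameters from the comment above; sender side is an assumption.
    run ip netns exec ws-send tc qdisc add dev veth1 root netem \
        delay 50ms 20ms loss 3% slot distribution pareto 10ms 100ms bytes 5000
}

setup_lossy_link
```

The receiver and sender commands from the report can then be started under ip netns exec ws-recv and ip netns exec ws-send respectively, with 10.99.0.1 in place of 10.2.138.210.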

vi commented 2 years ago

Started the same experiment again, but somehow it stopped being reproducible (the counter counted up to 112484 when I stopped it).

Is there a simpler test case showing that the occasional lock-up is specifically Websocat's problem?

hevmaad commented 2 years ago

Currently I can't find a simpler test case that shows the problem.

However, with my setup, I can infer that the problem is related to WebSocket protocol handling inside Websocat. If I change the transport from WebSocket to TCP:

$ ./websocat_amd64-linux-static -E --binary tcp-l:127.0.0.1:2345 tcp:10.2.138.210:3333
$ cat /tmp/f | (head -c 400000 >/dev/null; head -c 4000 datablock; head -c 400000 >/dev/null) \
| ./websocat_amd64-linux-static --oneshot -E --binary tcp-l:0.0.0.0:3333 asyncstdio: > /tmp/f

the problem doesn't arise.