magic-wormhole / magic-wormhole-transit-relay

Transit Relay server for Magic-Wormhole
MIT License
169 stars 42 forks source link

Relay gets Unhandled Error on WebSocket Tranist #27

Open donpui opened 2 years ago

donpui commented 2 years ago

Transit relay gets unhandled error while using WebSocket when Receiver cancels transfer during transferring. Sender hangs and doesn't receive disconnect from relay server.

Steps to replicate with wormhole-william client:

  1. Sender starts sending file: wormhole-william send ../tests/4.2GB --no-listen --transit-helper "ws://localhost:4002"
  2. Receiver enters code: wormhole-william receive --no-listen --transit-helper "ws://localhost:4002"
  3. Receiver accepts file
  4. Receiver terminates program (^C)
  5. Sender hangs on sending process

Issue is only, when using WebSocket. Works fine with TCP relay (sender gets "broken pipe")

Error from Transit relay log: 2022-08-26T08:04:51+0000 [WebSocketTransitConnection,22,172.17.0.1] Unhandled Error Traceback (most recent call last): File "/usr/local/lib/python3.9/site-packages/twisted/python/log.py", line 96, in callWithLogger return callWithContext({"system": lp}, func, *args, **kw) File "/usr/local/lib/python3.9/site-packages/twisted/python/log.py", line 80, in callWithContext return context.call({ILogContext: newCtx}, func, *args, **kw) File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext return self.currentContext().callWithContext(ctx, func, *args, **kw) File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext return func(*args, **kw) --- <exception caught here> --- File "/usr/local/lib/python3.9/site-packages/twisted/internet/posixbase.py", line 683, in _doReadOrWrite why = selectable.doRead() File "/usr/local/lib/python3.9/site-packages/twisted/internet/tcp.py", line 248, in doRead return self._dataReceived(data) File "/usr/local/lib/python3.9/site-packages/twisted/internet/tcp.py", line 253, in _dataReceived rval = self.protocol.dataReceived(data) File "/usr/local/lib/python3.9/site-packages/autobahn/twisted/websocket.py", line 348, in dataReceived self._dataReceived(data) File "/usr/local/lib/python3.9/site-packages/autobahn/websocket/protocol.py", line 1243, in _dataReceived self.consumeData() File "/usr/local/lib/python3.9/site-packages/autobahn/websocket/protocol.py", line 1255, in consumeData while self.processData() and self.state != WebSocketProtocol.STATE_CLOSED: File "/usr/local/lib/python3.9/site-packages/autobahn/websocket/protocol.py", line 1619, in processData fr = self.onFrameEnd() File "/usr/local/lib/python3.9/site-packages/autobahn/websocket/protocol.py", line 1747, in onFrameEnd self._onMessageEnd() File "/usr/local/lib/python3.9/site-packages/autobahn/twisted/websocket.py", line 384, in _onMessageEnd self.onMessageEnd() File "/usr/local/lib/python3.9/site-packages/autobahn/websocket/protocol.py", line 647, in onMessageEnd self._onMessage(payload, self.message_is_binary) File "/usr/local/lib/python3.9/site-packages/autobahn/twisted/websocket.py", line 387, in _onMessage self.onMessage(payload, isBinary) File "/usr/local/lib/python3.9/site-packages/wormhole_transit_relay/transit_server.py", line 258, in onMessage self._state.got_bytes(payload) File "/usr/local/lib/python3.9/site-packages/automat/_methodical.py", line 244, in got_bytes value = output(oself, *a, **k) File "/usr/local/lib/python3.9/site-packages/automat/_methodical.py", line 283, in __call__ return self.method(oself, *args, **kwargs) File "/usr/local/lib/python3.9/site-packages/wormhole_transit_relay/server_state.py", line 289, in _send_to_partner self._buddy._client.send(data) File "/usr/local/lib/python3.9/site-packages/wormhole_transit_relay/transit_server.py", line 188, in send self.sendMessage(data, isBinary=True) File "/usr/local/lib/python3.9/site-packages/autobahn/websocket/protocol.py", line 2275, in sendMessage raise Disconnected("Attempt to send on a closed protocol") autobahn.exception.Disconnected: Attempt to send on a closed protocol

meejah commented 2 years ago

Attempting to repeat with a plain Autobahn python client where the receiver does a .loseConnection() on the first message it receives causes the sending side to fail to send its next message (with "attempt to send on closed socket"). I don't see the above traceback in the server.

This is using the ws_client.py Autobahn client in this repository (change the if False to get the above disconnect behavior).

I believe what's happening is that the Go code is ignoring the close message completely and still sending. The server should probably still handle and log the above traceback, but it seems from the Autobahn test that the transit relay is propagating the close correctly (same as the TCP case).

Further, the transit could probably nuke the TCP connection entirely if one side ignores the close (effectively masking the client-side bug).

The TCP and WebSocket cases in the transit use the very same state-machine code so it's hard to see how there can be different behavior in the server. In the WebSocket case, it will send a CLOSE message with code 1000; this is what I believe the Go implementation is ignoring.

meejah commented 2 years ago

@donpui can you confirm that Wormhole William / Go websocket changes handle the real underlying issue here?

The Transit can still improve the logging for the above case. It could also be more aggressive on clients like this one that don't correctly respond to WebSocket CLOSE messages.

donpui commented 2 years ago

Problem was on Wormhole William / Go side while using WebSocket. We have fixed Go part. Sender now gets error: failed to write msg: failed to write frame: WebSocket closed: failed to read frame header: EOF

However transit-relay still logs trace: 2022-09-06T21:16:49+0300 [WebSocketTransitConnection,4,127.0.0.1] Unhandled Error Traceback (most recent call last): ... autobahn.exception.Disconnected: Attempt to send on a closed protocol

Do we improve logging and this specific error handling on the Transit?