anmonteiro / httpun-ws

Infinite loop on Websocket.Wsd.close with close code #61

Closed aantron closed 1 week ago

aantron commented 9 months ago

I haven't fully diagnosed this yet, and I'm happy to continue to do so, but I'd like to open it in case you can eyeball the issue.

Dream eventually calls Websocketaf.Wsd.close ~code:(`Other code) socket in response to the client closing the WebSocket. As I understand it, it's normal for the server to send a close code as part of the WebSocket close handshake.

This seems to trigger an infinite loop in WebSocket. Omitting ~code:(`Other code) causes the server to close the WebSocket successfully without getting stuck in an infinite loop.
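
For context, here is a minimal sketch of what "sending a close code" amounts to on the wire, assuming standard RFC 6455 framing: the close frame's payload starts with the status code as a 2-byte big-endian unsigned integer, optionally followed by a UTF-8 reason. The helper name `close_payload` is hypothetical, and this is not the library's implementation.

```ocaml
(* Hypothetical helper, not part of websocketaf/httpun-ws: builds the payload
   of a WebSocket close frame as described in RFC 6455, section 5.5.1. *)
let close_payload ?(reason = "") code =
  if code < 0 || code > 0xFFFF then
    invalid_arg "close_payload: code out of range";
  (* Control frame payloads are limited to 125 bytes; 2 of them hold the code. *)
  if String.length reason > 123 then
    invalid_arg "close_payload: reason too long";
  let buf = Bytes.create (2 + String.length reason) in
  (* Status code first, in network byte order. *)
  Bytes.set_uint16_be buf 0 code;
  (* Optional reason text follows the code. *)
  Bytes.blit_string reason 0 buf 2 (String.length reason);
  Bytes.to_string buf

let () =
  (* 1000 is the "normal closure" status code. *)
  let payload = close_payload ~reason:"bye" 1000 in
  assert (String.length payload = 5);
  assert (Char.code payload.[0] = 0x03 && Char.code payload.[1] = 0xE8);
  assert (String.sub payload 2 3 = "bye")
```

Omitting the code corresponds to a close frame with an empty payload, which matches the case above that closes without getting stuck.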

This wasn't the behavior as of 2021. Looking at the blame for this code

https://github.com/anmonteiro/websocketaf/blob/28abb768916f606287a2eb05a5d6a4aa11ec31b6/lib/wsd.ml#L110-L125

...specifically, the Some code case, I wonder if you can spot whether any of the commits touching it could be triggering this infinite loop.

If not, I can proceed with direct debugging, or git bisect (which requires me to rebase the renaming commits or unvendor websocket/af into a Dune workspace).

aantron commented 9 months ago

See aantron/dream#230 and aantron/dream#222.

anmonteiro commented 1 month ago

I'd love to fix this if it's still an issue. Do you have a minimal repro I can try?

copy commented 1 week ago

@anmonteiro I can reproduce this (or a very similar issue) using wscat.exe on a server that immediately closes connections (ssh reverse port forwarding with no local end, on Linux):

dune build examples/eio/wscat.exe
rsync -az _build/default/examples/eio/wscat.exe some-other-host:
ssh -t -R 12345:localhost:12345 some-other-host ./wscat.exe localhost -p 12345
# ssh outputs (as expected): connect_to localhost port 12345: failed.
# wscat.exe uses 100% CPU on the server

Happens both on current master and 0.1.

anmonteiro commented 1 week ago

@copy thanks for the instructions, I’m a bit busy and haven’t tried the repro yet, but can you think of a way to repro that would have a faster feedback loop?

Perhaps writing a small TCP server that always closes accepted connections?

copy commented 1 week ago

Sure:

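(* Minimal TCP server: accept one connection on port 12345, then close it immediately. *)
let () =
  Eio_linux.run (fun env ->
      Eio.Switch.run (fun sw ->
          let socket =
            Eio.Net.listen ~backlog:1024 ~sw env#net (`Tcp (Eio.Net.Ipaddr.V4.any, 12345))
          in
          (* The peer address is not needed, so it is bound to an ignored name. *)
          let stream, _from = Eio.Net.accept ~sw socket in
          Eio.traceln "got connection";
          Eio.Flow.close stream
        )
    )

anmonteiro commented 1 week ago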

@copy thanks for your repro. This ended up being the result of a dumb mistake, which #73 fixes.