vbmithr / ocaml-websocket

Websocket library for OCaml
ISC License
162 stars 44 forks source link

Corrupted frame content + workaround #58

Open mjambon opened 8 years ago

mjambon commented 8 years ago

The content field of a Frame gets corrupted with random data after sending the frame. The workaround is to discard the content string after use or to make a copy before sending. I haven't looked into the problem close enough to determine the source of the problem but here's how to reproduce it:

(*
   File ws_content.ml

   Build with:

     ocamlfind opt -o ws_content \
       ws_content.ml \
       -package websocket.lwt -linkpkg

   Same, in safe-string mode. Requires String.copy or equivalent instead
   of Bytes.copy:

     ocamlfind opt -o ws_content \
       ws_content.ml \
       -package websocket.lwt -linkpkg -safe-string

   Run with:

     ./ws_content
*)

open Lwt

let connect () =
  let uri = Uri.of_string "https://echo.websocket.org" in
  Resolver_lwt.resolve_uri ~uri Resolver_lwt_unix.system >>= fun endp ->
  let ctx = Conduit_lwt_unix.default_ctx in
  Conduit_lwt_unix.endp_to_client ~ctx endp >>= fun client ->
  Websocket_lwt.with_connection ~ctx client uri

let push send content =
  send (Websocket_lwt.Frame.create ~content ())

let push_workaround send content =
  send (Websocket_lwt.Frame.create ~content:(Bytes.copy content) ())

let main () =
  connect () >>= fun (recv, send) ->
  let content = "hello" in
  Printf.printf "before: %S\n%!" content; (* prints "hello" *)
  push send content >>= fun () ->
  Printf.printf "after: %S\n%!" content; (* prints gibberish *)
  exit 0

let () = Lwt_main.run (main ())

The output is something like:

$ ./ws_content 
before: "hello"
after: "\148\157\194\024\147"

while it should be:

$ ./ws_content 
before: "hello"
after: "hello"

The workaround works well enough for our purposes, but it would be good to do something to save time to others. Some suggestions, from easier to better:

OCaml version: 4.02.3 on amd64

vbmithr commented 8 years ago

Ha I think this is because I mask the frame in-place before sending it to the client, as the websocket protocol requires. The right thing to do is indeed probably to make a copy of the content and not mask-in-place to avoid the issue you encountered. So option 2 :)

copy commented 7 years ago

I think for performance reasons the current behaviour should be kept or be enabled with a parameter. That's also how high-performance WebSocket libraries like uws do it.

zoggy commented 7 years ago

Since I upgraded to 2.7 today, I have a (maybe) related problem when accessing the content of received frames. I have no simple repro case but here is what I do:

  let handler conn_client =
        let req = Websocket_lwt.Connected_client.http_request conn_client in
        let recv () = Websocket_lwt.Connected_client.recv conn_client in
        let send = Websocket_lwt.Connected_client.send conn_client in
        let stream = Websocket_lwt.mk_frame_stream recv in
        <handle stream of frames>
  in
  let (waiter, stopper) = Lwt.wait () in
      let thread = Websocket_lwt.establish_standard_server
        ~stop: waiter ~ctx ~mode:server handler
      in
      Lwt.return { stopper ; thread }

When handling the stream of frames, I ignore the frames with opcode <> Websocket_lwt.Frame.Opcode.Text but the content of Text frames seems to be binary data.

bcc32 commented 6 years ago

In that case, perhaps there should be two interfaces, one that uses bytes and is free to trash the buffer when it's done, and one that uses string, so that this unexpected behavior is clearer?