guidance for Mint.WebSocket integration with public Mint API

the-mikedavis commented 2 years ago

Hey y'all :wave:

We (@NFIBrokerage) have been using our mint_web_socket library based on #106 in production for a while now without any issues, and I think it might be stable enough to cut a v1.0.0 release.

Before I do that, though, I'd like to come up with a better solution for keeping HTTP/1 WebSockets open after the request concludes. HTTP/2 WebSockets nicely work out of the box in this regard because they stay open until the stream is closed, but with HTTP/1 when the GET request is finished and the {:done, request_ref} response is emitted, you can no longer Mint.HTTP.send_request_body/3 data on the connection.

In fact @ericmj pointed this out back in #106

For HTTP/2 this should work out of the box. For HTTP/1 we need to change the internals slightly so that we can receive the response while still being able to send the request body and that we don't close the request after receiving {:done, ref}.

The way I'm doing this currently is really hacky: I re-open the request by modifying the Mint.HTTP1 structure. (And actually I realize that new_request/4 function isn't properly attributed, even more reason to fix it! :grimacing:)

What I'm looking for in this issue is some guidance on a good way to accomplish this re-opening without modifying the opaque structures by hand and/or some guidance on how to refactor the Mint internals to support keeping upgraded requests open. I'd be willing to attempt a PR for it, but if y'all would like to make that change yourselves instead I certainly wouldn't mind. And of course any feedback on mint_web_socket would be very welcome :slightly_smiling_face:

ericmj commented 2 years ago

Thanks for opening the discussion on this. I have thought about it and I can't find an API that would work the same for HTTP/1 and HTTP/2.

So I am leaning towards that you have to use the new Mint.HTTP.protocol/1 function and have different implementations depending on the protocol. HTTP/2 would work as you expect while for HTTP/1 you would use Mint.HTTP.get_socket/1 and then use :gen_tcp or :ssl to send and receive the data.

What do you think?

EDIT: Looking at the mint_web_socket API, it would unfortunately mean that the end user has to do the dance described above. What do you think about adding high level API to your library that hides the mint internals so you can hide the dance? I definitely think you should keep the current API as well, the new API would be in addition to the existing one.

the-mikedavis commented 2 years ago

I could imagine accomplishing something like that with some new functions

Mint.WebSocket.stream_request_body/3
Mint.WebSocket.stream/2

which would be drop-in replacements for the Mint.HTTP functions of the same name for WebSocket connections, and would close over the :gen_tcp/:ssl dance.

So you could still open connections and upgrade them the same way:

{:ok, conn} = Mint.HTTP.connect(:https, "echo.websocket.org", 443)
{:ok, conn, ref} = Mint.WebSocket.upgrade(conn, "/", [])
message = receive(do: (message -> message))
{:ok, conn, [{:status, ^ref, status}, {:headers, ^ref, headers}, {:done, ^ref}]} =
  Mint.HTTP.stream(conn, message)
{:ok, conn, websocket} = Mint.WebSocket.new(conn, ref, status, headers)

But sending and receiving frames would use these new functions

# send
frame = {:text, "hello"}
{:ok, websocket, data} = Mint.WebSocket.encode(websocket, frame)
{:ok, conn} = Mint.WebSocket.stream_request_body(conn, ref, data)

# receive
message = receive(do: (message -> message))
{:ok, conn, [{:data, ^ref, data}]} = Mint.WebSocket.stream(conn, message)
{:ok, websocket, [{:text, "hello"}]} = Mint.WebSocket.decode(websocket, data)

For Mint.HTTP.protocol(conn) == :http2, the new functions would delegate to the Mint.HTTP2 functions. For :http1, the functions would read/write from the Mint.HTTP.get_socket(conn) directly with :gen_tcp or :ssl.

(Now that I think about it, is there a way to detect the scheme in the public API [so as to choose to use :get_tcp or :ssl]? It would be easy to re-use the Mint.HTTP1.transport field but I would think that belongs to the private API.)

And then any book-keeping like the request_ref of the WebSocket connection could be stored and retrieved with Mint.HTTP.put_private/3 and get_private/2.

Does that match what you were thinking of with a high-level API that hides Mint internals?

ericmj commented 2 years ago

Now that I think about it, is there a way to detect the scheme in the public API [so as to choose to use :get_tcp or :ssl]

Not that I know of (you could check the socket type, but it's opaque so it's private API), but if you control creating the connection you can see what scheme :http | :https the user passes.

Does that match what you were thinking of with a high-level API that hides Mint internals?

Yes 👍

the-mikedavis commented 2 years ago

Ah ok I think I'm getting my head wrapped around this now. Thanks for the help :)

What would you think about a new function in Mint.HTTP very similar to Mint.HTTP.protocol/1:

defmodule Mint.HTTP do
  @spec scheme(t()) :: :http | :https
end

?

With this I'm just trying to avoid wrapping Mint.HTTP.connect/4 in an equivalent Mint.WebSocket.connect/4.

And the reason I want to avoid wrapping Mint.HTTP.connect/4 is because I think it would muddy the workflow of HTTP/2 multiplexing. For HTTP/1 it seems fine (since the WebSocket takes over the whole connection) but with an HTTP/2 connection, we might be multiplexing a WebSocket connection and also regular HTTP requests (I believe the spec supports this but I haven't found a fully implemented HTTP/2 WebSocket server to test it against). I think it would be a bit confusing to open a connection with some Mint.WebSocket.connect/4 function and then to make HTTP requests on it:

{:ok, conn} = Mint.WebSocket.connect(:https, "echo.websocket.org", 443, protocols: [:http2])
# vs.
{:ok, conn} = Mint.HTTP.connect(:https, "echo.websocket.org", 443)

# upgrade one stream to WebSocket
{:ok, conn, websocket_ref} = Mint.WebSocket.upgrade(conn, "/", [])
message = receive(do: (message -> message))
{:ok, conn, [{:status, ^websocket_ref, status}, {:headers, ^websocket_ref, headers}} =
  Mint.WebSocket.stream(conn, message)
{:ok, conn, websocket} = Mint.WebSocket.new(conn, websocket_ref, status, headers)

# use another for an HTTP request
{:ok, conn, get_ref} = Mint.HTTP.request(conn, "GET", "/favicon.ico", [], nil)
message = receive(do: (message -> message))
{:ok, conn, [{:status, ^get_ref, 200}, {:headers, ^get_ref, _headers}, {:data, ^get_ref, data}, {:done, ^get_ref}]} =
  Mint.WebSocket.stream(conn, message)
  # ^ it's also a bit awkward to use Mint.WebSocket.stream/2 here, but we need to
  # because we don't know if `message` is HTTP or WebSocket

But I suppose there I'm talking less about public/opaque API and more about the awkwardness of mixing Mint.HTTP and Mint.WebSocket functions.

This is probably out of scope for this issue but I'm imagining what the API would look like if Mint.WebSocket were built in to Mint rather than a separate library, or if both Mint.WebSocket and Mint.HTTP were separate libraries from some mint_core dependency. Maybe there's a layer of abstraction that would make the Mint.HTTP vs Mint.WebSocket less awkward by having functions on the Mint module that operate on transports and could generally support application layer protocols?

defmodule Mint do
  @opaque t :: struct() # Mint.HTTP.t() | Mint.WebSocket.t() | other application layer protocols?
  # or maybe this data structure provides a sort of scratch-pad for the opaque structs of the protocols?

  @doc "opens a connection to a remote"
  @spec connect(Mint.Types.scheme(), Mint.Types.address(), :inet.port_number(), keyword()) ::
          {:ok, t()} | {:error, Mint.Types.error()}

  # This would be called by the end user when they receive a message from the transport
  @doc "decodes received messages"
  @spec stream(t(), term()) ::
          {:ok, t(), [Mint.Types.response()]}
          | {:error, t(), Mint.Types.error(), [Mint.Types.response()]}
          | :unknown

  # This would be called by `Mint.HTTP.stream_request_body/3` and the equivalent function in `Mint.WebSocket`.
  # And it needs a better name.
  @doc "encodes and sends messages"
  @spec stream_data(
          t(),
          Mint.Types.request_ref(),
          iodata() | :eof | {:eof, trailing_data :: term()}
        ) ::
          {:ok, t(), [Mint.Types.response()]}
          | {:error, t(), Mint.Types.error(), [Mint.Types.response()]}

  @doc "closes a connection"
  @spec close(t()) :: {:ok, t()}
end

Where stream/2 and stream_data/3 would call some callback on the protocol's module to do the encoding and decoding work, and I imagine this hypothetical mint_core would have the transport modules as helpers. With this you could build any application layer protocol on that common interface.

So with those Mint functions, the above example becomes...

something like: ```elixir {:ok, conn} = Mint.connect(:https, "echo.websocket.org", 443, protocols: [Mint.HTTP2]) # upgrade one stream to WebSocket {:ok, conn, websocket_ref} = Mint.WebSocket.upgrade(conn, "/", []) message = receive(do: (message -> message)) {:ok, conn, [{:status, ^websocket_ref, status}, {:headers, ^websocket_ref, headers}} = Mint.stream(conn, message) {:ok, conn, websocket} = Mint.WebSocket.new(conn, websocket_ref, status, headers) # use another for an HTTP request {:ok, conn, get_ref} = Mint.HTTP.request(conn, "GET", "/favicon.ico", [], nil) message = receive(do: (message -> message)) {:ok, conn, [{:status, ^get_ref, 200}, {:headers, ^get_ref, _headers}, {:data, ^get_ref, data}, {:done, ^get_ref}]} = Mint.stream(conn, message) ``` Not a huge difference but I think it reads a bit better.

On the other hand, HTTP/2 WebSockets are very rare and I don't think there's much demand for other application layer protocols, so it's probably not worth going to great lengths to make this less awkward.

What do you think?

ericmj commented 2 years ago

What would you think about a new function in Mint.HTTP very similar to Mint.HTTP.protocol/1

I am generally hesitant to adding functions that return data from the struct that you already passed into the API. Otherwise we should also add functions that return the host, port, connection options, and so on. To compare with the protocol/1 function it makes more sense to have because the user doesn't know the protocol since it was Mint that negotiated it. Instead I think the scheme should be passed into Mint.WebSocket.new by the user.

OTOH adding a scheme/1 would be very cheap for Mint while it gives websockets a better API, so the pragmatic choice would be to add the function.

So I am unsure what the best solution is :/.

the-mikedavis commented 2 years ago

Ah good idea, I hadn't thought about passing it in Mint.WebSocket.new. That might be a nice approach because, being WebSocket specific, it'd be a good place to specify the scheme as :ws | :wss which could be less confusing.

Though I think I'm generally a fan of "getter functions" for an opaque data type like scheme/1, port/1, etc. because without them a Mint user is forced to wrap the conn in an extra struct/map/tuple with those fields or store those fields with put_private/get_private if they want to access that data later, and that data isn't necessarily an implementation detail. That's not super cumbersome to do but there is a little friction there. If only you could mark a subset of fields within a struct as opaque!

I'm gonna close this out and fix up Mint.WebSocket with passing the scheme in new, using protocol/1 and put_private/1, etc.. If you want to continue the discussion or have extra thoughts, feel free to re-open or ping me. Thanks for your help! :slightly_smiling_face:

the-mikedavis commented 2 years ago

Ok last thing I swear!

What do you think about a Mint.HTTP.get_buffer/1 function? Some servers may pack in WebSocket frames in the same :tcp/:ssl message as the HTTP 101 response, so any WebSocket frames in that packet get stuck in conn.buffer.

ericmj commented 2 years ago

Another option is to dump the buffer in a {:data, ref, buffer} response if there is remaining data on the buffer after we parsed the full HTTP responses from the server. I don't know if this may cause other issues though.

the-mikedavis commented 2 years ago

I like how it currently works with the :none body (where it's basically ignored because there isn't supposed to be a body)

https://github.com/elixir-mint/mint/blob/bc8e14ccc64a9a49c2facdf12069cb98c3f31736/lib/mint/http1.ex#L638-L643

Where the body gets set as :none because of the 1xx status code

https://github.com/elixir-mint/mint/blob/bc8e14ccc64a9a49c2facdf12069cb98c3f31736/lib/mint/http1.ex#L874-L875

Maybe it makes sense to treat 101 as a special case and emit a {:data, ref, buffer} response after/before the {:done, ref}? Alternatively I could imagine some new response tuple used just for switching protocols that hands off all the data you'd need (the socket, protocol, remaining buffer, etc.)

ericmj commented 2 years ago

I like the idea of limiting the {:data, ref, buffer} response to 101 status codes.

So to summarize: we should return {:data, ref, buffer} with the contents of the buffer, before {:done, ref}, but only for 101 status codes. Let us know if you would like to work on this.

the-mikedavis commented 2 years ago

Sounds good! I'm on the case :+1:

elixir-mint / mint

guidance for Mint.WebSocket integration with public Mint API #333