elixir-mint / mint

Functional HTTP client for Elixir with support for HTTP/1 and HTTP/2 🌱
Apache License 2.0
1.36k stars 106 forks source link

Passive mode HTTP/2 error in Mint v1.5 #396

Closed bismark closed 1 year ago

bismark commented 1 year ago

We are still investigating the root cause so will add more details here when we can but wanted to open this ticket in case anyone else has hit the same issue.

We experienced an issue today after upgrading from 1.4.2 to 1.5.1 where passive mode HTTP2 connections will receive a %Mint.HTTPError{reason: {:server_closed_connection, :frame_size_error, ""}, module: Mint.HTTP2} upon the first Mint.HTTP.recv call.

After making two Mint.HTTP.recv(conn, 0, 5000) requests, we then started receiving expected responses.

Our current guess is it's related to the new :handshaking state introduced in 1.5.

bismark commented 1 year ago

Appears that streaming the body is significant for triggering the issue.

Here is a simple example of a code that worked under 1.4 but fails under 1.5:

Mix.install([
  {:mint, "~> 1.4.0"},
  #{:mint, "~> 1.5.0"},
  {:castore, "~> 0.1.0"}
])

host = <set-me>

{:ok, conn} = Mint.HTTP.connect(:https, host, 443, mode: :passive)
{:ok, conn, ref} = Mint.HTTP.request(conn, "POST", "/", [], :stream)

{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, "{}")
Mint.HTTP.recv(conn, 0, 5000)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, :eof)
Mint.HTTP.recv(conn, 0, 5000)

{:ok, _conn, [{:status, _, _} |_]} = Mint.HTTP.recv(conn, 0, 5000)
whatyouhide commented 1 year ago

@bismark I tried your example above with Mint 1.4.0 and using host = "google.com", and it also fails.

** (MatchError) no match of right hand side value: {:ok, %Mint.HTTP2{transport: Mint.Core.Transport.SSL, socket: {:sslsocket, {:gen_tcp, #Port<0.13>, :tls_connection, :undefined}, [#PID<0.521.0>, #PID<0.520.0>]}, mode: :passive, hostname: "google.com", port: 443, scheme: "https", state: :open, buffer: "", window_size: 1048574, encode_table: %HPAX.Table{max_table_size: 4096, entries: [], size: 0, length: 0}, decode_table: %HPAX.Table{max_table_size: 4096, entries: [], size: 0, length: 0}, ping_queue: {[], []}, client_settings_queue: {[[]], []}, next_stream_id: 5, streams: %{3 => %{id: 3, received_first_headers?: false, ref: #Reference<0.1519724292.3099066370.52979>, state: :half_closed_local, window_size: 1048574}}, open_client_stream_count: 1, open_server_stream_count: 0, ref_to_stream_id: %{#Reference<0.1519724292.3099066370.52979> => 3}, server_settings: %{enable_connect_protocol: false, enable_push: true, initial_window_size: 1048576, max_concurrent_streams: 100, max_frame_size: 16384, max_header_list_size: 65536}, client_settings: %{enable_push: true, max_concurrent_streams: 100, max_frame_size: 16384}, headers_being_processed: nil, proxy_headers: [], private: %{}}, []}

The example above is wrong in all versions though, because you cannot ignore the result of Mint.HTTP.recv/3. The conn that it returns must be kept around, since we're working with immutable data. See the README.

For example, in Mint v1.5.0 this works just fine:

host = "google.com"

{:ok, conn} = Mint.HTTP.connect(:https, host, 443, mode: :passive)
{:ok, conn, ref} = Mint.HTTP.request(conn, "POST", "/", [], :stream)

{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, "{}")
{:ok, conn, []} = Mint.HTTP.recv(conn, 0, 5000)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, :eof)
{:ok, conn, []} = Mint.HTTP.recv(conn, 0, 5000)

{:ok, _conn, [{:status, _, _} |_]} = Mint.HTTP.recv(conn, 0, 5000)

Maybe you can provide an example that reproduces the bug, with a real host as well in order to be working against the same? 🙃

whatyouhide commented 1 year ago

Closing this for inactivity and since it doesn't seem to be reproducible. Feel free to comment or reopen if you get an example that reproduces this!

bismark commented 1 year ago

Ok apologies for the slow response. So far the culprit has been AWS API Gateway endpoints. I spun up a dummy endpoint so you can test the differing behavior:

Mix.install([
  {:mint, "~> 1.4.0"},
  #{:mint, "~> 1.5.0"},
  {:castore, "~> 0.1.0"}
])

host = "p151u3rb44.execute-api.us-east-1.amazonaws.com"

{:ok, conn} = Mint.HTTP.connect(:https, host, 443, mode: :passive)
{:ok, conn, ref} = Mint.HTTP.request(conn, "GET", "/", [], :stream)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, "{}")
Mint.HTTP.recv(conn, 0, 5000)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, :eof)
Mint.HTTP.recv(conn, 0, 5000)
Mint.HTTP.recv(conn, 0, 5000) |> IO.inspect()

1.4 will successfully return a 404, 1.5 fails with {:protocol_error, "received invalid frame headers during handshake"}.

(I'm totally open to the issue here being that we are "using it wrong").

ericmj commented 1 year ago

You need to capture the conn value returned by Mint.HTTP.recv/3 and pass it to next call to Mint.HTTP functions. Like this:

{:ok, conn} = Mint.HTTP.connect(:https, host, 443, mode: :passive)
{:ok, conn, ref} = Mint.HTTP.request(conn, "GET", "/", [], :stream)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, "{}")
{:ok, conn, _responses} = Mint.HTTP.recv(conn, 0, 5000)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, :eof)
{:ok, conn, _responses} = Mint.HTTP.recv(conn, 0, 5000)
{:ok, conn, _responses} = Mint.HTTP.recv(conn, 0, 5000) |> IO.inspect()
bismark commented 1 year ago

Ah, silly me. I was so excited that I repro'd the issue that I didn't look carefully. I'll see if I can find a proper repro here, for now we can keep this closed.

howleysv commented 1 year ago

Having investigated further, we have identified a repro related to when we're streaming large amounts of data over the connection, here is a minimal repro which worked on mint 1.4 but now fails on 1.5.

Mix.install([
  #{:mint, "~> 1.4.0"},
  {:mint, "~> 1.5.0"},
  {:castore, "~> 0.1.0"}
])

host = "swl4ewp0s4.execute-api.us-east-1.amazonaws.com"
body = "foobar" |> Stream.duplicate(100000) |> Enum.join()
{:ok, conn} = Mint.HTTP.connect(:https, host, 443, mode: :passive)
{:ok, conn, ref} = Mint.HTTP.request(conn, "POST", "/", [], :stream)
conn = Enum.reduce(1..5, conn, fn _, conn ->
  size = min(Mint.HTTP2.get_window_size(conn, :connection), Mint.HTTP2.get_window_size(conn, {:request, ref}))
  {:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, binary_part(body, 0, size))
  {:ok, conn, _} = Mint.HTTP.recv(conn, 0, 5000)
  conn
end)
{:ok, conn} = Mint.HTTP.stream_request_body(conn, ref, :eof)
{:ok, conn, _responses} = Mint.HTTP.recv(conn, 0, 5000)

On mint 1.5, the size can be 0 and that seems to be what triggers the error. I'm not certain but I suspect the recv inside the loop is receiving the handshaking messages rather than an ack to refill the client window, resulting in the 0 window size on the next iteration. Then calling stream_request_body() with an empty body seems to trigger a server error.

I see 2 solutions here at least:

  1. mimic the old mint behaviour and loop recv until the conn leaves the handshaking state, before streaming any data
  2. in the "streaming" loop, loop recv any time the window size is 0

Beyond those though, It'd be useful to know what the recommended usage of mint is for streaming large data payloads using passive mode http2 connections.

ericmj commented 1 year ago

You need to capture the returned conn from the Enum.reduce function call.

howleysv commented 1 year ago

You need to capture the returned conn from the Enum.reduce function call.

ooo nice catch, I've updated the example. In this case the error was happening mid loop so it was still a valid repro although I'm surprised it all succeeded on the 1.4 mint