HTTP/2 dont try reconnect when domain does not exists

thenrio commented 3 months ago

When default pool uses HTTP/2, a GET to an non existing domain produces repeated warning.

iex(1)> {:ok, pid} = Finch.start_link(name: :test, pools: %{default: [protocols: [:http2]]})
{:ok, #PID<0.4392.0>}
iex(2)> Finch.build(:get, "https://enoe.nt") |> Finch.request(:test)

17:43:57.477 [warning] Failed to connect to https://enoe.nt:443: non-existing domain
{:error, %Finch.Error{reason: :disconnected}}

17:43:58.171 [warning] Failed to connect to https://enoe.nt:443: non-existing domain

17:43:58.235 [warning] Failed to connect to https://enoe.nt:443: non-existing domain

Note that when :http1, we don't have such retries.

iex(1)> {:ok, pid} = Finch.start_link(name: :test, pools: %{default: [protocols: [:http1]]})
{:ok, #PID<0.258.0>}
iex(2)> Finch.build(:get, "https://enoe.nt") |> Finch.request(:test)
{:error, %Mint.TransportError{reason: :nxdomain}}

I think we should not retry on domain error in HTTP/2. Cheers.

Also see https://github.com/wojtekmach/req/issues/363#issuecomment-2161408803.

sneako commented 3 months ago

Thanks for reporting!

I guess this is due to the difference in our H1 & H2 pools, where H1 opens connections on demand and tries to keep them open as long as possible, but H2 connections are more persistent and we try to open them all right away and keep them open.

I don't think we should change that behaviour, so I'm not sure what the best solution is. nxdomain can be a transient error, for example if DNS hasn't propagated yet. But I can imagine for use cases where arbitrary domains are being requested, you probably don't want these errors logged (and you probably also don't want the pool to stick around).

Maybe we should consider adding a new callback to the Pool behaviour:

@callback shutdown(url :: String.t() | URI.t()) :: :ok | {:error, reason()}

and then expose that on the top level Finch module, perhaps as Finch.shutdown_pool/1.

We could also return %Finch.Error{reason: :nxdomain} instead of the current :disconnected error, and then you could use that to decide whether or not you want to shut down the pool?

wojtekmach commented 3 months ago

For completeness, :nxdomain also shows up on IPv4/IPv6 mismatch. One example that comes to mind is talking to internal Fly service (ipv6-only) from ipv4-by-default connection.

sneako / finch

HTTP/2 dont try reconnect when domain does not exists #276