benoitc / hackney

simple HTTP client in Erlang
Other
1.34k stars 427 forks source link

Connection mixed up when using proxy #566

Closed thang911pk closed 4 years ago

thang911pk commented 5 years ago

I experienced similar issue as https://github.com/benoitc/hackney/issues/545 I use the latest version of HTTPoison and Hackney. I had 2 end points. When I use Squid proxy to send HTTPS requests, all requests go to one endpoint. Hackney reused the same connection for both end points.

## Without proxy 
HTTPoison.post!(
  "https://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool]
  ]
)

HTTPoison.post!(
  "https://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool]
  ]
)

HTTPoison.post!(
  "https://2ffcbfd7.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool]
  ]
)

HTTPoison.post!(
  "https://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool]
  ]
)

#Socket connections
:sys.get_state(:hackney_pool.find_pool(:test_pool)) 

 {{[
      [
        {'b61fad2e.ngrok.io', 443, :hackney_ssl},
        {:sslsocket, {:gen_tcp, #Port<0.18470>, :tls_connection, :undefined},
         #PID<0.503.0>}
      ]
    ], [], [], [], [], [], [], [], [], [], [], [], [], [], [],
    [
      [
        {'2ffcbfd7.ngrok.io', 443, :hackney_ssl},
        {:sslsocket, {:gen_tcp, #Port<0.19373>, :tls_connection, :undefined},
         #PID<0.523.0>}
      ]
    ]}}},
 {:dict, 2, 16, 16, 8, 80, 48,
  {[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []},
  {{[], [], [], [], [], [],
    [
      [
        {:sslsocket, {:gen_tcp, #Port<0.19373>, :tls_connection, :undefined},
         #PID<0.523.0>} |
        {{'2ffcbfd7.ngrok.io', 443, :hackney_ssl},
         #Reference<0.931103709.3369074689.153627>}
      ]
    ],
    [
      [
        {:sslsocket, {:gen_tcp, #Port<0.18470>, :tls_connection, :undefined},
         #PID<0.503.0>} |
        {{'b61fad2e.ngrok.io', 443, :hackney_ssl},
         #Reference<0.931103709.3369074692.150581>}
      ]
    ], [], [], [], [], [], [], [], []}}}, 0}

## With proxy HTTPS 

HTTPoison.post!(
  "https://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

HTTPoison.post!(
  "https://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

HTTPoison.post!(
  "https://2ffcbfd7.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

HTTPoison.post!(
  "https://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

#Socket connections
:sys.get_state(:hackney_pool.find_pool(:test_pool)) 

  {{[], [], [], [], [], [], [], [], [], [], [],
    [
      [
        {'127.0.0.1', 3128, :hackney_http_connect},
        {:hackney_ssl,
         {:sslsocket, {:gen_tcp, #Port<0.9267>, :tls_connection, :undefined},
          #PID<0.383.0>}}
      ]
    ], [], [], [], []}}},
 {:dict, 1, 16, 16, 8, 80, 48,
  {[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []},
  {{[], [], [], [], [], [], [], [], [], [],
    [
      [
        {:hackney_ssl,
         {:sslsocket, {:gen_tcp, #Port<0.9267>, :tls_connection, :undefined},
          #PID<0.383.0>}} |
        {{'127.0.0.1', 3128, :hackney_http_connect},
         #Reference<0.1120821313.1221853188.123350>}
      ]
    ], [], [], [], [], []}}}, 0}

## With proxy HTTP 

HTTPoison.post!(
  "http://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

HTTPoison.post!(
  "http://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

HTTPoison.post!(
  "http://2ffcbfd7.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

HTTPoison.post!(
  "http://b61fad2e.ngrok.io",
  "hello",
  [],
  [
    hackney: [pool: :test_pool],
    proxy: "http://127.0.0.1:3128"
  ]
)

Socket connections
:sys.get_state(:hackney_pool.find_pool(:test_pool)) 

  {{[], [], [], [], [], [], [], [], [], [], [], [], [], [], [],
    [[{'127.0.0.1', 3128, :hackney_tcp}, #Port<0.8843>]]}}},
 {:dict, 1, 16, 16, 8, 80, 48,
  {[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []},
  {{[], [], [], [], [], [], [], [], [], [], [], [], [],
    [
      [
        #Port<0.8843> |
        {{'127.0.0.1', 3128, :hackney_tcp},
         #Reference<0.3140736740.1760821249.15187>}
      ]
    ], [], []}}}, 0}
bzmw commented 5 years ago

@benoitc have you had any time to take a look at this issue?

QuinnWilton commented 4 years ago

I'm running into this issue in production too, also using a Squid proxy. Let me know if there's any additional info I can provide that would be helpful!

benoitc commented 4 years ago

hrm i was expecting the proxy would close the connection each time (which normally it should). What let you think the same connection is reused apart the state of the pool? Do you always get the same site ?

QuinnWilton commented 4 years ago

I noticed it because a request to a Heroku app returned a Microsoft Azure 404 page, and a separate request to an Azure app returned a response from an app I have running on Gigalixir. As far as I can tell, all of the responses in question came back around the same time, and ended up associated with the wrong request.

I haven't had a chance to get a minimal reproduction to confirm that it's this exact issue, but the problem went away after removing the proxy.

bzmw commented 4 years ago

I noticed it because a request to a Heroku app returned a Microsoft Azure 404 page, and a separate request to an Azure app returned a response from an app I have running on Gigalixir. As far as I can tell, all of the responses in question came back around the same time, and ended up associated with the wrong request.

This is exactly the behaviour that we saw. We got around it by disabling connection pooling in hackney entirely. Which sadly caused a huge jump in CPU usage, but was something we had to accommodate.

benoitc commented 4 years ago

The issue is that the connection of the proxy is not closed at the end of the session which should probably be, a trace should confirm it. Can one of you enable tracing in hackney and send me back a session. You can do it privately by mail if you need.

I may have a solution by monday.

QuinnWilton commented 4 years ago

I've been trying to get a trace for you, but I'm having trouble reproducing it reliably in an environment where I have tracing enabled. I'll let you know if I get it working though. It's something I've seen in our logs, but not something I've managed to reproduce manually.

MichaelViveros commented 4 years ago

I reproduced it locally by sending requests to 2 different ngrok endpoints (which tunneled to 2 local servers) and using a squid proxy container. The first local server returns url 1 while the second returns url 2 and the code breaks when it receives an unexpected response from a server which indicates the request was sent to the wrong endpoint (Ex. server 1 returning url 2).

proxy_request_1 = {"https://466c67ef.ngrok.io/health", "url 1", [proxy: "http://127.0.0.1:3128"]}
proxy_request_2 = {"https://4c2caca3.ngrok.io/health", "url 2", [proxy: "http://127.0.0.1:3128"]}
requests = [proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2, proxy_request_1, proxy_request_2]
for {url, expected_response, opts} = req <- requests do
  IO.puts(inspect(req))
  case HTTPoison.post!(url, "", [], opts) do
    %HTTPoison.Response{body: ^expected_response} -> IO.puts("correct endpoint")
    %HTTPoison.Response{body: body} -> 
      IO.puts("wrong endpoint - url #{url}, expected_response #{expected_response}, actual_response #{body}")
      raise "wrong endpoint"
  end
end

Sample output:

{"https://4c2caca3.ngrok.io/health", "url 2", [proxy: "http://127.0.0.1:3128"]}
correct endpoint
{"https://466c67ef.ngrok.io/health", "url 1", [proxy: "http://127.0.0.1:3128"]}
correct endpoint
{"https://4c2caca3.ngrok.io/health", "url 2", [proxy: "http://127.0.0.1:3128"]}
correct endpoint
{"https://466c67ef.ngrok.io/health", "url 1", [proxy: "http://127.0.0.1:3128"]}
wrong endpoint - url https://466c67ef.ngrok.io/health, expected_response url 1, actual_response url 2
** (RuntimeError) wrong endpoint

Note that the bug was transient - sometimes I'd have to run the script multiple times for it to occur.

I have to step out for awhile but I can look into getting more tracing later today.

MichaelViveros commented 4 years ago

You could even use a site like https://httpstat.us/ which returns different status codes depending on the route instead of using ngrok + local servers.

MichaelViveros commented 4 years ago

Actually that probably wouldn’t work since the requests would be for the same host

MichaelViveros commented 4 years ago

Here's a trace. trace.txt

benoitc commented 4 years ago

@MichaelViveros thanks for the trace! So it's clear there what is happening. Since the response is not closing the connection (no close header), hackney will keep that connection to the proxy host open and reuse it. A quick fix is probably to probably send a Close header while sending the request. On the long term it may be better to have a separate proxy pool. I will let you know when the first one is done.

MichaelViveros commented 4 years ago

Awesome! And for my own knowledge, why does the connection with the proxy have to be closed? I thought connection pooling meant you kept the connection open and reused it. Perhaps the connection to the proxy is actually a connection to the proxy + destination server which you wouldn’t want to reuse?

On Sun, Nov 3, 2019 at 8:26 AM Benoit Chesneau notifications@github.com wrote:

@MichaelViveros https://github.com/MichaelViveros thanks for the trace! So it's clear there what is happening. Since the response is not closing the connection (no close header), hackney will keep that connection to the proxy host open and reuse it. A quick fix is probably to probably send a Close header while sending the request. On the long term it may be better to have a separate proxy pool. I will let you know when the first one is done.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/benoitc/hackney/issues/566?email_source=notifications&email_token=ACDUVC2GNXL773PP5ACI6QTQR3GSFA5CNFSM4HNOGSIKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEC5SRJQ#issuecomment-549136550, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACDUVC3OZMT3OGQEE33DUHDQR3GSFANCNFSM4HNOGSIA .

benoitc commented 4 years ago

Closing a connection ease the management of a sesion. It used (o be the default on proxes lique squid. I think a better solution is explicitley naming a connection as tunneled. Like http/proxyhost/proxyport/http/targethost/targetport if it follows the multiaddr format.

MichaelViveros commented 4 years ago

Any update @benoitc?

MichaelViveros commented 4 years ago

Any update @benoitc?

benoitc commented 4 years ago

@MichaelViveros a branch will land over the week-end...

MichaelViveros commented 4 years ago

Awesome, merci!

liamwhite commented 4 years ago

Are there any workarounds that we could use in the meantime, before a new version is cut with the fix? (Perhaps on the proxy side?)

benoitc commented 4 years ago

Are there any workarounds that we could use in the meantime, before a new version is cut with the fix? (Perhaps on the proxy side?)

you can set an Header "Connection: close" to your requests. That should fix the issue temporarely. I expect to have that new version ready this we anyway.

benoitc commented 4 years ago

@MichaelViveros so the fix will land by friday, i have been busy in refactoring some parts to allow to isolate the proxy.

liamwhite commented 4 years ago

:crossed_fingers:

benoitc commented 4 years ago

So I toyed with some solutions during the we, and it seems a quick fix is not possible. I will come with a plan later today with a possible fix during the week.

right now when you read the body with no redirection, the code should actually work until you make sure to always use the same proxy.

liamwhite commented 4 years ago

Any progress on this?

benoitc commented 4 years ago

@liamwhite yes. a batch of updates will be published soon.

benoitc commented 4 years ago

fixed in master finally...