mtrudel / bandit

Bandit is a pure Elixir HTTP server for Plug & WebSock applications
MIT License
1.64k stars 78 forks source link

Advice for dealing with `(Bandit.HTTPError) closed` not covered by `log_protocol_errors` #378

Closed martosaur closed 1 month ago

martosaur commented 1 month ago

Hi πŸ‘‹ and thank you for the great library!

We recently switch to Bandit and face a bunch of new errors. That wasn't unexpected as numerous issues in this repo warned us that Bandit is a lot more explicit with errors than Cowboy, which is great!

After looking through https://github.com/mtrudel/bandit/issues/366 we went ahead and set log_protocol_errors: false in settings. However, one of the more frequent errors doesn't seem to be covered by the setting. It happens in dev every time automatic code reload is triggered, here's a stacktrace:

** (Bandit.HTTPError) closed
      (bandit 1.5.5) lib/bandit/http1/socket.ex:418: Bandit.HTTPTransport.Bandit.HTTP1.Socket.request_error!/2
      (bandit 1.5.5) lib/bandit/http1/socket.ex:355: Bandit.HTTPTransport.Bandit.HTTP1.Socket.send_data/3
      (bandit 1.5.5) lib/bandit/adapter.ex:242: Bandit.Adapter.send_data/3
      (bandit 1.5.5) lib/bandit/adapter.ex:140: Bandit.Adapter.send_resp/4
      (plug 1.16.0) lib/plug/conn.ex:444: Plug.Conn.send_resp/1
      (myapp 1.0.0-dev) lib/plug/health.ex:8: Myapp.Plug.Health.call/2
      (myapp 1.0.0-dev) lib/myapp_web/endpoint.ex:1: Myapp.Web.Endpoint.plug_builder_call/2
      (myapp 1.0.0-dev) lib/myapp_web/endpoint.ex:1: Myapp.Web.Endpoint."call (overridable 3)"/2
      (myapp 1.0.0-dev) deps/plug/lib/plug/debugger.ex:136: Myapp.Web.Endpoint."call (overridable 4)"/2

Here's a healthcheck code for clarity, it's very simple:

  def call(%Plug.Conn{request_path: "/healthcheck"} = conn, _opts) do
    conn
    |> send_resp(200, "OK\n")
    |> halt()
  end

What do you think is the best course of action here?

Bandit 1.5.5, Elixir 1.16.3, OTP 26

mtrudel commented 1 month ago

This shouldn't be happening.

Is that the full extent of the stacktrace? The relevant logic to filter out these errors is deeper down in the stack, within the Bandit.Pipeline logic. If you have a more complete stack trace it would be appreciated!

Could you also share what your config looks like (at least, the parts related to log_protocol_errors?

mjm commented 1 month ago

Here's a more full stacktrace:

21:44:05.160 [phx]  ** (Bandit.HTTPError) closed
    (bandit 1.5.5) lib/bandit/http1/socket.ex:418: Bandit.HTTPTransport.Bandit.HTTP1.Socket.request_error!/2
    (bandit 1.5.5) lib/bandit/http1/socket.ex:355: Bandit.HTTPTransport.Bandit.HTTP1.Socket.send_data/3
    (bandit 1.5.5) lib/bandit/adapter.ex:242: Bandit.Adapter.send_data/3
    (bandit 1.5.5) lib/bandit/adapter.ex:140: Bandit.Adapter.send_resp/4
    (plug 1.16.0) lib/plug/conn.ex:444: Plug.Conn.send_resp/1
    (myapp 1.0.0-dev) lib/plug/health.ex:8: Myapp.Plug.Health.call/2
    (myapp 1.0.0-dev) lib/myapp_web/endpoint.ex:1: Myapp.Web.Endpoint.plug_builder_call/2
    (myapp 1.0.0-dev) lib/myapp_web/endpoint.ex:1: Myapp.Web.Endpoint."call (overridable 3)"/2
    (myapp 1.0.0-dev) deps/plug/lib/plug/debugger.ex:136: Myapp.Web.Endpoint."call (overridable 4)"/2
    (myapp 1.0.0-dev) lib/myapp_web/endpoint.ex:1: Myapp.Web.Endpoint.call/2
    (phoenix 1.7.12) lib/phoenix/endpoint/sync_code_reload_plug.ex:22: Phoenix.Endpoint.SyncCodeReloadPlug.do_call/4
    (bandit 1.5.5) lib/bandit/pipeline.ex:124: Bandit.Pipeline.call_plug!/2
    (bandit 1.5.5) lib/bandit/pipeline.ex:36: Bandit.Pipeline.run/4
    (bandit 1.5.5) lib/bandit/http1/handler.ex:12: Bandit.HTTP1.Handler.handle_data/3
    (bandit 1.5.5) lib/bandit/delegating_handler.ex:18: Bandit.DelegatingHandler.handle_data/3
    (bandit 1.5.5) /opt/myapp/deps/thousand_island/lib/thousand_island/handler.ex:411: Bandit.DelegatingHandler.handle_continue/2
    (stdlib 5.2.3) gen_server.erl:1085: :gen_server.try_handle_continue/3
    (stdlib 5.2.3) gen_server.erl:995: :gen_server.loop/7
    (stdlib 5.2.3) proc_lib.erl:241: :proc_lib.init_p_do_apply/3

and the relevant part of the config for the endpoint:

[
  adapter: Bandit.PhoenixAdapter,
  http: [port: 4000, http_options: [log_protocol_errors: false]],
  ...
]

Let us know if you need more info from us.

mtrudel commented 1 month ago

I wonder if they're not coming from Phoenix? That code path within Bandit is the golden path for error suppression, and is under test (on a bare Plug, without Phoenix being involved). Let me try and repro this with Phoenix in the loop.

martosaur commented 1 month ago

I created a blank phoenix project and tried to reproduce it. Good news: I did and here's a repo https://github.com/martosaur/bandit_repro Bad news: simple interval requests with an http client (I use Insomnia) wasn't enough, so I had to throw haproxy in there to get it closer to our actual setup. I hope this helps!

P.S. I realized I should probably describe what's in the repo. Here's an overview:

  1. It's a blank Phoenix project created with mix phx.new bandit_repro --no-html --no-assets --no-ecto --no-dashboard --no-gettext --no-live --no-mailer
  2. There is a module-based plug for health checks BanditRepro.Plug.Health
  3. The app runs in a docker container according to docker-compose.yml definition. We mount local code as volume to enable code reload, but in general nothing too fancy here.
  4. Alongside the app, there is also a haproxy container, which hits app:4000/healthz every second. The haproxy config is mounted from config/dev/haproxy.cfg
mtrudel commented 1 month ago

Great! I'll get to work on a fix (or at least, confirming that this is an upstream Phoenix issue)

andyleclair commented 1 month ago

Hi! Chiming in here to say that we are having a similar error. I'm not sure if it's related, I'm happy to open a new issue if you'd prefer. I haven't tried adding http_options: [log_protocol_errors: false] yet but I will on Monday when I'm back at work

Screenshot from our Sentry: image

Munksgaard commented 1 month ago

I'm also experiencing this issue. Here is my sentry stacktrace:

screenshot

mtrudel commented 1 month ago

In terms of the sentry items, you'll need to be using a version of the sentry library >= 10.6.2 (one that includes this commit). You'll also need to be running a Bandit >= 1.5.5 to ensure that Bandit is tagging its logs with the relevant domain.

Please verify that those conditions ARE met and you're still seeing the above errors in Sentry. If so, that's a bug.

@martosaur you didn't happen to have Sentry in your logging stack, did you? How were you sourcing the logs you'd sent originally?

andyleclair commented 1 month ago

We are running Bandit 1.5.5 and Sentry 10.6.2 and we're still seeing it

martosaur commented 1 month ago

@mtrudel we do use Sentry (currently on version 10.6.1), but we see the error in dev specifically during hot reload, so the Sentry reporting if disabled. Also, the repo I linked doesn't use Sentry, so there might be multiple issues at play here?

mtrudel commented 1 month ago

there might be multiple issues at play here?

I think there are indeed two issues here. I've verified that Phoenix's debug_errors flag (which is set in dev by default) will do its own logging of any errors raised within a Plug call (including, for example, Bandit.HTTPErrors as the original issue notes). This is totally distinct from Bandit's log_protocol_errors setting, and is actually accomplished behind the scenes by the Plug.Debugger module here. While there may be something to filing this against Plug / Phoenix, there's nothing to do on the part of Bandit; we're working as expected.

In terms of sentry, this would also appear to be an upstream issue. Bandit 1.5.5 already sets the relevant domain on its logging, and the relevant changes were made upstream in Sentry here. Unsure what to add to this here.

andyleclair commented 1 month ago

FWIW we ended up just ignoring these errors in our code in a before_send

martosaur commented 1 month ago

@mtrudel thanks for looking into this! So as I understood this is legit error which is correctly reported by Plug.Debugger. It's just that under certain circumstances, such as frequently run healthcheck requests it becomes noticeable and annoying.

One workaround I found is to replace conn |> send_resp(...) |> halt() with conn |> resp(...) |> halt(). In this case the response will be sent by Bandit and not wrapped in Plug.Debugger call, which will effectively silences the error.