mtrudel / bandit

Bandit is a pure Elixir HTTP server for Plug & WebSock applications
MIT License

High Memory usage for DelegatingHandler.init/1 #339

Open jaronoff97 opened 5 months ago

jaronoff97 commented 5 months ago

Hello! I'm using Bandit as my WebSocket adapter for a Phoenix project where I'm streaming data from a server to the client, and I'm noticing incredibly high memory usage. When I look at the dashboard's process tab I can see it's caused by DelegatingHandler.init/1...

Screenshot 2024-04-10 at 12 23 33 PM Screenshot 2024-04-10 at 12 22 48 PM

I would expect init to only be called once, so I'm surprised to see it present here

jaronoff97 commented 5 months ago

FWIW, I am streaming a HUGE amount of data from the server to the client, but I would expect the socket to transfer it and then drop it.

mtrudel commented 5 months ago

Interesting! Versions of Bandit and Thousand Island?

jaronoff97 commented 5 months ago :)

also, it's totally possible this is expected btw for a very high load socket but wanted a gut check because this seemed really high

mtrudel commented 5 months ago

It's not expected that Bandit's memory usage grows long-term based on the volume of data sent (either by HTTP or WebSocket). There may be places in the lower networking stack that are keeping some buffers around, but they should not be growing without bound. Part of me wonders if it isn't in Phoenix's channels code.

A couple things that would help:

jaronoff97 commented 5 months ago

I will attempt all this when I get a chance :) pretty slammed rn. FWIW I swapped from Cowboy to Bandit because the Bandit performance was noticeably better, but I did see similar issues there. Maybe it is something in Phoenix... are there traces I can emit from the system?

mtrudel commented 5 months ago

If you think you've seen similar issues in Cowboy that's the first thing to validate. Given the relative complexity of Phoenix compared to Bandit/Cowboy I think it's much more likely the issue is there.
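For anyone wanting to gather numbers without LiveDashboard, here is a minimal sketch that lists the heaviest processes from an attached IEx session. `MemProbe` is a hypothetical helper name, not part of Bandit or Phoenix; it only uses `Process.list/0` and `Process.info/2`. The `:binary` item is documented by OTP as subject to change, so treat the reference-counted binary total as a rough indicator. Note also that LiveDashboard labels unnamed processes by their initial call, so seeing `DelegatingHandler.init/1` most likely identifies the connection process rather than meaning `init/1` is being called repeatedly.

```elixir
defmodule MemProbe do
  # Hypothetical helper for ad hoc inspection; not part of Bandit or Phoenix.

  # Returns the `n` processes using the most memory, largest first.
  def top(n \\ 5) do
    Process.list()
    |> Enum.map(&{&1, Process.info(&1, [:memory, :message_queue_len, :binary])})
    # Process.info/2 returns nil for processes that exited in the meantime.
    |> Enum.reject(fn {_pid, info} -> is_nil(info) end)
    |> Enum.map(fn {pid, info} ->
      # Each :binary entry is {id, size_in_bytes, refcount} for an off-heap
      # binary referenced by the process.
      refc_bytes = info[:binary] |> Enum.map(&elem(&1, 1)) |> Enum.sum()

      %{
        pid: pid,
        memory: info[:memory],
        queue: info[:message_queue_len],
        refc_binary_bytes: refc_bytes
      }
    end)
    |> Enum.sort_by(& &1.memory, :desc)
    |> Enum.take(n)
  end
end
```

Calling `MemProbe.top(10)` on both a Cowboy and a Bandit deployment under similar load would give a like-for-like view of where the memory sits, and a large `refc_binary_bytes` relative to `memory` hints that the process is holding references to large binaries rather than heap data.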

aaronrenner commented 5 months ago

I also deployed bandit yesterday and saw memory usage triple compared to running cowboy a week ago (the dotted line)

Screenshot 2024-04-25 at 11 00 29 AM

I looked in LiveDashboard and saw the DelegatingHandler.init/1 processes with 60+ MB of memory. These processes used to use 20 MB of memory with Cowboy.

Screenshot 2024-04-25 at 12 36 39 PM

I can tell they're tied to websockets because they're either linked to absinthe or phoenix channel processes:

Screenshot 2024-04-25 at 12 37 26 PM Screenshot 2024-04-25 at 12 40 00 PM

I'm running the following versions:

I haven't disabled compression yet, but I'll try that next.

aaronrenner commented 5 months ago

I disabled websocket compression with the following configuration and it did cause a little dip in memory usage, but not much.

config :my_app_web, MyAppWeb.Endpoint,
  adapter: Bandit.PhoenixAdapter,
  http: [
    websocket_options: [
      compress: false
    ]
  ]

The dotted line is the cowboy memory usage from a week ago.

Screenshot 2024-04-25 at 6 23 49 PM

Other than the increased memory usage on web sockets, bandit has been performing well for us.

mtrudel commented 5 months ago

Sorry for the late response!

I think this may rhyme with #313 and #322. I've pushed up a branch that does a GC pass before switching the handler from HTTP/1 to WebSocket; would you be able to test

{:bandit, "~> 1.0", github: "mtrudel/bandit", branch: "gc_on_websocket"}

and see if that improves matters?
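To illustrate why a GC pass at that point can matter, here is a small sketch, independent of Bandit's actual code (the branch's implementation is not shown in this thread). It assumes only the standard `:erlang.garbage_collect/0` and `Process.info/2` calls; `GcSketch` is a hypothetical module name. A process that has finished some allocation-heavy work and then goes quiet can keep its grown heap until something triggers a collection, which is the situation a long-lived WebSocket process may be in right after the HTTP/1 upgrade.

```elixir
defmodule GcSketch do
  # Hypothetical demo module; this is not how Bandit implements the change.

  # Returns {bytes_before_gc, bytes_after_gc} for a short-lived worker process.
  def measure do
    parent = self()

    spawn(fn ->
      # Simulate request handling that leaves garbage on the process heap.
      _garbage = Enum.map(1..200_000, &{&1, &1})

      {:memory, before_gc} = Process.info(self(), :memory)
      # Force a collection of the calling process.
      :erlang.garbage_collect()
      {:memory, after_gc} = Process.info(self(), :memory)

      send(parent, {:gc_sketch, before_gc, after_gc})
    end)

    receive do
      {:gc_sketch, before_gc, after_gc} -> {before_gc, after_gc}
    after
      5_000 -> :timeout
    end
  end
end
```

Running `GcSketch.measure()` in IEx should show the second number noticeably smaller than the first, although the exact figures depend on the OTP version and heap settings.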

aaronrenner commented 5 months ago

Thanks @mtrudel. That helped significantly.

Screenshot 2024-05-01 at 11 51 06 AM

You can see where I deployed and how it settled at 3.2 GB of memory instead of 6.5 GB. The top line is the memory usage from the same time yesterday and the bottom line is memory usage from a month ago when we were still running Cowboy. I'm also still running with WebSocket compression off.

mtrudel commented 5 months ago

Good news, but it seems like we still have work to do! Now that I know this line is producing good results, let me try adding a few more things to this branch for experimentation. We should be able to get well below the Cowboy line, all else being equal.

mtrudel commented 5 months ago

Actually, could you switch back to cowboy for a spell just to make sure that all things are equal? It's not impossible that that growth in memory usage is due to something other than Bandit/Cowboy.

mtrudel commented 5 months ago

It'd also be useful to know what things look like with compression re-enabled

aaronrenner commented 5 months ago

Here's the difference in memory usage going from Bandit back to cowboy.

Screenshot 2024-05-01 at 1 10 43 PM

In our case it's still about a 1 GB difference.

mtrudel commented 5 months ago

This is all / predominantly websocket load?

aaronrenner commented 5 months ago

These servers are handling websocket + graphql requests. The majority of the load is graphQL requests, but when I look at LiveDashboard, all of the processes with high memory usage are websockets.

mtrudel commented 5 months ago

@aaronrenner try the

{:bandit, "~> 1.0", github: "mtrudel/bandit", branch: "gc_on_websocket"}

branch again? I've added pict filtering between requests

mtrudel commented 4 months ago

@jaronoff97 would you be able to take a look at phoenix 1.5.2 and see to what extent this resolves your issues? I've been working with @aaronrenner on Slack and think we've solved his issues (at least these ones), and I'd like to get this issue moved along for your original query

mtrudel commented 4 months ago

I have a lead on the remaining performance deficit that @aaronrenner identifies in his most recent post above; I can identify a similar spread in memory usage (an extra 20-25% over Cowboy) and reproduce it locally. Working it over on #345.

jaronoff97 commented 4 months ago

@mtrudel sure! I'll give that a try and run a load test and get back to you by tomorrow! Thank you :D

mtrudel commented 4 months ago

My hunch is that you'll see the same gain @aaronrenner did, but that there will still be a ~20-25% deficit vs Cowboy (that's what I'm working on now).

If you have any similarly comparative graphs for CPU & scheduler utilization that would also be appreciated.
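If graphs are not readily available, a rough point-in-time comparison can be taken from IEx. The sketch below assumes the `runtime_tools` application is available (it ships with OTP, but a release may need it listed in `extra_applications`); `VmSnapshot` is a hypothetical helper name. It uses `:scheduler.utilization/1`, which blocks for the given number of seconds while sampling, together with `:erlang.memory/1`.

```elixir
defmodule VmSnapshot do
  # Hypothetical helper for before/after comparisons; not part of Bandit.

  # Samples scheduler utilization for `seconds` and reads VM memory totals.
  def take(seconds \\ 1) do
    sched = :scheduler.utilization(seconds)
    # The :total entry is {:total, fraction_busy, percent_as_charlist}.
    {:total, busy, _percent} = List.keyfind(sched, :total, 0)

    %{
      scheduler_busy: busy,
      total_bytes: :erlang.memory(:total),
      process_bytes: :erlang.memory(:processes),
      binary_bytes: :erlang.memory(:binary)
    }
  end
end
```

Taking `VmSnapshot.take(10)` on a node running each server under comparable load gives a small set of numbers that can be pasted into the issue alongside any charts.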

ryanwinchester commented 4 months ago

> @jaronoff97 would you be able to take a look at phoenix 1.5.2

~~phoenix~~ bandit 1.5.2

mtrudel commented 4 months ago

@aaronrenner when you posted your most recent chart on May 1 that showed an improvement but still a ~1 GB difference, do you know if you had compression enabled or not in the Bandit & Cowboy cases?

aaronrenner commented 4 months ago

@mtrudel I just confirmed that compression was disabled for both cowboy (I believe this is the default) and bandit. The web server was changed via an environment variable in config/runtime.exs so here is the difference in the settings:

case System.get_env("WEB_SERVER", "cowboy") do
  "bandit" ->
    config :my_app, MyAppWeb.Endpoint,
      adapter: Bandit.PhoenixAdapter,
      http: [
        websocket_options: [
          compress: false
        ]
      ]

  "cowboy" ->
    config :my_app, MyAppWeb.Endpoint, adapter: Phoenix.Endpoint.Cowboy2Adapter
end