zombocom / puma_worker_killer

Automatically restart Puma cluster workers based on max RAM available
748 stars 77 forks source link

Seems not working on Puma 2.11.3 #15

Closed larryzhao closed 4 years ago

larryzhao commented 9 years ago

I am running rails 4.2.1 on puma 2.11.3 on ruby 2.0.0-p643. And I am using cluster mode, I have 8 workers each server.

But memory still increases and eventually eats up all my memory. ram and percent_usage in configuration both mean total ram on my server right?

I added some logs and found that getting workers from Puma::Cluster is empty. https://github.com/schneems/puma_worker_killer/blob/master/lib/puma_worker_killer/puma_memory.rb#L67

But I think I've configured it to run in cluster right?

in puma.rb

workers 8

and

$ ps -ef | grep puma
deployer   923  5685  2 17:14 ?        00:00:10 puma: cluster worker 0: 5685 [maleskine]
deployer   963  5685  2 17:14 ?        00:00:10 puma: cluster worker 1: 5685 [maleskine]
deployer  1064 29862  0 17:21 pts/2    00:00:00 grep --color=auto puma
deployer  5685     1  0 11:01 ?        00:00:02 puma 2.11.3 (unix:///tmp/maleskine.sock) [maleskine]

Is there anything I've done wrong?

schneems commented 9 years ago

Was this working before?

PWK outputs structured logs, so you can see how much memory it thinks your system is consuming, you can also see when it kills a worker. It's possibly (read likely) that the RSS values being read are lower than the actual memory being consumed on your whole system. Your logs should give your more info.

If you're not getting any logging output, and your debugging is showing that puma doesn't detect workers, it might be possible that the thread that is spawned when PWK is invoked was terminated. I put my code in a Rails initializer.

larryzhao commented 9 years ago

Thanks @schneems for the response.

  1. I didn't see any log output by PWK, in which log file shall I find it? Rails log, Puma stdout or a separated log?
  2. Yes, as I debugged into, Puma Cluster reports no workers. I also put my PWK init code in a Rails initializer, how could I dig further for this part?

Thank you very much.

schneems commented 9 years ago

PWK uses the same log as your master puma instance: https://github.com/schneems/puma_worker_killer/blob/3442c1237b44885acb30da70b0ee4df9868f9a13/lib/puma_worker_killer/reaper.rb#L16

So whatever output gets the:

12:46:28 web.1  | [5807] Puma starting in cluster mode...
12:46:28 web.1  | [5807] * Version 2.11.1 (ruby 2.2.2-p95), codename: Intrepid Squirrel
12:46:28 web.1  | [5807] * Min threads: 8, max threads: 32
12:46:28 web.1  | [5807] * Environment: development
12:46:28 web.1  | [5807] * Process workers: 4
12:46:28 web.1  | [5807] * Preloading application
12:46:28 web.1  | [5807] * Listening on tcp://0.0.0.0:9292

Should also get the PWK logs.

To debug I would run this locally, use bundle open puma_worker_killer and add puts statements to see what code gets called and what doesn't. Make sure that you are booting workers after your rails app boots using preload_app! https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#preload-app

larryzhao commented 9 years ago

Hi, @schneems I am testing it locally and on my staging sever.

I get no log output from PWK, it's strange. When I try it locally, I am not getting it in the term nor in puma stdout. Strange.

Yes I added Rails.logger.warn statements to check that. But I was not using preload_app! becasue preload_app! prevents phase restart so I tend to not use preload_app! to benefit from phase restart

Is it because not using preload_app! ?

schneems commented 9 years ago

Is it because not using preload_app! ?

I don't know. I've never used it without preload_app!. There may be a way to use PWK without that flag, but i've never done it.

larryzhao commented 9 years ago

Hi, @schneems ,

I've created a fresh rails application with nothing else, and test pwk. Still, for @cluster.running? I get false. From what I logged, there's only one Puma::Cluster but anytime @master.instance_variable_get("@workers") returns an empty array.

Would you help me to see if anything I've done wrong? The application is here: https://github.com/larryzhao/test-pwk . I start puma with config/puma.rb and I've tested it under MRI 2.2.1 and 2.0.0. The problem exists under both ruby installations.

Thank you very much.

schneems commented 9 years ago

I think the issue you're seeing is that you're not preloading the app, so the first time @master.instance_variable_get("@workers") is run there won't be any workers because they haven't been booted yet. (i'm guessing)

I think the fix would be to write the code so that if workers ever returns an empty array we try to re-establish a reference to @master and then get the workers, but again i'm not sure. I'll be back from paternity leave after July 15th can you ping me again then and I can look into it?

larryzhao commented 9 years ago

Sure, Thanks a lot. I will ping you then, and congratulations to the new born! :smile:

I just try adding preload_app!, it is working.

And one more thing, by try to re-establish a reference to @master, do you mean: ObjectSpace.each_object(Puma::Cluster).map { |obj| obj }.first if defined?(Puma::Cluster)?

larryzhao commented 9 years ago

Just found it might because of daemonize. If I ran it undaemonize I will get PWK printing logs and everything, but if I start it with -d. Everything's gone.

larryzhao commented 9 years ago

@schneems Hi, Richard, Sorry for the bothering. It has passed the 15th and maybe you have time to look at this. I found that it might relate to running puma in daemon mode, if I don't run puma in daemon mode, everything is fine.

schneems commented 9 years ago

I've got your example app downloaded: https://github.com/larryzhao/test-pwk

What is the exact command you're using to run it to see the problem?

larryzhao commented 9 years ago

@schneems I am redirecting stdout to log/puma_stdout.log, you could see in config/puma.rb.

And if I run bundle exec puma -C config/puma.rb -e development under development and without daemonizing, I could see from the log (I bundle open to print out more) that PWK is running with the interval specified and getting the puma cluster and workers correctly.

If I run bundle exec puma -C config/puma.rb -e development -d, nothing's printing out as if PWK is not running at all.

Thanks a lot for the help.

How I change the gem to print more:

in reaper.rb:

    def reap
      puts "[#{Time.now}] Here I reap............."
      puts "Cluster running?: #{@cluster.running?}"
      return false unless @cluster.running?
      if (total = get_total_memory) > @max_ram
        @cluster.master.log "PumaWorkerKiller: Out of memory. #{@cluster.workers.count} workers consuming total: #{total} mb out of max: #{@max_ram} mb. Sending TERM to #{@cluster.largest_worker.inspect} consuming #{@cluster.largest_worker_memory} mb."
        @cluster.term_largest_worker
      else
        @cluster.master.log "PumaWorkerKiller: Consuming #{total} mb with master and #{@cluster.workers.count} workers"
      end
    end

and in puma_memory.rb:

    def running?
      puts "master: #{@master}"
      puts "workers: #{workers.any?}"
      @master && workers.any?
    end
sudara commented 7 years ago

For folks stumbling on this open issue in 2017 and beyond:

https://github.com/schneems/puma_worker_killer/commit/e2c05a0603e6bc433a3802d8d58427878f8c8fe0 fixed/clarified the ability to run puma_worker_killer when not using preload_app! in puma (aka, when doing phased restarts). See https://github.com/schneems/puma_worker_killer/issues/23#issuecomment-227249377