Closed travisp closed 4 years ago
@schneems Since Puma is currently the recommended web server on Heroku, is it reliable to use this gem now, or is it better to wait until this issue is solved?
You can use this now, but it's at your own risk. Unicorn Worker Killer has the exact same bug and people have been using it on Heroku for years. I'll remove the warning from the readme when this gets resolved in a sane way. Feel free to experiment until then.
Thanks @schneems, I will try it out. I'm a bit confused about puma_worker_killer vs https://github.com/schneems/puma_auto_tune. We shouldn't use both at the same time, should we? Which one do you recommend?
Puma Auto Tune does everything that PWK does, plus some. It is also guaranteed to make your app swap memory if you use it on Heroku, so don't use Puma Auto Tune on Heroku right now. PWK is fine, but understand that it's not perfect: if you set your app to 512mb of RAM, it will start killing workers at about ~350mb of RAM. PWK outputs how much RAM it thinks your system is using to the logs in a Librato-compatible format; you can manually compare that against actual RAM usage from Heroku's log-runtime-metrics and adjust.
> if you set your app to 512mb of RAM, it will start killing workers at about ~350mb of RAM.
That's what I'm seeing here; it keeps killing a worker and restarting it. Does it make sense to set percent_usage greater than 100% here, because of PWK's incorrect memory reporting?
That's one angle. Maybe shoot for 120%. Don't try to get too close: if you go over, then PWK won't kill workers when you need it to. Also realize that PWK is a band-aid for larger memory problems; it doesn't solve them, it just covers them up.
Pretty unstable. Right now Heroku reports memory exceeding 1 GB, but PWK reports only about ~600MB, and it didn't kill the workers either.
```ruby
PumaWorkerKiller.config do |config|
  config.ram = 512 # mb
  config.frequency = 5 # seconds
  config.percent_usage = 1.20
end
PumaWorkerKiller.start
```
```
2015-02-05 03:18:59.625773+00:00 heroku web.1 - - Process running mem=1767M(345.2%)
2015-02-05 03:18:59.625820+00:00 heroku web.1 - - Error R14 (Memory quota exceeded) Critical
2015-02-05 03:18:59.625257+00:00 heroku web.1 - - source=web.1 dyno=heroku.21274089.e2b6196c-6736-47c5-bcfd-8cd6393289ae sample#load_avg_1m=0.00 sample#load_avg_5m=0.02 sample#load_avg_15m=0.04
2015-02-05 03:18:59.625357+00:00 heroku web.1 - - source=web.1 dyno=heroku.21274089.e2b6196c-6736-47c5-bcfd-8cd6393289ae sample#memory_total=1767.64MB sample#memory_rss=501.53MB sample#memory_cache=0.00MB sample#memory_swap=1266.11MB sample#memory_pgpgin=1217595pages sample#memory_pgpgout=1089204pages
2015-02-05 03:19:02.516087+00:00 app web.1 - - [3] PumaWorkerKiller: Consuming 594.34765625 mb with master and 2 workers
```
Make sure you're using version 0.0.3 or master.
> Consuming 594.34765625 mb with master and 2 workers

This should have triggered a kill cycle.
```ruby
if (total = get_total_memory) > @max_ram
  @cluster.master.log "PumaWorkerKiller: Out of memory. #{@cluster.workers.count} workers consuming total: #{total} mb out of max: #{@max_ram} mb. Sending TERM to #{@cluster.largest_worker.inspect} consuming #{@cluster.largest_worker_memory} mb."
  @cluster.term_largest_worker
else
  @cluster.master.log "PumaWorkerKiller: Consuming #{total} mb with master and #{@cluster.workers.count} workers"
end
```
where `@max_ram = ram * percent_usage`, which should be 614 mb.
Yep, I was using 0.0.3. You are right, 120% is about 614 mb, but Heroku's memory is already at 1767M while PWK still reports 594.34765625 mb; that's why it hasn't killed the workers yet.
Check your version of get_process_mem; it should be 0.2.0.
Yes, it is.
```console
$ bundle show puma
.gems/gems/puma-2.11.0
$ bundle show puma_worker_killer
.gems/gems/puma_worker_killer-0.0.3
$ bundle show get_process_mem
.gems/gems/get_process_mem-0.2.0
```
Weird. This is how we get the memory usage:
```ruby
def get_total(workers = set_workers)
  master_memory = GetProcessMem.new(Process.pid).mb
  worker_memory = workers.map { |_, mem| mem }.inject(&:+) || 0
  worker_memory + master_memory
end
```
My best bet is that you have something else running, maybe a separate binary or program, that is using up memory in a different process that PWK can't see. If you're shelling out a bunch using backticks or Process.spawn, PWK won't see it. Again, this is just yet another reason why it's "use at your own risk"-ware for now. Thanks for giving it a shot. Unfortunately the introspection tools on containers are just so limited.
In my code I don't start any subprocesses, but I'm not sure about the third-party gems I'm using; some of them are pubnub, newrelic, and sidekiq. As far as I can tell from the response time, Puma is faster. I haven't done any benchmarks myself yet.
Another thing: when I was using Unicorn, the memory didn't keep growing this much. I'm not sure if that's because of https://github.com/puma/puma/issues/342
I think Puma is a good choice given that Heroku recommends it, and it's definitely the direction to go. Thanks, and I hope you guys find a way around this memory issue soon :smile:
Whatever value I set for config.ram, the gem takes it as 512 and restricts usage to 335 mb of RAM only. I checked the value in the Rails console, PumaWorkerKiller.ram => 4096, yet the cutoff still happens at 512. I have done the default configuration, which works; it is just not picking up the new config in config/puma.rb or config/initializers/puma_worker_killer.rb.
@chetan-wwindia are you on Heroku? Make sure that ram is set before the worker killer is "started". If this reproduces locally, can you give me an example app that shows the problem?
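A minimal sketch of that ordering, assuming PWK's documented config/start API: the config block must run before start is called, or start will use defaults.

```ruby
# Set all values inside config first...
PumaWorkerKiller.config do |config|
  config.ram = 4096 # mb
end
# ...and only then start the killer, which reads the configured values.
PumaWorkerKiller.start
```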
@schneems I'm using this on an AWS Ubuntu server.
Can you give me the code you're using to set the values? Does it work locally?
On the server it's nginx + Puma with 4 Puma workers. It keeps piling on RAM up to 6 GB; I'm also dealing with a memory leak, and this is my temporary solution until I find a fix for the leak.

It doesn't work locally either:

```ruby
PumaWorkerKiller.config do |config|
  config.ram = 4096 # mb
  config.frequency = 10 # seconds
  config.percent_usage = 0.80
  config.rolling_restart_frequency = 3 * 3600 # 3 hours in seconds
end
PumaWorkerKiller.start
```
Any update to report here? I'm curious whether PWK can report correct memory consumption on Heroku dynos yet, or if there has been some mild success using the memory definitions?
Check the readme. Will not work on Heroku until LXC exposes memory use inside of the container. So likely never. Use rolling restarts or performance dynos.
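For reference, a rolling-restart-only setup might look like the sketch below, based on PWK's `enable_rolling_restart` entry point (check the readme for the current API and default frequency):

```ruby
# Skip memory-based killing entirely; just restart workers on a schedule.
# The 12-hour interval here is an example value, not a recommendation.
PumaWorkerKiller.enable_rolling_restart(12 * 3600) # every 12 hours, in seconds
```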
Thanks, yeah I went with rolling restarts. So then it sounds like this issue should be re-closed.
This isn't a problem on Performance Dynos, then? How about Shield Dynos?
perf, private, and shield dynos are all run on their own VPC so numbers should be correct.
I was having issues with frequent cycling on a Heroku project and noticed numbers like this in the logs (shortly after I restarted all of the web dynos):
```
2014-08-15T14:21:17.222622+00:00 app[web.3]: [2] PumaWorkerKiller: Consuming 429.48046875 mb with master and 2 workers
2014-08-15T14:21:18.443592+00:00 heroku[web.3]: source=web.3 dyno=heroku.15698018.ca54adb5-3f63-4006-ae9f-c0f235c53288 sample#load_avg_1m=0.00 sample#load_avg_5m=0.00
2014-08-15T14:21:18.443937+00:00 heroku[web.3]: source=web.3 dyno=heroku.15698018.ca54adb5-3f63-4006-ae9f-c0f235c53288 sample#memory_total=313.75MB sample#memory_rss=313.75MB sample#memory_cache=0.00MB sa
```
Basically, it seems to be vastly overestimating the amount of memory actually used. This is using the latest code from master, including get_process_mem 0.2.0