Closed rgaufman closed 1 year ago
I noticed strange behaviour: when starting Puma and good_job, both jump to 30% RAM use. Puma then drops quite quickly and settles at around 9%, while good_job stays at around 29% (Sidekiq previously used roughly the same RAM as Puma with the same workers). This is 10 minutes later:
(This is with the dashboard disabled and preserve_job_records: false)
Any ideas how to troubleshoot this and figure out why so much RAM is being used?
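One hedged first step (my suggestion, not from the thread): compare Ruby's own heap statistics with the OS-reported RSS. If `heap_live_slots` is modest while RSS is large, the memory is likely allocator fragmentation or malloc bloat rather than live Ruby objects.

```ruby
# Sample Ruby's internal heap statistics. All names here are from the
# Ruby standard library (GC.stat and the objspace extension).
require "objspace"

GC.start # settle the heap before sampling
stats = GC.stat
live_slots = stats[:heap_live_slots]      # live Ruby object slots
heap_pages = stats[:heap_allocated_pages] # pages Ruby has allocated
ruby_bytes = ObjectSpace.memsize_of_all   # bytes tracked by Ruby itself

puts "live slots: #{live_slots}, pages: #{heap_pages}, bytes: #{ruby_bytes}"
```

Running this inside a Rails console for both the Puma and good_job processes would show whether the difference is in the Ruby heap at all.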
@rgaufman thanks for sharing this! I haven't done much memory profiling of GoodJob, though I'm surprised that GoodJob would perform significantly differently than Puma. Take the following as speculation (because I haven't looked)...
GoodJob's initialization process is pretty simple: `poll_interval` wakes up a thread and makes an ActiveRecord query every N seconds.

Because I haven't done any memory profiling, I can't give you anything specific about how to dig into the problem. I think adding some stackprof points to the initialization and looking at allocated objects would be a good place to start. It sounds like the problem isn't object retention (i.e. memory usage increasing over time), just that more memory is used than expected in the first place.
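The poll-interval pattern described above can be sketched roughly like this (my illustration, not GoodJob's actual implementation): a background thread wakes every `poll_interval` seconds and runs a task, which in GoodJob would be an ActiveRecord query looking for ready jobs.

```ruby
# A minimal polling loop: run the task, sleep, repeat until stopped.
class Poller
  def initialize(poll_interval:, &task)
    @poll_interval = poll_interval
    @task = task
    @running = true
  end

  # Spawn the polling thread; each iteration runs the task, then sleeps.
  def start
    @thread = Thread.new do
      while @running
        @task.call
        sleep @poll_interval
      end
    end
    self
  end

  # Ask the loop to finish; join may wait up to one poll interval.
  def stop
    @running = false
    @thread.join
  end
end

# Usage: count how many times the task fires in roughly 0.1 seconds.
ticks = 0
poller = Poller.new(poll_interval: 0.02) { ticks += 1 }.start
sleep 0.1
poller.stop
```

A loop like this allocates very little per iteration on its own, which is why the baseline memory (everything loaded at boot) is the more likely place to look.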
BTW, here's memory usage for my reference application.
Since upgrading our app from GoodJob 2.15.1 to 3.0.0, our Heroku worker memory has been spiking.
Worker memory, past 2 hours:
Worker memory, past 24 hours:
Worker memory, past 72 hours:
The web dyno memory usage hasn't changed. Other dependencies were also upgraded (Zeitwerk, rack, ...), but the suspect is GoodJob, as it's the worker memory that has increased, and an inspection of the latest memory spikes shows a close overlap with the execution times of the latest jobs in the GoodJob dashboard:
The above table shows that jobs are infrequent. They either send some emails or make some API calls.
Here is the GoodJob initializer, for reference (in this release we also changed the syntax from `GoodJob.X = Y` to `Rails.application.configure do ... end`):
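For readers following along, the two configuration styles look roughly like this. The option names follow GoodJob's README; the values are placeholders, not the reporter's actual settings:

```ruby
# Old style, setting attributes on the GoodJob module directly:
# GoodJob.preserve_job_records = false

# New style, in config/initializers/good_job.rb:
Rails.application.configure do
  config.good_job.execution_mode = :external
  config.good_job.max_threads = 5
  config.good_job.poll_interval = 30 # seconds
  config.good_job.preserve_job_records = false
end
```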
Any idea if/how GoodJob could cause this?
@sedubois yikes! That's not good.
I can't think of anything off the top of my head that would have blown out memory.
Looking through the Changelog, the major changes are:
Let me do another sweep through the code and see what I can find.
@sedubois I found a possible culprit in #652 which has been released in v3.0.1. Please give that a try and let me know if that fixes your problems.
@bensheldon yesterday I downgraded GoodJob from 3.0.0 to 2.15.1 and the memory issue had disappeared. Today I upgraded to 3.0.1 and there have also not been any memory spikes since. So it seems solved, thanks! 👍
@sedubois fantastic! 🎉
Hi there,
I have my good_job set up like this:
Here is a screenshot of it in action:
There are rarely more than 3-4 jobs running in parallel.
Compared to Sidekiq running the same jobs, it seems to take almost double the RAM, and looks like this in top:

Any ideas what could be responsible for this? Maybe I should disable the dashboard? How would I troubleshoot and reduce RAM usage?
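One hedged way to start troubleshooting (my suggestion): sample the process's resident memory (RSS) over time. A flat-but-high line points at boot-time allocation; a climbing line points at retention. This assumes a Unix-like system where `ps` is available.

```ruby
# Read this process's RSS in kilobytes by shelling out to `ps`.
def rss_kb(pid = Process.pid)
  `ps -o rss= -p #{pid}`.strip.to_i
end

# Take a few samples over time; in a real worker you would log these
# periodically and compare the trend across hours, not fractions of a second.
samples = 3.times.map do
  kb = rss_kb
  sleep 0.1
  kb
end

puts samples.inspect
```

Logging a sample like this from a recurring job would show whether good_job's usage grows or simply starts high.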