collectiveidea / delayed_job

Database based asynchronous priority queue system -- Extracted from Shopify
http://groups.google.com/group/delayed_job
MIT License

After some time, the delayed_job process takes up all the memory of the development machine after deployment. #823

Closed: vitulicny closed this issue 3 months ago

vitulicny commented 9 years ago

Deployment via Capistrano.

using: delayed_job (4.0.6), capistrano3-delayed-job (1.4.0), delayed_job_active_record (4.0.3)

server: Ubuntu 12.04.5 LTS (GNU/Linux 3.2.0-24-virtual x86_64)

Any idea what we could check or how to debug this issue?

Everything is fine on staging and production with the production Rails env.

albus522 commented 9 years ago

Probably something getting lost in code reloading. My guess would be that your web server does the same but you restart it more often.

glaszig commented 9 years ago

same here. two systems: staging and production. memory on staging is being filled up by delayed_job over time.

[memory graphs: production vs. staging]

i'm at a loss as to where to look. the only significant differences are the ubuntu release and the ram size, and i'd be damned if either of those is the reason rbenv compiles ruby in a way that introduces memory leaks.

suggestions anyone?

glaszig commented 9 years ago

update: i tamed delayed_job. enabling rails' class cache prevents the leakage as in #776.

# config/environments/staging.rb
config.cache_classes = true

grexican commented 8 years ago

for what it's worth, config.cache_classes was my issue, too. Setting that to true solved my problems.

dgobaud commented 8 years ago

I'm still seeing this problem, and I have cache_classes set to true.

ruby '2.2.2', rails (4.2.5), delayed_job (4.0.6), delayed_job_active_record (4.0.3)

It's just a steady march up...

[memory usage graphs showing steady growth]

albus522 commented 8 years ago

There are many kinds of memory Ruby does not clean up well. If you have any jobs that involve a lot of data, the Ruby process will naturally grow, and that is not due to DJ. You would see the same thing if the jobs were run inline in the server. There is no easy answer, but you can search around for ways to identify what types of Ruby objects are building up over time.

dgobaud commented 8 years ago

I don't think my jobs consume a lot of data (or use a lot of memory, which I guess is what you mean?), but I have a lot of jobs... I don't think I had this problem until the number of jobs became high.

I have at least 2,400 jobs that run every 15 minutes.

albus522 commented 8 years ago

That in and of itself will generate a lot of objects. How Ruby handles the cleanup is out of our hands, and that graph is very common to many, if not most, Ruby apps. The most common "memory leak" is references to objects that survive the normal work loop, whether that is a single job, a web request, or something else. There are ways to get Ruby to tell you what types of objects are stacking up; search for object count reports.
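
For example, a minimal sketch of such an object-count report using plain ObjectSpace from the Ruby standard library (nothing delayed_job-specific; run it in a console or after a batch of jobs):

GC.start  # collect whatever can be collected first, so only retained objects are counted

# counts by internal object type (T_STRING, T_ARRAY, ...)
puts ObjectSpace.count_objects.inspect

# live instances per Ruby class, largest first (a rough leak indicator)
counts = Hash.new(0)
ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
puts counts.sort_by { |_klass, n| -n }.first(10).inspect

Comparing two snapshots taken a few hundred jobs apart usually shows which classes are accumulating.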

krtschmr commented 8 years ago

i have the same issues, but with around 160,000 jobs per hour (!). each job does an http request to scrape data. how can i avoid this?

i haven't found any other ideas. which one is best?

glaszig commented 8 years ago

kill sidekiq

this is about delayed_job. or so i thought.

1c7 commented 7 years ago

Same here. My delayed_job also takes up all the memory in about a day and brings the server down. I'm not even able to SSH into the system; I have to manually restart the server on Azure.

ENV

Server: Ubuntu 14.04
Ruby 2.3.1
gem 'delayed_job', "~> 4.1.1"
gem 'delayed_job_active_record', "~> 4.1.0"

krtschmr commented 7 years ago

my problem was that i actually started 500k+ jobs per day. if each job leaves just 10 bytes of uncleared memory, well, that adds up.

gregblass commented 7 years ago

@dgobaud What are you using there to monitor your memory usage?

gregblass commented 7 years ago

I think I may be experiencing this too. I've got two workers running delayed jobs on my production server, and after about a day of low/moderate usage I run out of memory (1GB EC2 instance). Then my Capistrano deploys fail on asset precompile.

krtschmr commented 7 years ago

@gregblass nothing you can do about it. i actually restart all my processes via a cron job every 6 hours to prevent bloating
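
For reference, a sketch of what such a scheduled restart could look like (the path and log file are made up, and it assumes the daemonized bin/delayed_job script that rails generate delayed_job creates):

# crontab entry: restart the delayed_job daemon every 6 hours
0 */6 * * * cd /var/www/myapp/current && RAILS_ENV=production bin/delayed_job restart >> log/dj_restart.log 2>&1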

gregblass commented 7 years ago

So I don't have a lot of processes running. Maybe 20-30 a day I'd think. But in theory isn't this not OK? How can processes be spawned and then memory just lost into the void?

I added a 1GB swap file and it fixed my Capistrano issues.

But in the long run, if delayed job is going to slowly eat away my server's memory, I will avoid it and use something that doesn't. Does Sidekiq have this issue?

gregblass commented 7 years ago

Oh, never mind, you're talking about 500K+ jobs per day! That's crazy. Congrats to you on whatever you're working on. There's no way I'm experiencing the same types of effects as you. I think it may be the two workers I'm spawning and the fact that I have only 1GB of real memory allocated, plus a ton of JS/CSS to precompile?

krtschmr commented 7 years ago

observing 1 million instagram accounts requires 1 million jobs fetching the data.

no big congrats on that. actually i am using sidekiq and that problem occurred there as well

djdarkbeat commented 5 years ago

Taking a moment to share how I resolved this:

Set config/environments/development.rb to use this:

# config/environments/development.rb
config.cache_classes = ENV['CACHE_CLASSES'] || false

Then we ran Delayed Job with:

CACHE_CLASSES=true bundle exec rake jobs:work

Clearly, if you are actively developing a job you want to run in the background, you will need to just omit the ENV variable; then it will use code reloading, albeit with the memory creep. This solution is helpful if you just need to run it for a long time in dev, and it keeps your laptop from blowing up if you leave it all running in tmux or something.
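
A caveat on the snippet above: ENV values are strings, so ENV['CACHE_CLASSES'] || false is still truthy even when you pass CACHE_CLASSES=false. If that matters, a stricter variant of the same toggle compares the value explicitly:

# config/environments/development.rb (stricter variant of the same CACHE_CLASSES toggle)
config.cache_classes = ENV['CACHE_CLASSES'] == 'true'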

ghiculescu commented 3 years ago

cache_classes = false solves the issue but isn't very satisfying. We dug into it a bit more and ended up with this solution: https://github.com/collectiveidea/delayed_job/pull/1115#issuecomment-874284946

Now reloading only runs when files are changed - the problem before was that your whole app was reloaded every 5 seconds.

AhMohsen46 commented 3 years ago

@ghiculescu I'm sorry, I got confused: per the whole discussion above, should cache_classes be set to true or false?

Second, regarding the plugin you created: should it only be loading the files needed by the worker, instead of loading the whole env?

Should one limit cache_classes=false to the worker, since the plugin will handle reloading the needed files, and set cache_classes=true everywhere else?

thanks

ghiculescu commented 3 years ago

ooops, I meant cache_classes = true solves the issue but isn't very satisfying. It's not satisfying because it makes the memory leak issue go away, but your code won't reload while in development mode, so if you change a file you'd need to restart your job worker.

https://github.com/collectiveidea/delayed_job/pull/1115#issuecomment-874284946 reloads the entire app, but only does so when:

1) a job is about to be run, and 2) you've changed files in your autoload paths (typically your app dir) since the last time you ran a job

In other words it behaves exactly the same as your Rails server.

You should always set cache_classes = false in development so you get code reloading. In production, you want cache_classes = true; with that, the plugin will do nothing (which is fine: you don't want code reloading in production).
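
For readers who don't want to click through, the core idea of that PR comment looks roughly like this. This is a sketch only, not the exact code from #1115, and it assumes Rails 5+ (Rails.application.reloader) plus delayed_job's plugin/lifecycle API:

# config/initializers/delayed_job_reloader.rb (sketch)
class DelayedJobReloaderPlugin < Delayed::Plugin
  callbacks do |lifecycle|
    lifecycle.around(:perform) do |worker, job, *args, &block|
      # Wrapping each job in the Rails reloader means app code is only
      # reloaded when files in the autoload paths have actually changed.
      Rails.application.reloader.wrap do
        block.call(worker, job, *args)
      end
    end
  end
end

Delayed::Worker.plugins << DelayedJobReloaderPlugin

In production (cache_classes = true) the wrap is effectively a no-op, which matches the behaviour described above.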

bubbaspaarx commented 3 years ago

I am having a related issue but need some help. I have two laptops running the same code; on one, even if I run rake jobs:clear, the ruby process starts and never exits. On the other laptop, the same code starts the ruby process but it exits within a few seconds. I can't figure out why two very similar laptops (MacBook Pros) with the same code base behave completely differently. Any help is greatly appreciated.

chiperific commented 2 years ago

I had a stair-stepping memory leak on Heroku running Delayed Job, and my app was hitting the swap memory line daily.

[graph: Heroku worker memory with DelayedJob]

I really didn't want to change config.cache_classes to false so it was recommended I switch to Sidekiq and see if the issue remained.

[graph: Heroku worker memory with Sidekiq]

I think you can pinpoint the moment when my Sidekiq code was deployed.

I just swapped Sidekiq in place of DelayedJob; I didn't make any other changes. I've really enjoyed using DelayedJob over the last 7 years, but I think these charts say it all.

As a disclaimer, I'm a pretty junior dev, so it's definitely possible other solutions might have fixed my issue and allowed me to keep using DelayedJob, but it's hard to argue for DelayedJob at this point.

albus522 commented 3 months ago

Closing as stale.