Need advice on Solid Queue's memory usage

yjchieng commented 2 months ago

Ruby: 3.3.4 Rails: 7.2.1 Solid Queue: 0.7.0, 0.8.2

I run a Rails App on AWS EC2 instance with 1G of memory. I notice the solid queue process takes up 15-20% of the instance's memory, which becomes the single largest process by memory usage.

What I checked:

1) Check memory usage by start/stop supervisorctl (I use it to manage my solid queue process)

stop supervisorctl - free memory 276MB start supervisorctl - free memory 117MB

It increases 159MB

2) Stop supervisorctl service, and run "solid_queue:start" directly Trying to see if this is something related to supervisor.

before solid_queue:start - free memory 252MB after solid_queue:start - free memory 109MB

It increases 143MB

3) Then I notice there is a latest version. I upgraded to 0.8.2 (was 0.7.0).

stop supervisorctl - free memory 220MB start supervisorctl - free memory 38MB

It increases 182MB

I need some advise:

1) Is 150-200MB the minimum requirement to run "solid_queue:start"? 2) Is there any setting/feature that I can switch off to reduce memory usage? 3) Is there any setting that I can limit the maximum memory usage?

And, thanks a lot for making this wonderful gem. :)

rosa commented 2 months ago

Hey @yjchieng, thanks for opening this issue! 🙏 I think it depends a lot on your app. A brand new Rails app seems to use around 74.6MB memory for me after booting (without Solid Queue, just running Puma). I think the consumption you're seeing is from all the processes together and not just the supervisor, measuring free memory before starting the supervisor and after, as the supervisor will fork more processes. Are you running multiple workers or just one? I think reducing the number of workers there would help. Another thing that might help is using bin/jobs, which preloads the whole app before forking, but the gains there are usually quite modest.

rosa commented 2 months ago

There might also be something else going on because the only changes from version 0.7.0 to 0.8.2 were for the installing part of Solid Queue; nothing was changed besides the initial installation, so the memory footprint shouldn't have changed. I imagine there is other stuff going on in your AWS instance at the same time that might consuming memory as well.

Focus-me34 commented 3 weeks ago

Up 🆙🔥

I have huge memory issues in production (Rails 7.2 + activeJob + solidQueue). Everything works just fine in dev mode, but in production, there seems to be a memory leak. After restarting my production server, I get to roughly ~75% RAM usage. Very quickly (talking in minutes...) I get to ~100%. And if I let the app run for the weekend and come back on Monday (like today), I get to... 288% RAM usage... I tried removing all the lines in my code related to solidQueue, and I can confirm that this is what's causing the memory issue in production.

The exact error codes I'm getting, causing my app to crash in production (Heroku), are R14 and R15.

Any advice/suggestions would be very much appreciated fellow devs. Have an amazing day!

rosa commented 3 weeks ago

@Focus-me34, what version of Solid Queue are you running? And when you say you're removing anything related to Solid Queue, what Active Job adapter are you using instead?

Focus-me34 commented 3 weeks ago

@rosa I'm using Solid Queue version 1.0.0. I checked all sub-dependencies versions. They all respect the pre-requirements. We didn't really try with any other adapter since Solid Queue will be the default adapter in RoR 8. Hence, we really want to make it work this way.

Here's some of our setup code:

# scrape_rss_feed_job.rb
class ScrapingJob < ApplicationJob
  queue_as :default
  limits_concurrency to: 1, key: -> { "rss_feed_job" }, duration: 1.minute

  def perform
    Api::V1::EntriesController.fetch_latest_entries
  end
end

# recurring.yml
default: &default
  periodic_cleanup:
    class: ScrapeSecRssFeedJob
    schedule: every 2 minutes

development:
  <<: *default

test:
  <<: *default

production:
  <<: *default

# queue.yml
default: &default
  dispatchers:
    - polling_interval: 1
      batch_size: 500
      concurrency_maintenance_intervaL: 15
  workers:
    - queues: "*"
      threads: 3
      processes: <%= ENV.fetch("JOB_CONCURRENCY", 1) %>
      polling_interval: 0.1

development:
  <<: *default

test:
  <<: *default

production:
  <<: *default

Do you see anything weird?

rosa commented 3 weeks ago

No, that looks good to me re: configuration. You said:

I tried removing all the lines in my code related to solidQueue, and I can confirm that this is what's causing the memory issue in production.

So, if you don't get any memory issues when not using Solid Queue, is that because you're not running any jobs at all, if you're not using another adapter? Because that would point to the jobs having a memory leak, not Solid Queue.

Focus-me34 commented 2 weeks ago

Hey Rosa, sorry for the delayed reply! I've been very busy at work.

Here’s where we're at: I've been working on getting our company’s code running smoothly with Rails 7.2 and Solid Queue (in production Heroku). As I mentioned earlier, it’s been a huge challenge, and unfortunately, we haven’t had much success with it.

My colleague and I decided to take a closer look at our code to see if the problem was on our end. Since my last comment here, we’ve implemented tests, and I can confirm that the code is behaving exactly as expected.

Our next step, after troubleshooting the high memory usage on Heroku, was to try switching away from Solid Queue and try a different job adapter (as you suggested). I've set up Sidekiq as the adapter, and we saw a drastic improvement: memory usage dropped from around 170% of our 512 MB quota to a range of 25%-70%.

This leads me to believe that there might be a memory leak in production when using Solid Queue. From our observations, it seems that after the initial job execution completes, instance variables at the top of the function (which should reset to nil at the start of each job) are retaining the values from the previous iteration. We suspect this might be preventing the Garbage Collector from clearing memory properly between jobs.

Let me know if there's any more information I can provide to help you investigate. We’re really looking forward to moving back to using the built-in Solid Queue functionality once this issue is resolved.

[Edit: The job we're running involves two main dependencies. We scrape an RSS feed using Nokogiri and fetch a URL for each entry using httparty]

rails / solid_queue

Need advice on Solid Queue's memory usage #330