brandonhilkert / sucker_punch

Sucker Punch is a Ruby asynchronous processing library using concurrent-ruby, heavily influenced by Sidekiq and girl_friday.
MIT License
2.65k stars 114 forks source link

Memory leak and/or unfinished threads #84

Closed jlecour closed 9 years ago

jlecour commented 9 years ago

Hi Brandon,

I'm working with @colinux and @benitoDeLaCasita on a Rails 4.1 application. It is using sucker_punch and we love it. Thank you for making and maintaining it.

Recently we've noticed a huge increase in thread numbers on our production server. Our Rake and Unicorn processes have more and more threads (probably inactive).

I absolutely don't know how I can make this issue precise enough for you to tell if it's our fault or if there is something wrong with sucker_punch.

Do you think of any way sucker_punch could be used that would lead to unfinished threads?

Sometimes the Ruby code inside the jobs is failing (various reasons). Maybe the way we handle the exception is wrong. We use Airbrake :

SuckerPunch.exception_handler { |ex|
  Airbrake.notify({
    error_message: "#{ex.class.name}: #{ex.message}",
    error_class: ex.class.name,
    parameters: {celluloid_stack: Celluloid.stack_dump},
    backtrace: ex.backtrace
  })
}

Thanks for any help.

brandonhilkert commented 9 years ago

Is the Airbrake exception being caught and sent ok?

jlecour commented 9 years ago

Yes it is.

We'll try to isolate a job, with some expected (as in provoked) exceptions and count the number of created/finished threads.

brandonhilkert commented 9 years ago

Out of curiosity, how are you measuring the number of threads? Also wondering what the Airbrake.notify method does to alert. Perhaps it's creating threads too?

jlecour commented 9 years ago

Hi

We might have found the cause. It is not confirmed yet (will be tomorrow), but it ha nothing to do with Airbrake.

We are measuring the number of threads with Munin graphs and htop.

The cause might be the dynamic creation of job classes for each request. We'll change that and verify.

Thanks for your will to help

brandonhilkert commented 9 years ago

Were you able to conclude anything?

jlecour commented 9 years ago

Yes. There is nothing wrong with SuckerPunch here. When we stopped creating job classes on the fly, the three pool stabilized.

Thanks for your help.