thoughtbot / factory_bot

A library for setting up Ruby objects as test data.
https://thoughtbot.com
MIT License
7.9k stars 2.6k forks source link

Memory Leak in FactoryGirl::Declaration::Implicit ? #542

Closed tehprofessor closed 10 years ago

tehprofessor commented 11 years ago

Howdy,

Recently, we (my team) were attempting to deploy a migration where we had to modify existing records of our database by adding an association if it was empty (e.g. User's have profiles, but existing user's did not so we wanted to add them).

User.find_each(&:add_profile)

Mysteriously, the migration would always fail after consuming all available RAM on the machine. We could see it growing in nearly perfectly linear fashion in New Relic.

factory-girl-in-gemfile

Completely confused by why this was going on, I decided to take data points from the ObjectSpace over time... So every 500 users, I'd binding.pry and take the data points. After about 30,000 users, I decided it was time to graph 'em and see if I could find something... Low and behold:

objectspace

I was rather shocked since none of the code even touched FactoryGirl, I grepped our entire app/ directory and found nothing. We use FactoryGirl to seed from time to time in production, and for testing... With this migration we aren't seeding; so, I decided I should try commenting it out from our Gemfile and re-run the migration.

sans-factory-girl-in-gemfile

Fixed... well from my perspective

Things to note:

I did attempt to force GC at various intervals of user's processed (100, 500, and even 1) with something like:

i = 0
User.find_each do |u|
  u.add_profile
  GC.start if (i % 100) == 0
end

I am using:

Ruby 2.0.0-p0 
factory_girl 4.2.0
factory_girl_rails 4.2.1
rails 3.2.13
Deiban 7.0 (Wheezy)

I did look at FactoryGirl::Declaration::Implicit but nothing stood out to me as a reason why this would be going on. So I figured I'd at least let you all know; maybe you'll have some clues? Also, if you all have no clues, don't worry-- it may be something specific to our app that's causing it.

Best, Seve

joshuaclayton commented 11 years ago

@tehprofessor Thank you for this awesome writeup. Can you clone production data locally, run the task, and see the memory leak there? If so, I have a few questions for you:

  1. How many factory files do you have? Is it just one, or do you have the files broken up?
  2. If you only have one factory file, can you do a couple things for me? First, could you add a ::Kernel.p 'processing factories!' at the top inside the FactoryGirl.define block and run the code? I'd like to see if the factory files are getting loaded multiple times.
  3. If you comment out the FactoryGirl.define block(s) in the factory files, do you still see the leak?

I think that'd be a good start to help track this down. Thanks!

joshuaclayton commented 10 years ago

@tehprofessor any updates?

tehprofessor commented 10 years ago

Howdy!

My apologies for the "AFK." I'm currently swamped with work (it's midnight on a Saturday, and here I am replying!).

Regardless... I think this is a case of correlation vs causation, that is to say, the growth of factory girl appears to be a symptom of the root problem (doh!). I attempted a bit further investigation of the issue, but was unable to reproduce/isolate it outside of our stack (though, my time and resources were limited).

Other deploys of ours, even after removing every trace of factory girl, have been problematic where there are migrations done via activerecord (i.e. a rails model/ruby code). When we perform them with SQL, it's fine, and no problems arise during or after.

If I get more time in the future, I'll see if I can investigate this issue further, but for now-- I'm going to close this... At this point, I think it's safe to say, FactoryGirl is at most a symptom and not the root.

Thank you for your time! I'm sorry for having wasted it! My econometrics professors taught me better when it comes to interpreting data :astonished:

Thank you for all your work with the gem and for taking the time to respond.

Best, Seve

joshuaclayton commented 10 years ago

@tehprofessor thanks for getting back to me; it's a shame since this sounds like a pretty interesting issue! If you run into it again or you have any more insight, please feel free to open the issue back up and shoot me a message so we can revisit it!

jleven commented 10 years ago

I ran into a similar problem, and wanted to share what I learned in case it turns out to be useful for others.

I wrote a bit of code to track the set of allocated objects before and after a block, and when running it on a very simple block, was shocked to see a hundred or so FactoryGirl::Declaration::Implicit objects being created:

ObjectSpace.track_allocations_while { "hi" }
==> {...,
     "FactoryGirl::Declaration::Implicit"=>112,
     ...}

It turns out, the offending code was the code I was using to count allocated objects:

counts = Hash.new { 0 }
self.each_object do |o|
  counts[o.class.name] += 1
end

o.class.name was returning "FactoryGirl::Declaration::Implicit" but was actually a FactoryGirl::DefinitionProxy object, and #class was invoking the #method_missing logic, which was invoking:

@definition.declare_attribute(Declaration::Implicit.new(name, @definition, @ignore))

Going into the definition for FactoryGirl::DefinitionProxy and adding class to the list of methods to not be undef'ed, prevented this problem entirely.

module FactoryGirl
  class DefinitionProxy
    UNPROXIED_METHODS = %w(__send__ __id__ nil? send object_id extend instance_eval initialize block_given? raise caller method class)

My specs all seem to pass with this change, so it appears class was left out of this list for completeness sake and not because the FactoryGirl library expects it to be delegated to the declaration. I'd recommend that class be added to this list in the next release of FactoryGirl, if only because several StackOverflow answers suggested using code very similar to what I am (o.class) and others will likely continue to be confused by this.

Thanks!

daniel-rikowski commented 9 years ago

@jleven Thank you for sharing this. I had a similar problem with thousands of leaked FactoryGirl::Declaration::Implicit instances in a delayed_job worker. Adding class to the list effectively fixed that.

I also didn't notice any problem using FactoryGirl, so I agree this should really be added to the offical code, even if it was not the root cause of @tehprofessor's problem. (or potentially even mine!)

At the very least it would put FactoryGirl off the radar when people investigate their memory leaks :innocent: