rspec / rspec-mocks

RSpec's 'test double' framework, with support for stubbing and mocking
https://rspec.info
MIT License
1.16k stars 356 forks source link

Using `receive(:new).and_wrap_original` on a base class breaks instantiation of anything that inherits from that base class. #1451

Closed jejacks0n closed 2 years ago

jejacks0n commented 2 years ago

Subject of the issue

It appears that receive(:new).and_wrap_original on a base class breaks instantiation of anything that inherits from that base class.

Your environment

Steps to reproduce

Here's a spec that exhibits the behavior pretty clearly. It specifies how one might expect Ruby to behave when calling Dog.new, and fails because the instance returned is not a Dog instance.

class Animal
  def initialize
    puts 'initializing animal'
  end
end

class Dog < Animal
  def initialize
    puts 'initializing dog'
    super
  end
end

RSpec.describe ":new with and_wrap_original" do
  it "appears to ignore subclasses" do
    allow(Animal).to receive(:new).and_wrap_original do |method|
      puts 'stubbed logic'

      method.call # this will return our Animal instance
    end

    # This fails because an Animal instance is returned, not a Dog instance.
    # Dog#initialize is also never called.
    expect(Dog.new).to be_a(Dog)
  end
end

Expected behavior

One would expect Dog.new to return a Dog instance under all circumstances.

Actual behavior

Calling Dog.new returns an Animal instance and Dog#initialize is never called.

pirj commented 2 years ago

It's not specifically and_wrap_original, and_return behaves in the same way.

pirj commented 2 years ago

My quick guess is that we use base class's singleton class to override the stubbed method.

class A
end

class B < A
end

A.define_singleton_method(:new) { 'A string' }

B.new # => "A string"
jejacks0n commented 2 years ago

Ah, that would make sense. Is this the expected behavior, and if so, is it clarified in documentation or anything?

I think this might surface as a more common pattern now that *_any_instance_of logic has been documented as unmaintained. At least, that's how I ended up coming across it.

I've resolved it in my situation by introducing an anonymous "nothing" subclass that simply inherits from the base class. We then never instantiate the base class directly, but it's all a bit of a gotcha, for sure.

class Animal
  def initialize
    puts 'initializing animal'
  end
end

class AnonymousAnimal < Animal
end

Now because we're never instantiating our base class directly, we can allow(AnonymousAnimal).to receive(:new).and_wrap_original { } without causing issues on other instances that inherit from Animal.

I wonder what others have learned in this space. Thanks for your comments and attention on it @pirj.

pirj commented 2 years ago

Is this the expected behavior

I wouldn't say so. Wondering if there's a straightforward way to work this around. Do you have an idea?

now that *_any_instance_of logic has been documented as unmaintained

It's not unmaintained to my best recollection, but is only recommended for use with legacy code.

introducing an anonymous "nothing" subclass

Very inventive workaround.

Out of curiosity, can you share a bit more context on the purpose of wrapping the initializer of the base class? I don't mean in any way that your testing approach is incorrect, just wondering how widespread this could be. A pretty common example that also suffer from the same problem:

regular_user = instance_double('User')
allow(User).to receive(:new).and_return(regular_user)
# ...
admin = AdminUser.new # :boom:, regular user is returned instead
jejacks0n commented 2 years ago

I'd say I'm on an edge use case, so likely isn't an example worth sharing?

The places I saw it coming up are exactly how you described it. With models I'd see it being an issue with STI, as well, I've used presenters in generating a response from a controller where one presenter inherited from a useable base class -- again, often an Author < User kind of scenario.

I doubt it's a BIG use case, but I was sure surprised by the behavior after I understood it. I'd say that there might be some dragons hidden under mocking :new, so documenting the potential risks there might be useful?

I'll think on it, but the best idea I can come up with off hand, is a warning when appropriate.

You're attempting to mock :new on a class with descendants, and this will cause any new instance of those descendants to be replaced by an instance of the class you're mocking.

Let me know how you'd like for me to help.

pirj commented 2 years ago

I'd suggest prepend Module.new as an alternative:

class A
  def self.a
    'a'
  end
end
class B < A
  def self.a
    'b'
  end
end
A.singleton_class.prepend Module.new { define_method(:a) { 'A string' } }
A.singleton_class.prepend Module.new { define_method(:new) { 'A new' } }

puts A.a # A string
puts B.a # b
puts A.new # A new
puts B.new # A new

However:

  1. It doesn't work with new
  2. To my best knowledge, it's impossible to unstab just as easy as delete_method does now
  3. prepend didn't exist in earier Rubies that we still support in RSpec 3, so this would have to wait anyway

Speaking of the warning. Sounds good in general. Wondering how it would be possible to mute the warning for I don't care about subclasses/I don't have control over this code cases? It will take a walk down the ObjectSpace to find out if the class has descendents. Wondering if this comes with a performance impact. Probably only for codebases that stub new often.

JonRowe commented 2 years ago

I'm pretty sure this is a deliberate behaviour, consider this scenario instead:

class Animal
  def self.expensive_sound_generator
    # expensive wav generator routine
  end
end

class Dog < Animal
  def self.expensive_sound_generator
    # ...
    super
  end
end

RSpec.describe do
  it "works with subclasses" do
    allow(Animal).to receive(:expensive_sound_generator).and_wrap_original do |method|
      "soundfile.wav"
    end

    expect(Dog.expensive_sound_generator).to eq "soundfile.wav"
  end
end

That would be the correct behaviour, and is much more obvious as to what is going on, but is the same as your spec just with a different method name.

As you're stubbing new here which is called by super on Dog, you're skipping the normal allocation process that would return a Dog instance, you could possibly use method to find the class called and have a case statement, but I'm a bit fuzzy on what the owner class is at this point.

I'm going to close this as "working as expected" but I'm open to a PR improving the documentation around this, I'm not so sure a warning is a good idea because it would affect legitimate usage of this behaviour.

pirj commented 2 years ago

Just a note that it works differently with non-new class methods:

class Animal
  def self.expensive_sound_generator
    # expensive wav generator routine
  end
end

class Dog < Animal
  def self.expensive_sound_generator
    # ...
    super
    "DOG.wav" # Override!
  end
end

RSpec.describe do
  it "works with subclasses" do
    allow(Animal).to receive(:expensive_sound_generator).and_return("soundfile.wav")

    expect(Dog.expensive_sound_generator).to eq "soundfile.wav"
  end
end
     Failure/Error: expect(Dog.expensive_sound_generator).to eq "soundfile.wav"

       expected: "soundfile.wav"
            got: "DOG.wav"
JonRowe commented 2 years ago

@pirj super returns "soundfile.wav" its the super call thats important.

The default implementation for Dog.new is (paraphrased):

def self.new
  super
end

Which then calls Animal.new and that returns the original implementation of calling Animal.new in the and_wrap_original call.

The difference is that new is normally a magical method in that it essentially should call allocate.initialise and return an instance of the subclass, but the mock overrides that.

jejacks0n commented 2 years ago

Yeah, that's just all I could come up with off hand. I don't love my suggestion of the warning message, because it surfaces its own complexity that would be worth avoiding.

The best solution would be one that makes it a non-concern.

EDIT: I'm cleaning up this comment to simplify the discussion and code examples into the ones that are really important.

jejacks0n commented 2 years ago

I kinda understand what you're saying @JonRowe, but the way it's implemented, there's not a way to handle what anyone would likely want on :new. That's my point.

At the time that we can provide the block to wrap, we don't have enough information to do what any sensible person would want to do, and I'm trying to see if that's an oversight or if it should be added to the implementation of mocking for :new.

jejacks0n commented 2 years ago

TL;DR: The method variable that's provided to the block in and_wrap_original is bound to the mocked class but should instead be bound to self, which won't be the same as the mocked class when using anything that has subclassed our mocked class.

Alright, I'll try to distill this down to what seems incorrect. Let's start with a fairly basic inheritance structure like the following:

class Base < Object # intentionally redundant inheritance of `Object`
  def self.foo
    "#{self.name}.foo" # intentionally redundant use of `self` to highlight the issue
  end
end

class A < Base
end

class B < A
end

Really simple example of how A and B don't implement a given class method and so the implementation in Base is used.

A.foo # => "A.foo"
B.foo # => "B.foo"

Base.method(:foo) # => #<Method: Base.foo() (irb):2>
A.method(:foo) # => #<Method: A.foo() (irb):2>
B.method(:foo) # => #<Method: B.foo() (irb):2>

Looking at where those methods are defined, they're all the same -- irb line 2, which is where they were defined when I pasted it into the console. More importantly, we see that a simplified but reasonable model, is to imagine that calling A.foo and B.foo is handled by traversing the inheritance hierarchy until Base.foo is found, where self will reference the instance that we're traversing from. This is important for the rest of the discussion.

What I'm trying to surface here is that self is critically important for both our .new and .foo methods to be properly handled on the A and B classes. Right?

Regardless of what class we call .foo from, self should reference that class. If we call B.foo, we obviously expect self inside of any calls up the inheritance hierarchy to be B, and here is where I believe the issue is. So let's highlight that in an example:

RSpec.describe "mocking with inheritance concerns" do
  it "shows that mocking A.foo breaks the use of `self` inside the Base.foo implementation" do
    allow(A).to receive(:foo).and_wrap_original { |method| "#{method.call}(mocked)" }

    expect(Base.foo).to eq 'Base.foo'
    expect(A.foo).to eq 'A.foo(mocked)'
    expect(B.foo).to eq 'B.foo(mocked)' # this fails, but likely shouldn't
  end
end

Focus in on that last line, and then consider that there seems to be no way to work around this. The method variable, which is all we have to work with, is bound to A and so any reference to self all the way up the inheritance hierarchy for this method will always be A regardless of what self refers to inside of any other class method inside of B. This starts to get really strange, right? When you think on it that way, you might see the number of issues this could cause.

I've had a couple hours to dig into the rspec-mocks library and have worked out what the issue is, and it can be described by this snippet, where we don't see the problem. The difference is that we actively bind the original method to the correct object each time it's called, which is self, and not simply A.

RSpec.describe "mocking with inheritance concerns" do
  it "could be done better" do
    original_method = A.method(:foo).unbind # get the method and unbind it
    A.define_singleton_method(:foo) do # redefine the method with our own
      method = original_method.bind(self) # !!! don't bind to A, instead, bind to self
      "#{method.call}(mocked)" # `method` is now bound to the right object
    end

    expect(Base.foo).to eq 'Base.foo'
    expect(A.foo).to eq 'A.foo(mocked)'
    expect(B.foo).to eq 'B.foo(mocked)'
  end
end

Another way to put this, is that when we define_singleton_method on A, we're also redefining the method that will be called when we call B.foo, and so self is still really important in regards to what we bind to. We can't simply bind the original method to A, we have to bind it to self, which then respects the classes that inherit from A.

And when we get into it, Object.new is exactly like our Base.foo example, but in C. In Ruby we can imagine that as being implemented as:

def self.new(*args, &block)
  self.allocate.tap { |i| i.send(:initialize, *args, &block) } # again, a redundant use of `self`
end

We can then see that if self is broken anywhere along our inheritance hierarchy, as it is when we take a method and bind it to the wrong object, we'll get strange things, like instances of A when we call B.new.

jejacks0n commented 2 years ago

Ultimately, what happens on https://github.com/rspec/rspec-support/blob/main/lib/rspec/support.rb#L52 (and the other definition) is part of what I think we're seeing. It binds the method to the object in that moment, and stores that. B will then inadvertently call that method that's bound to A, and where self will incorrectly return A instead of B.

I can see through comments in the code that this is a thing that's come up, I just don't think it's "right" yet.