rspec / rspec-mocks

RSpec's 'test double' framework, with support for stubbing and mocking
https://rspec.info
MIT License
1.16k stars 358 forks source link

Mocked singleton methods are bound to the mocked class and thus don't respect inheritance #1452

Open jejacks0n opened 2 years ago

jejacks0n commented 2 years ago

Subject of the issue

The method variable that's provided to the block in and_wrap_original is bound to the mocked class but should instead be bound to self, which won't be the same as the mocked class when using anything that has inherited from our mocked class.

Your environment

Steps to reproduce

Alright, I'll try to distill this down to the simplest example. Let's start with a fairly basic inheritance structure like the following:

class Base
  def self.foo
    "#{self.name}.foo" # intentional use of `self` to highlight the issue
  end
end

class A < Base
end

class B < A
end

So we have a really simple example of how A and B don't implement a given class method and so the implementation in Base is used. We can look at that in an irb session:

A.foo # => "A.foo"
B.foo # => "B.foo"

Base.method(:foo) # => #<Method: Base.foo() (irb):2>
A.method(:foo) # => #<Method: A.foo() (irb):2>
B.method(:foo) # => #<Method: B.foo() (irb):2>

All of the .foo methods reference the same source location -- irb line 2, which is where it was defined when I pasted it into the console. More importantly, we can see that a simplified but reasonable model, is to imagine that calling A.foo and B.foo is handled by traversing the inheritance hierarchy until Base.foo is found, where self will reference the class that we're traversing from. This is important for the rest of the discussion.

What I'm trying to surface here is that self is critically important for both our .new and .foo methods to be properly handled on the A and B classes. Right?

Regardless of what class we call .foo from, self should reference that class. If we call B.foo, we obviously expect self inside of any calls up the inheritance hierarchy to be B, and here is where I believe the issue is. So let's highlight that in an example:

begin
  require "bundler/inline"
rescue LoadError => e
  $stderr.puts "Bundler version 1.10 or later is required. Please update your Bundler"
  raise e
end

gemfile(true) do
  source "https://rubygems.org"

  gem "rspec", "3.10.0" # Activate the gem and version you are reporting the issue against.
end

puts "Ruby version is: #{RUBY_VERSION}"
require "rspec/autorun"

class Base
  def self.foo
    "#{self.name}.foo" # intentional use of `self` to highlight the issue
  end
end

class A < Base
end

class B < A
end

RSpec.describe "mocking with inheritance concerns" do
  it "shows that mocking A.foo breaks the use of `self` inside the Base.foo implementation" do
    allow(A).to receive(:foo).and_wrap_original { |method| "#{method.call}(mocked)" }

    expect(Base.foo).to eq "Base.foo"
    expect(A.foo).to eq "A.foo(mocked)"
    expect(B.foo).to eq "B.foo(mocked)" # this fails but likely shouldn't
  end
end

Let's focus in on that last line, and then consider that there seems to be no way to work around this. The method variable, which is all we have to work with, is bound to A and so any reference to self all the way up the inheritance hierarchy for this method will always be A regardless of what self refers to inside of any other class method inside of B. This starts to get really strange, right? When you think on it that way, you might see the number of complex issues this can cause.

I've had a couple hours to dig into the rspec-mocks library and have worked out what the issue is, and it can be described by this snippet, where we don't see the problem. The difference is that we actively bind the original method to the correct object each time our replacement method is called. When we do this, the unbound method can be bound to each object correctly, and not simply on A.

RSpec.describe "mocking with inheritance concerns" do
  it "should bind our original unbound method to the correct object each time" do
    original_method = A.method(:foo).unbind # get the method and unbind it
    A.define_singleton_method(:foo) do # redefine the method with our own
      method = original_method.bind(self) # !!! don't bind to A, instead, bind to self
      "#{method.call}(mocked)" # `method` is now bound to the right object
    end

    expect(Base.foo).to eq "Base.foo"
    expect(A.foo).to eq "A.foo(mocked)"
    expect(B.foo).to eq "B.foo(mocked)"
  end
end

Expected behavior

After mocking A.foo, we expect that when calling B.foo, we would still have a correct reference to self inside our Base.foo method.

Actual behavior

After mocking A.foo, we see that when calling B.foo, self inside our Base.foo method refers to A.

Another way to put this, is that when we define_singleton_method on A, we're also redefining the method that will be called when we call B.foo, and so self is still really important in regards to what we bind to. We can't simply bind the original method to A, we have to bind it to self, which then respects the classes that inherit from A.


As a final thought, it's the same core issue that caused me to open https://github.com/rspec/rspec-mocks/issues/1451, which I think was prematurely closed, because when we get into it, Object.new is exactly like our Base.foo example, but in C. In Ruby we can imagine that as being implemented as:

def self.new(*args, &block)
  self.allocate.tap { |i| i.send(:initialize, *args, &block) } # again, a redundant use of `self`
end

We can then see that if self is broken anywhere along our inheritance hierarchy, as it is when we take a method and bind it to the wrong object, we'll get strange things, like instances of A when we call B.new.

jejacks0n commented 2 years ago

The issue seems to stem from the fact that MethodDouble#proxy_method_invoked leaves _obj ignored.

This would likely need to be passed along to our MessageExpectation so it can be used in conjunction with and_yield_receiver_to_implementation at worst, or at best it should be handled in rspec-mocks where we bind the original method to that "receiver" so it becomes a non concern for most users.

This raises some behavioral considerations though. If we mock A.foo and then call B.foo, should A.foo be considered called if the original .foo wasn't defined on A?

We have the option here -- to assert that we're really messaging A.foo if we A.foo() and can disregard any call to B.foo, or we say that B.foo should count as a call to A.foo, and it depends on how you mentally model how the .foo method works in our inheritance hierarchy.

If we think of it as duplicating the Base.foo method on both A and B, then we shouldn't consider a call to B.foo as a call to A.foo, but if we think of calling B.foo as traversing to A.foo and then Base.foo then it could count as a call.

jejacks0n commented 2 years ago

We can see a simple implementation of a working example by monkeypatching a couple things.

require 'rspec/mocks'

module RSpec
  module Mocks
    @__method_double_receivers = {}

    class MethodDouble
      def proxy_method_invoked(receiver, *args, &block)
        (RSpec::Mocks.instance_variable_get(:@__method_double_receivers)[original_method.to_s] ||= []).push(receiver)
        @proxy.message_received method_name, *args, &block
      end
    end

    class AndWrapOriginalImplementation
      def call(*args, &block)
        receiver = (RSpec::Mocks.instance_variable_get(:@__method_double_receivers)[@method.to_s] ||= []).pop
        @block.call(@method.unbind.bind(receiver), *args, &block)
      end
    end
  end
end

Our previously failing spec now passes. Obviously this isn't the solution, and just tests the hypothesis. The implementation would need to take into account which behavioral model that makes the most sense to implement.

pirj commented 2 years ago

Thanks for reporting, reproduction example, in-depth analysis and a patch. Would you like to send a PR to see if works with the current RSpec test suite?

jejacks0n commented 2 years ago

Thanks @pirj, it's not a patch by any means so no? I'm providing that as an example of how I personally tested the hypothesis, and so anyone else can as well -- since it makes mocking things like .new work as one would expect.

I would also imagine someone with a deeper understanding of the library should probably think on, if, and how it should be implemented, and based on my previous issue being closed before being understood, I'd need some level of consensus before personally spending time on it.

Diving into the source though, it's obvious that these are largely understood concerns. There's a missing spec, which is really the first one I provide, and where we should start. Let's assume we add that spec and then consider, how do we want to make that pass?

RSpec.describe "mocking with inheritance concerns" do
  it "allows mocking a class method without breaking the use of `self`" do
    allow(A).to receive(:foo).and_wrap_original { |method| "#{method.call}(mocked)" }

    expect(Base.foo).to eq "Base.foo"
    expect(A.foo).to eq "A.foo(mocked)"
    expect(B.foo).to eq "B.foo(mocked)" # this fails
  end
end

What should the last expectation in the spec be?

# Should it be considered a call _through_ A.foo?
expect(B.foo).to eq "B.foo(mocked)"
# OR should calling B.foo behave like a call to Base.foo directly?
expect(B.foo).to eq "B.foo"
# OR should mocking A simply break the reference to `self` in subclasses?
expect(B.foo).to eq "A.foo(mocked)"
pirj commented 2 years ago

Your spec reflects how I would expect it to work intuitively. @JonRowe may have a different vision that I'll gladly accept.

JonRowe commented 2 years ago

Thank you for this issue, its much clearer here what the problem is (the behaviour you think is incorrect in #1451 is a symptom of this, but was correctly closed as the issue was how you configured the response) so I'm happy to review any PRs you wish to open and help where I can.

jejacks0n commented 2 years ago

I appreciate your time @JonRowe, @pirj.

Sorry if my phrasing isn't always the best, but I want to be clear that the original issue isn't a symptom of how the response was configured. It's an issue of the concept I outline with self and the fact that a mocked method will be bound only to the object we're mocking it on.

I try to clarify this in the final thought -- where we can glean our understanding by a Ruby equivalent of the C implementation of .new:

class Base
  def self.new(*args, &block)
    instance = self.allocate # `self` will be incorrect here
    instance.send(:initialize, *args, &block)
    instance
  end
end

class A < Base
end

class B < Base
end

In this Ruby implementation of .new, self will exhibit the same issue that we highlight around the .foo method. This is the core of both -- it's not an issue of the return value of the block provided to and_wrap_original because we simply can't return the thing we'd need to, because we don't know what it is.

Which is the second thing I try to convey -- a user can't resolve it given the tools that they have available. If the "true" receiver of our call to .foo (or .new) came through in the block when we chain and_yield_receiver_to_implementation it could be resolved by simply binding it manually. Take this example for instance:

allow(A).to receive(:foo).and_wrap_original do |method, receiver|
  "#{method.unbind.bind(receiver).call}(mocked)"
end.and_yield_receiver_to_implementation

That would work if the receiver was passed through as B, but it doesn't currently work because receiver will always be A. The simple solution is to adjust the receiver that's given to the block in that scenario so it can be resolved manually. Given that already exists though, I imagine doing so might break somebody's test suite somewhere, which is where I'll leverage your guidance.

The likely easy to merge option is around the and_yield_receiver_to_implementation change, and I believe that's the intent of and_yield_receiver_to_implementation because what's being mocked is known while mocking -- I wouldn't really ever need it in my block.

How would you both feel about that change as an initial implementation to consider/document?