Combination of option hash and keyword arguments causes crashes

dorner commented 3 years ago

Subject of the issue

When verified doubles are turned on, a particular combination of method arguments causes RSpec to crash - it gets confused with the method signature.

Your environment

Ruby version: 3.0
rspec-mocks version: 3.10.2

Steps to reproduce

Run the following file:

require 'bundler/inline'

gemfile do
  source 'https://rubygems.org'
  gem 'rspec'
end

require "rspec/autorun"

RSpec.configure do |config|
  config.mock_with :rspec do |mocks|
    mocks.verify_partial_doubles = true
  end
end

module MyModule
  def self.my_func(arg1, arg2, options={}, kw: false)
  end
end

RSpec.describe MyModule do
  it 'should pass' do
    allow(MyModule).to receive(:my_func).and_return('hi mom')
    MyModule.my_func(1, 2, { foo: 'bar', baz: 'eggs' })
  end
end

Expected behavior

The spec should pass.

Actual behavior

Fails with the following error:

     Failure/Error: MyModule.my_func(1, 2, { foo: 'bar', baz: 'eggs' })

     ArgumentError:
       Invalid keyword arguments provided: foo, baz

Note that if you replace the call by explicitly calling the keyword argument and not relying on the default, the spec passes: MyModule.my_func(1, 2, { foo: 'bar', baz: 'eggs' }, kw: false)

pirj commented 2 years ago

Thanks for reporting, and apologies that it took long to process.

Apparently, there are two issues. One is in MethodSignatureVerifier#has_kw_args_in?, and it seems to be an easy fix:

      def has_kw_args_in?(args)
        Hash === args.last &&
          could_contain_kw_args?(args) &&
-         (args.last.empty? || args.last.keys.any? { |x| x.is_a?(Symbol) })
+         (args.last.empty? || args.last.keys.any? { |x| x.is_a?(Symbol) }) &&
+         (args.last.keys & @allowed_kw_args).any?
      end

Except that it's not correct.

When the last argument is passed in curly braces, it should not be treated as kwargs.

> def my_func(x, y, options={}, kw: false)
>   puts options, kw
> end
> my_func(1,2, foo: 1, kw: 1) # ArgumentError: unknown keyword: :foo
> my_func(1,2, {foo: 1, kw: 1})
{:foo=>1}
false

We really have a way to distinguish between the two invocation options with Hash.ruby2_keywords_hash?.

So, the fix would be to employ it in has_kw_args_in?:

      def has_kw_args_in?(args)
        Hash === args.last &&
          Hash.ruby2_keywords_hash?(args.last) &&
          could_contain_kw_args?(args) &&
          (args.last.empty? || args.last.keys.any? { |x| x.is_a?(Symbol) })
      end

But this is not sufficient.

my_func(1, 2, 'foo' => 1, :kw => true) # ArgumentError: unknown keyword: "foo"

even though Hash.ruby2_keywords_hash?(args.last) returns true. 🤯

And this statement (from https://github.com/rspec/rspec-support/pull/366):

        # If the last argument is Hash, Ruby will treat only symbol keys as keyword arguments
        # the rest will be grouped in another Hash and passed as positional argument.

doesn't seem to be correct, at least in Ruby 3.0. And this means that the logic of splitting symbol and non-symbol arguments in split_args method is wrong.

Keep in mind, Hash.ruby2_keywords_hash? appeared in Ruby 2.7, and we still support earlier Rubies.

Would you like to dive deeper in this and send a PR?

dorner commented 2 years ago

I can try to give it a shot but I haven't spent a lot of time in the guts of RSpec and I find it pretty confusing. :) I'll have time in a couple of weeks as things slow down at work, but if anyone is more familiar with this part of the code we may be able to get a faster / more useful fix earlier.

pirj commented 2 years ago

Thanks! My impressions is the fix won't have to touch more than those two methods. Consider that class a unit, and you don't have to think how it plays with the rest of RSpec. If you get stuck with something, we'll be happy to provide guidance.

dorner commented 2 years ago

Thanks very much for the direction! I will try to get to this in a week or two!

dorner commented 2 years ago

@pirj I've started looking into it and I'm not sure we can use the ruby2_keywords_hash? method here. That would only work if the method in question has had the ruby2_keywords method applied to it manually. I don't think we can try to apply it after the fact when looking at a method signature of an arbitrary method.

Adding those fixes above makes this test case work, but only because AFAICT Hash.ruby2_keywords_hash? always returns false.

Unfortunately I don't think there is a good generic way of doing this other than actually checking the Ruby version as well as the ruby2_keywords status of the hash, because Ruby 2.7 and 3.0 just behave completely differently by default.

dorner commented 2 years ago

Maybe this is something we'd need to introduce to the RubyFeatures class and check it in the method_signature_verifier?

pirj commented 2 years ago

we can use the ruby2_keywords_hash? method here. That would only work if the method in question has had the ruby2_keywords method applied to it manually

rspec-mocks do have ruby2_keywords. I remember that it's sufficient to have this declaration on an entry pointю

checking the Ruby version as well as the ruby2_keywords status of the hash

If there is no other option, that's fine.

Ruby 2.7 and 3.0 just behave completely differently by default

In which cases?

Hash.ruby2_keywords_hash? always returns false

even though Hash.ruby2_keywords_hash?(args.last) returns true

I can't recall what the exact case was, but I debugged and for 'foo' => 1 it returned true. Could this have been true?

introduce to the RubyFeatures class and check it in the method_signature_verifier?

Let's make it work first, we can shuffle things around as a final step.

dorner commented 2 years ago

Hmm... the case that's failing is here: https://github.com/rspec/rspec-support/blob/main/spec/rspec/support/method_signature_verifier_spec.rb#L667

According to this, when passing :x => 1 (without the surrounding brackets), Hash.ruby2_keywords_hash?(args.last) returns false.

dorner commented 2 years ago

The cases where it behaves differently are where you have an optional hash as the last argument as well as keyword hashes. If you pass a hash into that method:

Ruby 3 default: Treat the hash as the last optional positional argument. Ruby 2 default: Treat the hash as keywords. Ruby 3 with ruby2_keywords: Treat the hash as the positional argument.

if I'm understanding this correctly. Also, ruby2_keywords is only defined on Ruby >= 2.7 so if we're supporting anything under that we'd need to shim it somehow.

dorner commented 2 years ago

Could it be that whatever process in rspec-mocks calls that ruby2_keywords method isn't getting called specifically in the rspec-support tests?

pirj commented 2 years ago

passing :x => 1 (without the surrounding brackets), Hash.ruby2_keywords_hash?(args.last) returns false.

Can this be because ~def valid_non_kw_args?(arity)~ def valid?(*args) is defined without ruby2_keywords? 🤔

PS edit

dorner commented 2 years ago

I'm not sure. It seems like as long as some method is defined with ruby2_keywords it flags the hash with a special flag that should get passed around. And I'm not sure if that special method actually exists in the guts of whatever RSpec is doing to run these specific tests.

pirj commented 2 years ago

The logic for accepting a hash as an argument or kwargs differs not only between Ruby 2 and Ruby 3, but what's unfortunate it's also different in Ruby 2.7, see https://eregon.me/blog/2019/11/10/the-delegation-challenge-of-ruby27.html It looks like we have to account for that in has_kw_args_in? and split_args to make corner cases like you reported work correctly.

dorner commented 2 years ago

Looks like this would have to turn into a matrix test for all 3 versions of Ruby (< 2.7, 2.7, >= 3.0) and we'd have to try all combinations. Fun! 😄

dorner commented 2 years ago

OK, I've been fighting with this for a while and I'm not sure this is actually solvable as written right now. 😢

The main change with Ruby 3.0 is that it differentiates between passing a hash and passing keyword arguments to a method with variable or optional keyword parameters. So

def my_method(**kwargs); end
my_method(k: 1) # fine
my_method({k: 1}) # crashes

The problem is that we're trying to validate the method signature based on the arguments that are passed in, to a method (valid?) that only defines a single parameter, *args. This forces any keyword inputs to be turned into a hash, no matter what version of Ruby you're running. And once it's been turned into a hash, you can no longer tell if it was originally passed in as keywords or not. So essentially we can't actually use this method to validate keyword arguments at all in Ruby 3.0.

In order to make this actually work, valid? would need to change to take both *args and **kwargs - and MethodSignatureVerifier#initialize would also need to change to accept keyword arguments, and whatever code creates that verifier in RSpec would need to change as well.

So this has turned from a simple fix into a fairly significant rewrite ☹️

@pirj thoughts?

JonRowe commented 2 years ago

In order to make this actually work, valid? would need to change to take both *args and **kwargs - and MethodSignatureVerifier#initialize would also need to change to accept keyword arguments, and whatever code creates that verifier in RSpec would need to change as well.

We did do this conditionally in other places and it didn't work as it had slightly different semantics, the Ruby core team told us that the way to do it was ruby2_keywords and co

pirj commented 2 years ago

There's no such a thing as a simple fix when delegation and kwargs are involved, and multiple Ruby version support is a requirement 😄

dorner commented 2 years ago

@JonRowe the problem is that ruby2_keywords can really only be used if we "own" both methods. If a method is defined without ruby2_keywords, we can't turn it into one that has it enabled and try to validate it, because the behavior would be different.

What would your recommendation be here?

dorner commented 2 years ago

@JonRowe bump?

pirj commented 2 years ago

the problem is that ruby2_keywords can really only be used if we "own" both methods

But we're talking about valid? that we own, right?

need to change to take both *args and **kwargs, [otherwise, ] it's been turned into a hash, you can no longer tell if it was originally passed in as keywords or not.

def ours(*args)
  theirs(*args)
end

# not under our control
def theirs(a: 0)
  a
end

puts ours(a: 1)

On Ruby 3.1, it will fail. But if you add ruby2_keywords:

ruby2_keywords def ours(*args)

it would work just fine and will output 1.

dorner commented 2 years ago

Sorry, it's been a while since I looked at this. But I think the problem is that we are trying to inspect the theirs method to try and figure out if it takes keyword arguments or not. We can call it fine without crashing, but we're trying to decide what kind of arguments it's supposed to take, and in that case it depends on the original method being defined (or not) with ruby2_keywords.

pirj commented 2 years ago

Returning to the original problem with self.my_func. It doesn't define delegation or has splat args, does it? Let's try to fix this case without breaking others. I suggest you to send a PR with a broken spec, and let's see how we could fix it, and other cases where currently argument splitting is incorrectly implemented.

dorner commented 2 years ago

I can definitely do that. But that was the original approach I took, and which I couldn't figure out. I think rewriting the valid? method might be our only real chance here.

dorner commented 2 years ago

@pirj Added the test case.

carlad commented 1 year ago

Any update to this?

malcolmohare commented 8 months ago

Proposing this as a fix, https://github.com/rspec/rspec-support/pull/594

rspec / rspec-support