rspec / rspec-support

Common code needed by the other RSpec gems. Not intended for direct use.
https://rspec.info
MIT License
98 stars 103 forks source link

Exclude internal methods in the caller filter #426

Closed bjfish closed 3 years ago

bjfish commented 4 years ago

This filters internal ruby implementation methods by filtering lines that include <internal.

benoittgt commented 4 years ago

Hello @bjfish

TruffleRuby can add more lines to the backtrace?

eregon commented 4 years ago

To clarify, <internal:... is the format used by CRuby too for methods implemented in Ruby, for instance on CRuby master:

ruby 3.0.0dev (2020-09-15T19:32:35Z master 9b535f3ff7) [x86_64-linux]
$ ruby -e 'tap { raise }'
-e:1:in `block in <main>': unhandled exception
    from <internal:kernel>:90:in `tap'
    from -e:1:in `<main>'

$ chruby truffleruby-dev
$ truffleruby -e 'tap { raise }'
-e:1:in `block in <main>':  (RuntimeError)
    from <internal:core> core/kernel.rb:534:in `tap'
    from -e:1:in `<main>'

$ chruby 2.6.6
$ ruby -e 'tap { raise }' |& cat
-e:1:in `block in <main>': unhandled exception
    from -e:1:in `tap' # repeated entry for C/Java core library methods
    from -e:1:in `<main>'
benoittgt commented 4 years ago

Thanks @eregon :)

eregon commented 4 years ago

TruffleRuby can add more lines to the backtrace?

Just to clarify: no it does not add extra lines. But when a core library method is defined in Ruby it will show up in the backtrace (instead of the next user backtrace entry being repeated for core methods in C/Java).

JonRowe commented 4 years ago

To clarify, I mean something that generates this stack trace to be filtered, not just passing arbitrary strings in.

eregon commented 4 years ago

FWIW there are 20+ integration tests failing on TruffleRuby without this (due to reporting not the first user code line but internals). Note <internal: is only used frequently in CRuby 3.0, not in 2.7 or before.

eregon commented 3 years ago

@JonRowe @benoittgt Could you re-review? It looks ready to merge to me.

benoittgt commented 3 years ago

Hello

Sorry for the delay. I tried this patch with truffleruby-head and it validates added specs. The last added specs is a little bit to broad on the Ruby version selection but it is a good "real world integration example" 🧙

For the CI on windows. This is unrelated, and I will try to fix it separately.

JonRowe commented 3 years ago

From the examples, this seems like information that might actually be relevant, and it doesn't even apply to released Rubys? So for now I'm closing, you should be able to remove these lines with the relevant external back trace filter.

eregon commented 3 years ago

@JonRowe So you are rejecting a PR that is doing the correct and standardized backtrace filtering (as in, filter non-user code entries) for Ruby 3 and TruffleRuby? (and Ruby 2.x too, see below) I'm shocked. Please reconsider.

Also, this happens on CRuby 2.7 which already defines some methods of the core library in Ruby files, see the *.rb files at the root of https://github.com/ruby/ruby/tree/ruby_2_7. However it seems hard to find a method which yields a block or so and would let us make a "real world integration example" and use RSpec::CallerFilter.first_non_rspec_line.

Actually there is one method taking a block, and it exists since Ruby 2.3 or even before:

$ ruby -ve 'Thread.exclusive { puts caller(0) }'
ruby 2.3.8p459 (2018-10-18 revision 65136) [x86_64-linux]
Thread.exclusive is deprecated, use Mutex
-e:1:in `<main>'
-e:1:in `block in <main>'
<internal:prelude>:15:in `block in exclusive'
<internal:prelude>:14:in `synchronize'
<internal:prelude>:14:in `exclusive'
-e:1:in `<main>'

$ ruby -ve 'Thread.exclusive { puts caller(0) }'
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
-e:1: warning: Thread.exclusive is deprecated, use Thread::Mutex
-e:1:in `block in <main>'
<internal:prelude>:12:in `synchronize'
<internal:prelude>:12:in `block in singleton class'
-e:1:in `<main>'

$ ruby -ve 'Thread.exclusive { puts caller(0) }'
ruby 3.0.0dev (2020-09-23T15:42:39Z use-blocking-callb.. d1d5f6d16f) [x86_64-linux]
-e:1:in `<main>': undefined method `exclusive' for Thread:Class (NoMethodError)

But as you can see that method is deprecated and then removed in Ruby 3 so it doesn't seem a good way to write a reliable test.

From the examples, this seems like information that might actually be relevant

It seems clearly not relevant for RSpec::CallerFilter.first_non_rspec_line which intends (AFAIK) to show the first "user code" location, not a location inside the core library, e.g. to show warnings.

Not merging this prevents running RSpec's own specs on TruffleRuby and causes the 20+ integration tests failures mentioned above.

For example, just bundle install && bundle exec rake spec on truffleruby-head gives:

...
  6) rspec warnings and deprecations #warning it should behave like warning helper sets the calling line
     Failure/Error: DEFAULT_FAILURE_NOTIFIER = lambda { |failure, _opts| raise failure }

       Kernel received :warn with unexpected arguments
         expected: (/\/home\/eregon\/code\/rspec-support\/spec\/rspec\/support\/warnings_spec.rb:45/)
              got: ("WARNING: Message. Called from <internal:core> core/hash.rb:223:in `fetch'.")
       Diff:
       @@ -1 +1 @@
       -[/\/home\/eregon\/code\/rspec-support\/spec\/rspec\/support\/warnings_spec.rb:45/]
       +["WARNING: Message. Called from <internal:core> core/hash.rb:223:in `fetch'."]

Clearly, the right location to report to the user is .../support/warnings_spec.rb:45, not Hash#fetch in the core library.

So, can you merge this?

JonRowe commented 3 years ago

The purpose of RSpec::CallerFilter.first_non_rspec_line is literally in the name, reject rspec lines, internal Ruby lines are valuable information to developers and its not up to us to exclude them by default.

eregon commented 3 years ago

What about the last example (for warnings) I showed? Nobody wants: WARNING: Message. Called from <internal:core> core/hash.rb:223:in 'fetch'. Everybody wants: WARNING: Message. Called from spec/rspec/support/warnings_spec.rb:45.

RSpec::CallerFilter.first_non_rspec_line is used for warnings, and that Called from <internal:core> core/hash.rb:223:in 'fetch' comes from: https://github.com/rspec/rspec-support/blob/e16dbca59c0a6f0a0562153e3b28d86e674fef55/lib/rspec/support/warnings.rb#L30

So one alternative I see would be to increase the skip there. But by how much? How to know if Hash#fetch calls the block via yield (which adds 1 entry) or via block.call (which adds 2 stack entries).

In this specific case though, we can probably just use call_site = options[:call_site] || CallerFilter.first_non_rspec_line to avoid deepening the stack.

Essentially to know if core library backtrace entries belong to RSpec or not we'd need to look what's the first non-internal entry above. It's possible, but it seems a lot of complications not needed for this case.

Is RSpec::CallerFilter.first_non_rspec_line used by anything except warnings (or other things which only want to show user code location)? If not, I think there is no point to make it complicated, we want to point to a user code location for warnings, not to a core library location.

eregon commented 3 years ago

With these changes:

diff --git a/lib/rspec/support/warnings.rb b/lib/rspec/support/warnings.rb
index 380150b..93d6979 100644
--- a/lib/rspec/support/warnings.rb
+++ b/lib/rspec/support/warnings.rb
@@ -27,7 +27,7 @@ module RSpec
       #
       # Used internally to print longer warnings
       def warn_with(message, options={})
-        call_site = options.fetch(:call_site) { CallerFilter.first_non_rspec_line }
+        call_site = options[:call_site] || CallerFilter.first_non_rspec_line
         message += " Use #{options[:replacement]} instead." if options[:replacement]
         message += " Called from #{call_site}." if call_site
         Support.warning_notifier.call message
diff --git a/spec/rspec/support/caller_filter_spec.rb b/spec/rspec/support/caller_filter_spec.rb
index 4c7004c..b7738a8 100644
--- a/spec/rspec/support/caller_filter_spec.rb
+++ b/spec/rspec/support/caller_filter_spec.rb
@@ -5,7 +5,7 @@ require 'rspec/support/caller_filter'
 module RSpec
   describe CallerFilter do
     it 'can receive skip_frames and increment arguments' do
-      expect(RSpec::CallerFilter.first_non_rspec_line(1, 5)).to include("#{__FILE__}:#{__LINE__}")
+      expect(RSpec::CallerFilter.first_non_rspec_line(3, 5)).to include("#{__FILE__}:#{__LINE__}")
     end

     it 'returns the immediate caller when called from a spec' do

we still have an extra failure for bundle exec rake spec on TruffleRuby:

  1) RSpec::CallerFilter the filtering regex does not match rubygems lines from `require` statements
     Failure/Error: DEFAULT_FAILURE_NOTIFIER = lambda { |failure, _opts| raise failure }
       expected result to have changed to include "/home/eregon/code/rspec-support/spec/rspec/support/caller_filter_spec.rb", but is now "<internal:core> core/kernel.rb:234:in `gem_original_require'"
     # ./spec/rspec/support/caller_filter_spec.rb:67:in `block (4 levels) in <module:RSpec>'
     # ./spec/rspec/support/caller_filter_spec.rb:50:in `in_rspec_support_lib'
     # ./spec/rspec/support/caller_filter_spec.rb:60:in `block (3 levels) in <module:RSpec>'

RubyGems' require is actually filtered as intended, but what we see here, "<internal:core> core/kernel.rb:234:in 'gem_original_require'" is actually the original require, that RubyGems aliases to gem_original_require. That original Kernel#require is implemented in Ruby in TruffleRuby.

So what should we do about this spec then? Filtering that internal require only seems not pretty. I think the real intention behind such filtering is to show user code locations (as in non-rspec, non-core-library, non-rubygems), and that filtering core library entries makes sense. But if you prefer something like the diff above, it can probably work too. Not sure how to address that spec about require though (and it seems relevant to end users).

eregon commented 3 years ago

Another way to see this: RSpec::CallerFilter.first_non_rspec_line returns a single line. A single core library location is basically always useless (not enough context): it does not contain information relevant for an RSpec user. So I think the better fix is to make RSpec::CallerFilter.first_non_rspec_line ignore core library lines, because they are never relevant for the end user.

The existing backtrace format on CRuby already does this: it just repeats the next user/non-core location for core methods written in C.

@JonRowe @benoittgt Thoughts?

eregon commented 3 years ago

@JonRowe

you should be able to remove these lines with the relevant external back trace filter.

I don't think there is any possibility of customization/external filter for RSpec::CallerFilter.first_non_rspec_line.

Filtering backtraces shown by failed specs is a different thing, that is untouched in this PR. I think maybe you misunderstood what this PR was about. It's about giving a useful location to the user for e.g. RSpec warnings.

JonRowe commented 3 years ago

If this proves to be a problem for Ruby 3 I'll revisit it later on, for now I suggest you improve the Truffle Ruby implementation to match MRI.

eregon commented 3 years ago

I suggest you improve the Truffle Ruby implementation to match MRI.

OK, this is what was misunderstood. <internal: is the standard format for core library methods defined in Ruby. It's not a TruffleRuby thing, it is the backtrace format from MRI. So asking to change that in TruffleRuby is unreasonable. This is not an incompatibility of TruffleRuby, it is a bug of RSpec we are fixing (i.e., it can report useless locations for warnings in some cases as shown above). This bug happens to trigger test failures (kudos on testing that, example output), so fixing it also makes those tests pass on TruffleRuby. The bug just happens more frequently on TruffleRuby because more core methods are defined in Ruby. I don't think you can blame TruffleRuby for that (and anyway MRI 2.7 is already doing it as well).

RSpec needs to handle these changes in the backtrace. It only works now by luck because MRI 2.7 does not have many core methods defined in Ruby yet. If Hash#fetch would be defined in Ruby on MRI, it would break exactly the same way as on TruffleRuby. Of course RSpec should not depend on that but just handle both cases gracefully. BTW the <internal: formats seems to date from 1.9 or older: https://github.com/sinatra/sinatra/blob/91477e6dad7c9d8eb1c358e47b9d83b73d626f84/lib/sinatra/base.rb#L1212 and has always been used for methods defined in prelude.rb such as Thread.exclusive.

That backtrace format is already what CRuby 2.7 uses for quite a few core methods, as said in https://github.com/rspec/rspec-support/pull/426#issuecomment-697950203. For instance:

$ ruby -ve 'TracePoint.new(0)' |& cat
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
<internal:trace_point>:96:in `new': no implicit conversion of Integer into Symbol (TypeError)
    from -e:1:in `<main>'

$ ruby -ve 'Thread.exclusive { puts caller }' |& cat
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
-e:1: warning: Thread.exclusive is deprecated, use Thread::Mutex
<internal:prelude>:12:in `synchronize'
<internal:prelude>:12:in `block in singleton class'
-e:1:in `<main>'

(note the <internal:trace_point>:96 and <internal:prelude>:12 entries)

CRuby 3 uses it for more core methods, as more of the core library methods are defined in Ruby.

I've worked for 6 years on TruffleRuby and I've always paid a lot of attention to have compatible backtraces with CRuby. In fact, TruffleRuby's backtraces are far more similar to MRI, than e.g. JRuby or Rubinius backtraces. Initially, TruffleRuby hid core library methods defined in Ruby, much like C methods on CRuby, by repeating the next non-core's location, for maximum compatibility. This leads to silly backtraces, and so once the standard of <internal: was clear, TruffleRuby started using it.

From this experience, I am convinced this PR is the correct solution.

If you disagree, then please show me an example where it is useful to have something like: WARNING: Message. Called from <internal:core> core/hash.rb:223:in 'fetch'. instead of: WARNING: Message. Called from spec/rspec/support/warnings_spec.rb:45. or WARNING: Message. Called from <internal:core> core/kernel.rb:234:in 'gem_original_require' instead of WARNING: Message. Called from /home/eregon/code/rspec-support/spec/rspec/support/caller_filter_spec.rb

BTW, Kernel#warn(uplevel: n) already does this filtering. Kernel#warn(uplevel: n) is Ruby 2.5+ so I guess RSpec cannot rely on that yet. There is similar filtering in recent RubyGems to skip the RubyGems require as that gives no useful info, similar to <internal: backtrace entries.

eregon commented 3 years ago

@JonRowe @benoittgt Maybe I should open a new PR since it seems the purpose of this PR and what it fixes was not clear?

eregon commented 3 years ago

FWIW RubyGems skips <internal: for Kernel#warn, which is a very similar situation: https://github.com/rubygems/rubygems/blob/ce27b98272d5c37ebf45b2b1b66894699ae47d58/lib/rubygems/core_ext/kernel_warn.rb#L42 (disclaimer: it's my commit)

benoittgt commented 3 years ago

@eregon, @JonRowe is the lead maintainer. He has final words on PR and issues. So I will probably not argue anymore, I already shared my approval. 🙂