Resetting `ActiveSupport::CurrentAttributes` can be inconsistent due to hook ordering

pond commented 3 months ago

What Ruby, Rails and RSpec versions are you using?

Ruby version: 3.3.3 Rails version: 7.1 RSpec version: 6.1.3 (vs working, 6.1.2)

Observed behaviour

The reset behaviour occurs in between the suite's config.around :each (e.g. in spec_helper.rb) and any context/describe block's before :each ... (or more generally, before examples thereafter run). This breaks any tests which had current attributes set up in the config, upon which tests subsequently rely. This bug is the reason for https://github.com/ErwinM/acts_as_tenant/issues/338.

Expected behaviour

We strongly suggest that the reset should happen either after all other "each"-style example hooks have concluded, or before any "each"-style example hooks start. It should never happen invisibly in between "each"-style hooks.

The inclusion of ActiveSupport::CurrentAttributes::TestHelper within RSpec::Rails::RailsExampleGroup is not IMHO all that wise, because it can only work if the implementation therein has a particular defined way of doing something that's directly compatible with any other callback chains and callback ordering in the wider scope of tests. That's simply not the case, as we can see. The ActiveSupport implementation seems to be over-zealous, resetting both before and after examples run (from RSpec's perspective):

  def before_setup
    ActiveSupport::CurrentAttributes.reset_all
    super
  end

  def after_teardown
    super
    ActiveSupport::CurrentAttributes.reset_all
  end

...and it seems to me that _just the after_teardown callback_ is all you actually need to achieve the functionality that #2752 desired, without breaking existing tests; you could simply do that directly inside RailsExampleGroup in place of include ActiveSupport::CurrentAttributes::TestHelper. This would get around a lot of issues I suspect, but it is clearly still not perfect - you are not controlling callback order here - there's still an edge case chance that someone's spec_helper.rb might have its own after-example code in a config.around :each or config.after :each which expects to do things with whatever is expected to be still inside CurrentAttributes, other than just resetting. That's why we recommend making sure somehow that this action is either done first in the callback chain (or at least before any "user defined" callbacks run), or last, after any "user defined" callbacks, per example.

Can you provide an example reproduction?

Yes. A tiny stripped down almost-Rails application with README.md containing additional information and a replication test case is included.

rspecbug.tar.gz

pirj commented 3 months ago

Have you checked if the solution to clean only in after_teardown works?

pond commented 3 months ago

@pirj I hadn't, given the code analysis seemed so clear-cut, but I just hacked that into the "live" gem via bundle open:

# ...
      if ::Rails::VERSION::MAJOR >= 7
        include RSpec::Rails::TaggedLoggingAdapter
        def after_teardown
          super
          ActiveSupport::CurrentAttributes.reset_all
        end
        include ActiveSupport::ExecutionContext::TestHelper
      end
# ...

(...which is ugly as sin and I'm not suggesting that code formatting style - comments would help, too!) and then:

$ bundle exec gem list rspec-rails

*** LOCAL GEMS ***

rspec-rails (6.1.3)

$ bundle exec rspec -f d                   

Reset of ActiveSupport::CurrentAttributes
==> RSpec: config.around :each
++> Class: Setting 'Current.canary' to "spec_helper.rb \"config.around\""
--> Tests: before :each
++> Class: 'Current' WAS RESET
  works

Finished in 0.00118 seconds (files took 0.87574 seconds to load)
1 example, 0 failures

...so that looks pretty successful, but do note the caveats that this doesn't really fix the problem, it just pushes it into a more unlikely edge case.

pirj commented 3 months ago

Will it push the problem even further if we use append_after?

It’s not the first time we have hook order issues.

pond commented 3 months ago

@pirj

Sorry for delayed response - timezones; i'm in New Zealand.

Callback ordering can indeed be a royal pain! You need to put something on the tail of the around-each callback chain - since around-each is the "outermost" (according to this reference). In pseudocode, what we need to achieve is akin to:

append_to_tail_around :each do | example |
  example.run()
  ActiveSupport::CurrentAttributes.reset_all
end

RSpec Core should never be depending on anything to do with Rails' current attributes, and any "user" code that tries to append after RSpec Core & RSpec Rails's own callback chains is just asking for trouble anyway. Going to the end of the after-each callback chain is probably not sufficient because the around-each hooks run after those, which is why I think you need to append to the tail of around-each and that should be sufficient. You need to be the "outermost Russian Doll". Beware, though, I'm very much not an expert in the implementations of the RSpec family and, depending on how its internal callback chains actually map to what the API exposes as around-vs-after-hooks, append_after might be fine.

pond commented 3 months ago

NB - If you wish I could try to work up a PR for this. The thing that I hesitate over is testing - I've no idea how to "prove" something breaks before, and works after this fix, given I'd need to do a config-level before-each then an example level before-each and can't figure out how to do emulate that within RSpec Rails's own test suite.

pirj commented 3 months ago

Right. So it’s a matter of the right loading of RSpec.config blocks that define config.around hooks. But we config.include specific example groups, and some of them define hooks](https://github.com/rspec/rspec-rails/blob/8c17b4e5020a4d264e8a79e294c58b5c1ef2b005/lib/rspec/rails/example/system_example_group.rb#L173). An around defined in such way would run before an around defined directly with config.around like it is in acts_as_tenant, so this won’t help.

You can check the ‘features’ directory, there are some examples on how to define config-level hooks.

JonRowe commented 3 months ago

A couple of things here, firstly rspec-rails is a way to bring test helpers and behaviour from Rails across into RSpec, this means that we try to adopt the same behaviours as testing apps in Rails own test helpers [which are minitest based], so this behaviour is an unfortunate side effect rather than serious bug in my opinion.

We implement minitest's life cycle like this:

    module MinitestLifecycleAdapter
      extend ActiveSupport::Concern

      included do |group|
        group.before { after_setup }
        group.after  { before_teardown }

        group.around do |example|
          before_setup
          example.run
          after_teardown
        end

This is because of our own precedence around hooks, around happen first because they can skip the example and don't have access to the instance, and before / after happen in the context of the hooks. See the documentation for around for warnings: https://rspec.info/features/3-13/rspec-core/hooks/around-hooks/

WARNING: around hooks do not share state with the example the way before and after hooks do. This means that you cannot share instance variables between around hooks and examples.

WARNING: Mock frameworks are set up and torn down within the context of running the example. You cannot interact with them directly in around hooks.

WARNING: around hooks will execute before any before hooks, and after any after hooks regardless of the context they were defined in.

So the problem is actually using an around hook to set state, really it should be before/ after to respect both our and Rails semantics... [this behaviour would also exhibit in Rails if they had the same concept as around hooks]

pond commented 3 months ago

@JonRowe I'm sorry but that's just not a good answer.

You're talking about RSpec Rails invisibly resetting some state in between hooks and doing this before a test runs. If you at least only did it after, that might not be so bad.

Telling people they're "holding it wrong" because we don't like the fact they trip over a bug the code inadvertently introduced on a patch version and then trying to make up rules about when hooks should or should not be used - which contradicts existing documentation no less [1] - is not an answer and helps nobody. As a community of developers working together on the wonderful tool that is RSpec Rails, I am sure that we can do better.

[1] "This lets you define code that should be executed before and after the example" - for example, we might want to set login state in CurrentAttributes before all examples, then reset CurrentAttributes after all examples, and because RSpec Rails didn't in 6.1.2 or all previous versions do this, every single person ever who wanted this would according to the documentation use an "around" hook. Then, a patch version has arisen which breaks this and changes the rules. That's totally against semver; if you're going to make a breaking change like that, you need to call it RSpec Rails 7 and I'd be 100% supportive.

What we have here is a bug that's breaking people's tests, is trivially easy to fix, and you have someone offering to do that work for you. Surely, the answer is "yes please".

Failing that:

This is a serious breaking change to how hooks worked in all RSpec Rails versions prior for the whole time this gem has existed, and Ruby gems use semver, so call it RSpec Rails 7.
Or just revert that PR so we don't reset current attributes, and call it 6.1.4 with this condition documented as a caveat, since again, that's how RSpec Rails has worked for all time previously.

pirj commented 3 months ago

One important thing to note, which is partially highlighted in those warnings, too is:

the syntax of around is similar to that of before and after but the semantics are quite different. before and after hooks are run in the context of the examples with which they are associated, whereas around hooks are actually responsible for running the examples

This suggests that

  config.around(:skip_multitenancy) do |example|
    ActsAsTenant.without_tenant do
      example.run

runs current_tenant too early.

It is understandable that it's totally unexpected for the ActsAsTenant.unscoped? to be reset along with all Current attributes set in config.around.

My suggestion is to explore how config.around is implemented, and if we don't have a ready-to-use surround_around, try to hack it with using hooks.register directly, or attempt to mimick how define_built_in_hooks and replace this include which happen to run in scope of the example itself with reset_all that would run outside of without_tenant.

@pond since we have plenty of options here, I strongly suggest you to explore them. We can distribute the blame retrospectively after we fix the problem. A PR is welcome, and we'll be happy to provide any support on your way to fixing it.

pond commented 3 months ago

It's really a question of whether or not we obey semver. The simple unfortunate fact is: a patch version broke working client code that's not doing anything the docs prohibit.

So, we can introduce a breaking change in a patch version and say "forget semver" - which I'd rather strongly object to, NPM shows us the hellscape that we enter if we do that! - or we can come up with a way to resolve the breaking changes. It won't just be our test suite or anyone using ActsAsTenant having issues here and it took us a long time to figure out what was wrong since, of course, you absolutely do not suspect going from 6.1.2 to 6.1.3 of just about anything to have caused the issue, and the manifestation was so bizarre - one minute in our test suite something is set, the next minute it's just gone, and this was invisible to our code.

Very hard to debug.

The short-term fix here would honestly be to keep things simple - remove this change and don't reset current attributes. Then the community can discuss how it might want to approach that, either by changing docs, version number strategies, etc. etc. and we can move forward without RubyGems holding a landmine that people might step on.

In the mean time I can produce a PR, but Jon has been pretty clear that he wants things to just call existing Rails code so I'm loathe to spend time on this unless he's open to the idea of seeing what the code looks like. The gem's his baby after all.

pirj commented 3 months ago

just call existing Rails code

"just" is an exaggeration. If you glance over adapters or fixtures, you'll see it's not always that simple.

So please proceed with the PR with no doubts.

pond commented 3 months ago

So please proceed with the PR with no doubts.

OK, no worries! I'll get onto it (and do the best I can to keep within the spirit of Jon's concerns).

pond commented 3 months ago

@JonRowe / @pirj - I've implemented https://github.com/rspec/rspec-rails/pull/2774 which tries to keep within the spirit of implementation Jon describes and provides a compromise solution that should work for pretty much everyone. Details in the PR. Feedback very welcome! If the implementation isn't controversial, I can imagine that the test code might be...

Should I be updating the topmost 'development' section of Changelog.md in PRs too, or is that done by whoever merges stuff?

pirj commented 1 month ago

This seems related, but the related PR stalled.

rspec / rspec-rails