How do I detect whether the full suite ran or just a subset?

JoshCheek commented 5 years ago

Subject of the issue

I am recording data from the tests that run:

screen shot 2019-02-04 at 2 05 56 pm

Then I use it to generate examples in the README.

52233851-c60ccd80-2885-11e9-86ea-e842c1ebc456

If a subset of tests run (eg because I'm filtering to a tag or only running one or two files), then it won't have seen the requests come through for some of the data. In that situation, it won't be able to render the README b/c it will ask to show data from an example that wasn't run.

I don't want to try and generate it and then abort if it asks for unavailable examples, because that could mask a spelling error or a missing test.

So I'm deciding whether to generate the README based on whether all the tests ran or whether they were filtered to a subset. The code I use to do that is kinda shitty:

It worked for the examples I threw at it, but I'm not totally sure it means what I think it means
It violates encapsulation, digging into the guts of RSpec.configuration to directly access an ivar.

RSpec.configure do |config|
  # Generate the readme.
  config.after :suite do
    # We only want to run this code if all the relevant examples were recorded
    # So check that it was run against the root spec dir
    to_run = RSpec.configuration.instance_variable_get(:@files_or_directories_to_run)
    raise "Failed sanity check, has the private API changed?" unless to_run
    next unless to_run.include? "spec"

    # ...
  end
  # ...
end

However, I looked around for an hour or so and haven't been able to find a better way to do this. I would expect numerous tools to find this feature useful. Eg code coverage tools could choose to not run when the suite is filtered (because untested lines may turn out to be tested when the full suite runs).

So, if there's a way to do this, I'd appreciate a pointer. And if not, then I propose it's worth considering as a feature :)

Your environment

$ ruby -vne $'puts $\' if (/^GEM/../^PLATFORMS/) && /rspec/ && /^    \\b/' < Gemfile.lock
ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-darwin17]
rspec-core (3.7.1)
rspec-expectations (3.7.0)
rspec-mocks (3.7.0)
rspec-rails (3.7.2)
rspec-support (3.7.1)
spring-commands-rspec (1.0.4)

JonRowe commented 5 years ago

:wave: There are a few problems here.

The foremost being that we don't actually know if we've run the full suite, as there is no guarantee all spec files live in one folder, or that we've been given all the files.

Even if we're running the default folder, we might be filtering within that folder. If you focus one of your tests now, or use tags etc, your code would accidentally run here.

So solutions? I can see two ways to provide something close to what you want here.

We could expose wether the filters have had any effect? I.e based on the information given are we excluding any specs? This would still probably be hard for when being given folders etc, though, our only check would be to say is this folder the same as the configured base? It's still not 100% "right".

You could parse and check the example persistence file to see if the example count matches the ones previously. We take care of merging the previous run and doing a best effort combination with the current status, we could potentially say if this is a full run based off that but its potentially still error prone...

I kind of feel for your purposes parsing that file would give you an example count to compare to the ones run... Equally, you could adopt a similar strategy whereby old runs are combined together before building your docs.

JoshCheek commented 5 years ago

Great response, thank you! And yeah, what you said about not really knowing makes sense.

I got it to work by doing what you suggested and reading the persistence file. I'll put it here in case anyone else finds this and wants to know how I did it:

RSpec.configure do |config|
  config.example_status_persistence_file_path = "tmp/rspec_examples.txt"

  # Generate the readme. Run in at_exit so that it prints after RSpec's summary
  config.example_status_persistence_file_path = "tmp/rspec_examples.txt"
  config.after :suite do
    at_exit do
      next unless config.example_status_persistence_file_path
      current  = config.world.all_examples.map(&:id)
      previous = RSpec::Core::ExampleStatusPersister
                   .load_from(config.example_status_persistence_file_path)
                   .map { |e| e[:example_id] }
      if (previous - current).empty?
        # We inject the auth token because the real one changes every time you
        # authenticate, which means the generated files will change every time you
        # run the test suite, which spams the diff with irrelevant random changes.
        auth_token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE1NDkzNzcyODksImlkIjozMDl9.aatooRSNEDy-gkcTjSX7ZtRb1mIPGayn6pMYpz0iJVw"
        context    = Support::RenderReadme.new Support::RenderReadme.saved, auth_token: auth_token
        template   = Rails.root/'README.erb.md'
        readme     = Rails.root/'README.md'
        readme.write ERB.new(template.read).result(context.template_binding)
        message    = "Readme generated\n\n"
      else
        message    = "Readme skipped because some tests weren't run\n\n"
      end
      message = "\e[A" + message.gsub(/^(.*?)(\n|$)/, "\e[34m\\1\e[39m\\2") if $stderr.tty?
      $stderr.puts message
    end
  end

  # ...
end

JonRowe commented 5 years ago

@JoshCheek I'm happy to help refine this a bit more if we can help make an easier API for this.

Your implementation above has an edge case here if you add or remove a spec, it'll take until the next spec run to pick them up.

Have you also checked that its not a false positive of running all the time? I'm unsure when at_exit will run compared to our own persistence code, it might be checking the same runs.

JoshCheek commented 5 years ago

Hey, I wrote all this up, so I guess I'll leave it to document what I thought through. But really, the last 2 paragraphs are the relevant ones. Basically, I'm doing it wrong 😝

Super appreciate your feedback and willingness to think through it with me!

Your implementation above has an edge case here if you add or remove a spec, it'll take until the next spec run to pick them up.

Have you also checked that its not a false positive of running all the time? I'm unsure when at_exit will run compared to our own persistence code, it might be checking the same runs.

So it seems to update the index before my at_exit hook runs. It also merges its results into that file (vs replacing the file with the results of the last run). So a file does need to be run for its tests to show up in there, which does mean that a filtered first run will mistakenly think it is the full suite.

After the full run, though, it seems to pick up deleted files right away and seems to detect removed tests when it runs that file next.

This is probably adequate, it feels +90% correct. If a test is removed in a file I ran, it will be removed from the index, everything is correct. If it's in another file and the file is removed, the same thing will happen (based on an experiment, but I haven't verified this rigorously). If it's in a file and that file didn't run, then it will be in the index still, but that's okay because any other tests in that file won't have run either (the edge case is only if there are no other tests in that file, but it seems very edge and should be fixed on next full suite run).

If a test is added to a file I run, the index will be correct.
If a test is added to a file I don't run, or a new file is added and not run, then the index will be incorrect. But I can't think of a reason I'd add a test and not run it, and it'd have to also be necessary for README generation. Yeah, in this case, I'd have ran it, I don't think we can arrive in this state in reality.
If a test is removed from a file I run, the index will be correct.
If a test is removed from a file I don't run, the index will be incorrect, it will not generate the readme, but unless that was the last test in the file, it wouldn't have generated the README, even if the index was correct. And if it was the last test in the file, I'd have deleted the file. So it's hard to imagine this causing incorrect behaviour in reality, as well.

So, other than a filtered first test, I think all other ways it can be incorrect are negligibly likely to occur in reality.

@JoshCheek I'm happy to help refine this a bit more if we can help make an easier API for this.

I'll admit I had to sit down and think through a mental matrix after reading your comment. And it does feel a bit fragile. If it really can't be accurately determined, then I wouldn't want to move prediction into RSpec. However maybe making it easier to access this diff would be good. That would protect my code from changes in how that file is handled / when that file is handled.

I started trying to think through it, imagining receiving a list of the union of examples that were in the file and that were run (with a key for old and new values, if new is nil, its deleted, if old is nil, its added) but it very quickly spread from there and then I'm realizing I should maybe be listening to a notification or emitting a notification, and then I wanted to add a new notification and update the formatter...

Yeah, it jumped the shark pretty quick, TBH. Made me step back a bit and I realized the problem was that I'd coupled too many things together. What I should really do is maintain my own index of recorded data. After the tests run, I can update that index. Then, I should make a separate piece of code to generate the README from the index. If I want to, I can hook that into the default rake task or a git hook or maybe set an env var to tell the process to generate it. That seems much simpler, much more flexible, and doesn't require RSpec to add facilitating features.

JonRowe commented 5 years ago

I might have been sneakily hoping you would use it as inspiration for a feature powered externally ;)

What I would do is record a similar index of snippets; so when you encounter them you build a list of them, and update those examples as you run them, then using a similar merging strategy to what we do you'd get the whole set eventually and can always upload / diff it.

JoshCheek commented 5 years ago

I might have been sneakily hoping you would use it as inspiration for a feature powered externally ;)

Ahh, lol. Did you have a specific feature in mind, or was it more goal oriented (eg exposing the persistence data or file to users, or detecting full suite vs partial suite runs).

What I would do is record a similar index of snippets; so when you encounter them you build a list of them, and update those examples as you run them, then using a similar merging strategy to what we do you'd get the whole set eventually and can always upload / diff it.

That's really congruent with what was in my brain, I'll check out your merging strategy, I saw it was there but haven't looked at how it works yet.

JonRowe commented 5 years ago

I just think this is something that is better off being externally managed, albiet in a similar fashion to us, as it gives you better control over it

rspec / rspec-core

How do I detect whether the full suite ran or just a subset? #2597

Subject of the issue

Your environment