rspec / rspec-core

RSpec runner and formatters
http://rspec.info
MIT License
1.22k stars 763 forks source link

Make --random the default in rspec 3 #635

Closed myronmarston closed 12 years ago

myronmarston commented 12 years ago

The --random option is awesome and I use it on all my projects. I imagine the main reasons it's not universally used is that it's relatively new (and thus, lots of folks don't know about it) and on older projects that do have accidental ordering dependencies, it could take a bunch of effort to fix all the ordering dependencies.

When we added it in a minor rspec 2.x release it made sense not to have it on by default (since it could have caused lots of breakage for people when upgrading) but I think we should make it the default in rspec 3.

Thoughts?

dchelimsky commented 12 years ago

I actually think it's not a good default because it's confusing for someone who is unaware. I'd rather raise awareness by mentioning it in the readme and possibly making it the default in generated artifacts (like spec/spec_helper.rb), but not within rspec-core itself.

alindeman commented 12 years ago

It's the default for minitest, and has definitely saved me from a non-obvious bug or few.

I think it's a good idea: I'd maybe rather folks be a little more confused upfront than write bad specs? Or is that not a good position to hold?

myronmarston commented 12 years ago

I guess it's partially a question of the intention of the defaults: are they the settings recommend by the rspec core team? Are they meant to be optimized to reduce confusion for newcomers? Something else?

dchelimsky commented 12 years ago

Generally, I think defaults should follow the principle of least surprise.

justinko commented 12 years ago

I'm afraid if we don't introduce randomization as the default, it will never be widely adopted. Users will not "seek out" the different ways to order their examples...

Many users are not even aware of "order-dependent tests"; I wasn't until Myron introduced this potential feature for RSpec many months ago.

pjg commented 12 years ago

I'm +1 on making random the default. I feel like enforcing better/more reliable tests is the right approach. Even though it's harsher on the user... and will definitely introduce a wave of "My specs are failing randomly after upgrading to latest RSpec" posts/questions.

probonogeek commented 12 years ago

Randomness breaks the "specs as documentation" that you get when running specs in nested format (which I do a lot) and I also think it makes debugging all that much harder. We've been converting over nearly 3000 specs from rspec1 to rspec2 (yes, I know we are late) and I would have flipped out if I had to deal with random specs every time. I would agree with David, teach users about the functionality, don't force it.

rodrigotassinari commented 12 years ago

+1 on making random the default, same reasons of @pjg

revans commented 12 years ago

I also think random order should be the default for rspec. I use it on all my projects.

I think it violating principle of least surprise is debatable. As a testing framework, having it not test order-dependency violates that principle for me. It [rspec] should be guiding the users/developers towards best practices.

wallace commented 12 years ago

+1 for random default

chrismcg commented 12 years ago

There's two ways I could be surprised here.

Say random is on by default, I could be writing a spec and it fails because I accidentally introduced an order dependency. This may be confusing to me until I understand specs are been run in a random order but the failure happens close to where I was coding so I can hopefully fix it straight away.

Say random is off by default. I'm writing the same spec with the same order dependency but I don't trigger a failure because specs are running in the same order. I'm happy with my feature and some months later something happens to trigger a different spec order, now the spec blows up but it's not going to be as clear why or as easy to find and fix.

The second scenario has happened to me months and even years later, so I'd like to see random being on by default so I don't have to switch it on myself.

rosenfeld commented 12 years ago

-1 - for some integration tests it makes sense to me to rely on prior tests to be successful.

Take a look at this discussion I've started on Buster.js as well:

https://github.com/busterjs/buster/issues/194

jwo commented 12 years ago

I am team-random on this one.

I've had specs only work in a certain order before (my fault, obviously), and I like that minitest randomizes the order for you. And, I'm one of the people that didn't even know you could randomize rspec until recently -- that's why defaults are so important.

I submit that the test for defaults shouldn't be, "keep the number of complaints to a minimum", it should be "have the best product" In my opinion, that's randomizing the tests to promote test isolation.

nwjsmith commented 12 years ago

+1 for random default. RSpec already encourages good tests, this would just further that.

jrgifford commented 12 years ago

+1 for random default.

tonyc commented 12 years ago

I like @dchelimsky's idea, new projects will have it as a sensible default (and people won't have order-dependent tests), but "legacy" users will still get their old, order-dependent tests passing on an upgrade. Seems like a reasonable balance.

I can see both sides of the issue - I like the idea of making sure your tests aren't order-dependent, but wouldn't it make it harder to catch ordering-related test failures if each run is randomly ordered?

On the surface, it seems that making specs randomly ordered aren't a guarantee of order-independency (If that makes sense). It seems like it could also lead to "phantom" CI failures if an obscure order-related bug is introduced that fails a build, and then everybody else thinks tests are running clean.

pjg commented 12 years ago

BTW I think that after making random the default, an option like --ordered could be added, which should satisfy the naysayers ;)

sobrinho commented 12 years ago

-1

I think that should be an option, but not the default behavior, following the principle of no surprises by default.

DCarper commented 12 years ago

Generally speaking I think things should default to best practice.

I think the ordering should default to random. Is anyone going to argue that it's not better going forward? If it breaks old test suites and people don't want to fix the issues themselves, they can always specify the file system's ordering

Also, to me personally I don't think it would be out of place to include this as a configurable option in spec_helper as opposed to a command line option, lots of which is focused on formatting and specifying which examples to run. This might help to increase awareness of the option itself.

One other thought on making it part of spec_helper instead of the command line utility is that the order you want to run tests in probably doesn't change? You probably make the decision to run in random order or not. Whereas the output formatting / filtering / tags / things like debugger change nearly almost every time you use the rspec command.

rosenfeld commented 12 years ago

Integration tests can run really slow under some circumnstances. You can get your specs running faster if you assume some pre-conditions created by your previous examples instead of recreating that situation all over again (which could be quite slow) or by the use of lots of mocks (which can make your examples more complex and more error-prone as well).

cdemyanovich commented 12 years ago

@probonogeek How does random order break the documentation format for you? I just tried running specs with --format documenation --order random. It appears that example groups are randomized and then examples within the groups are randomized. Which examples go with which groups is preserved, though.

Since this change is proposed for a major release and not a minor or patch one, I favor making randomness the default. Hopefully, people are consciously upgrading to a major release after reading announcements and/or change logs and basing their decision on the value that the new release will add to their project.

darrinholst commented 12 years ago

+1 on random being the default. It would have helped us recently when our ci workspace got moved to another mount and caused the specs to be run in a different order which uncovered a test dependency. Would have been nice to fail fast.

srushti commented 12 years ago

What I would like to have is, to be able to rerun the specs in an order that just failed. Basically, otherwise when I screw up and I have test A screwing up the system, which causes test B to fail, the test results will only show test B failing while the problem will actually be in test A.

If I can re-run it in the same order it'll be easier to narrow down the problem.

hgmnz commented 12 years ago

+1. In general the framework should encourage good tests. By setting random to default, we are saying "You must write tests that teardown correctly"

myronmarston commented 12 years ago

What I would like to have is, to be able to rerun the specs in an order that just failed. Basically, otherwise when I screw up and I have test A screwing up the system, which causes test B to fail, the test results will only show test B failing while the problem will actually be in test A.

You can already do this. Rspec prints out the random seed that it used at the end of the test run, and you can re-run with --seed 1234 (or whatever) to reproduce the bug exposed by the randomization.

jeremygpeterson commented 12 years ago

+1 to make it the default for new projects.

lmcalpin commented 12 years ago

+1 for random default. tests should not depend on order and if the framework encourages that, that's a good thing.

seancribbs commented 12 years ago

If you are setting up integration tests by the results of previous tests in the same group, you're doing it wrong and your suite will one day break spectacularly. What if you run an example in isolation to debug it, and then it breaks in a different way? I think that's more surprising than finding the second breakage earlier. Besides, we have shared contexts for examples that need the same setup.

Needless to say, I'm very :+1: for random by default, as long as the seed is properly set on JRuby automatically (in basho/riak-ruby-client I have to seed with the current time).

goshacmd commented 12 years ago

+1 for making --random the default.

lsegal commented 12 years ago

-1.

There's a disparity between "ideal conditions" and pragmatic results here. A perfect test suite that has no inter-dependency issues is the ideal case, and we should strive for that, but we don't need to assume it's "perfect code or bust". Furthermore, random by default punishes new users and doesn't really affect those who strive for perfection. Experienced users who care about independent tests can easily opt into --random, but new users will be seriously confused by tests that fail randomly. The comment about using --seed N to lock into a test run is nice, but I seriously doubt a new user would be able to figure this out on their own. Adding randomness to test runners completely throws "the element of least surprise" out the window.

There's also the problem of making tests more brittle by assuming they are independent. This causes maintainers to get false positives from users running specs on other machines, and also CI machines, at random intervals. The random part means the maintainer can't immediately identify if the failure is a legitimate regression, or an isolated test build issue that only affected that one user/build. That's a lot of extra strain on maintainers who often get bug reports and notifications about broken builds. This is why I personally run tests in ordered mode, but occasionally do a few local tests with --random to make sure the tests are well formed. That way, I can guarantee that what works locally will work everywhere while still cleaning up the tests if there are inter-dependency issues. I don't think a problem with your build environment should cause random failures unless you explicitly ask for that kind of testing. That just seems wrong to me.

I should also add (and reiterate what was mentioned above) that --random doesn't even guarantee that you will identify inter-dependency issues. It just means that at some point in the future, you may identify the issue. That point in the future may be minutes, hours or even days, and that's a problem, because you must always be ready to take time out of development to debug issues that might be completely unrelated to the code you are currently working on. I don't know if that's a good default behaviour.

DCarper commented 12 years ago

Encouraging bad test practice is wrong.

@isegal I define brittle exactly the opposite as you do. If I modify one test and that breaks another, that is extremely brittle to me, not the other way around ;/

Making it easier for the less experienced to write dependent and brittle tests is wrong.

Writing dependent integration tests for performance bonuses is also not right in my opinion.

Speaking to the "Defaults should follow the least surprises theory"... Walking that argument down the far end of the spectrum, if one doesn't want surprises, then one shouldn't be using 3rd party code altogether.

The only legitimate reason for not defaulting to random is breaking existing test suites at which point I say just specify the default ordering or don't upgrade.

JoshCheek commented 12 years ago

-1

Same reason as @probonogeek My --format documentation order gets mixed up. I sometimes rely on that ordering. E.g: 'it does a when A', 'it does b when B', 'it does c for everything else'

cmaujean commented 12 years ago

+1 for default randomness. randomizing test order ensures that tests aren't order dependent. "Least surprise" adherents should be reminded that the real surprise is finding out that your tests are order dependent when you weren't intending them to be. A simple "Tests will be run in random order by default" in the README should be enough to mitigate any surprise you might have, and reading the documentation is a key part of properly using any library.

On the down side, I love documentation mode, and generally have that set in my .rspec file. This would kill some of the value of that, unless everything was run randomly and then the documentation was spit out in the proper order. I would be happy with a compromise such as this.

rosenfeld commented 12 years ago

Let me do a wild guess here (although I truly believe the guess is valid): how many of you on the pro-random side are actually writing full integration specs for JavaScript intensive applications (like one-page apps) and are using jQuery live events?

For those under such scenario, I'd really appreciate an article on how you write full integration specs for your complex applications (with drag-and-drop, etc) while not relying on examples running order.

Also, I'm pretty curious on how much your integration test suite takes to complete so that you can run your tests in any order. How many seconds/minutes/hours would that be?

ahawkins commented 12 years ago

random default. Mini test has caught order dependent bugs before

rosenfeld commented 12 years ago

Then, suppose that you'll need about 1s for setting up a context (reload the page, set up database, etc) and have 10 single examples that would take about 1s to run in a certain order. You'd spend 10s to run the same examples just to make them independent from each other.

If I write my specs like these I'll stop writing specs as they would take forever to run my specs...

lsegal commented 12 years ago

@DCarper, we're talking about brittle in different contexts: running vs writing. The question is which do you want to optimize for? Given that tests are run far more often than they are written, I would think you'd optimize to have them run robustly. In other words, what you run is what you get. That tends to cause much fewer surprises in my experience. In your case, if you modify one test and it breaks another, you have at least isolated the inter-dependency to a specific set of tests. That would not be the case for --random, where two sequential runs could fail for wildly different reasons. If my tests are going to fail at all, I'd rather have them fail in the more deterministic manner-- and at least only when I've changed something, not just because I've re-run the tests a week later.

Also, not using --random does not imply that you are encouraging bad tests. It's not a black and white issue. As I pointed out, you don't need to tie your build process to your test results, you can ensure good tests in parallel with reliable test results.

lsegal commented 12 years ago

@rosenfeld you can write before blocks that perform your fixture setup for a group of tests. That shouldn't be affected by ordering.

justinko commented 12 years ago

FYI everyone, RSpec randomization isn't "true" randomization. The groups are randomized, and the examples within each group are randomized. To put it another way, examples are only randomized within their group.

If RSpec supported "true" randomization, before(:all) wouldn't work.

rosenfeld commented 12 years ago

I was not saying it was impossible. Just that it would be painfully slow for a JavaScript intensive application that also need lots of records to exist in the database...

justinko commented 12 years ago

Let me clarify why I said that.

If RSpec supported "true" randomization, it would break the documentation formatter. The current randomization implementation does not affect the formatters in any way.

DCarper commented 12 years ago

@lsegal we were talking about two different things. I still think that if test B passes when run after test A, but fails if run after test C, that's brittle.

It seems like people are saying, "Our tests are slow, we'll fix that by sacrificing test quality."

Performance aside, is anyone actually arguing that it's better practice to have order dependent tests?

I also love the argument that all of the extra false positives one will get from order-dependent tests represent far worse surprises than finding out rspec defaults to a random ordering at some point in the TDD / Rspec learning curve.

lsegal commented 12 years ago

@DCarper you seem to be arguing on boolean logic here. Writing good software always involves a balance of competing interests. The question isn't whether order dependent tests are or aren't better practice; I don't think you would get pushback on that question. The problem is that there are cost/benefit questions to all things that require extra effort, and not all projects prioritize this best practice as worth the effort of more complicated debugging. For example, documenting your code is also generally considered a "best practice", but most developers would not consider a tool that defaulted to asking you to document every method in your codebase as an acceptable default-- not because documentation is unimportant, but because the cost is usually much higher than the benefit.

The real question is whether --random effectively encourages the best practice while also making it easy enough to deal with when things go wrong. Is there any actual data in the wild about how randomized tests affect code quality? Note: not build quality, code quality. As I mentioned before, independent tests is a best practice that every developer should absolutely strive for-- but I'm not personally convinced that it's a practice developers should start with as a priority above other practices, like, say, spending that effort to improve test coverage rather than build quality.

DCarper commented 12 years ago

@lsegal I see what you're saying and I totally agree, all best pratices / rules / whatever, are definitely a matter of balance and contextual feasibility.

Random ordering isn't directly about code quality, it's about test quality which in turn affects code quality.

For the less experienced, it's encouraging good test-practice instead of less than good. If someone does reach the point where they they're making a conscience decision to trade quality for performance, that's fine, but at least let them opt into that decision.

I think defaulting to random is completely consistent with the Ruby / Rails way. Provide and push as many best practices for free and let the individual decide when they don't make sense.

radar commented 12 years ago

+1

I ran into an issue with Spree's tests just yesterday where one test was relying on another test running before it to set up data for it. I think this is a sign for a broken test.

Indicating to people that it's random during the run should be enough. A seed is also a great idea.

gicappa commented 12 years ago

-1

IMO rspec shouldn't teach anything or strive to inspire peoples' tests to perfection.

It should just be a awesome tool (as it is) for any kind of people: for the ones who use it right by the book but for all the others who mess up integration unit acceptance and whatever kind of test you name.

oldfartdeveloper commented 12 years ago

+1

Though I'm in sympathy with the contrarians that argue that it might be too costly in certain situations, I fall on the side that it will save over the long run in most cases. That being said, I agree that this "feature" needs to be documented prominently where noobe's will see it with an opportunity to opt out of --random.

I also like the idea of the default being assigned in spec_helper.rb

captainpete commented 12 years ago

Random by default catches more bugs, which is good. Random by default does not catch all the dependency bugs and it's dangerous to assume that it will, by default. This proposed default encourages folk to write tests that do not have side-effects. There's an opt out.

+1

dnagir commented 12 years ago

No, for local development. And here's why:

But I would say Yes for CI server since all the points above don't make sense there.

Most people use RSpec primarily as a TDD tool. Making random a default will make them less productive.

As a result, I think it should be off by default. There's no big deal enabling it.

I hope this won't become a drama similar to returning boolean in rails :)

radar commented 12 years ago

No @dnagir, I (respectfully/rspectfully?) disagree with everything you said. Here's why.

least surprise - I want to see specs in the order I write them

You only want to do this because you're used to this now. Personally, I don't alphabetise my development. One day I'll be working on product_spec.rb and then the next adjustment_spec.rb. Alphabetisation is not how normal people work. Your argument is silly.

lost time to match output to the specs, I'll have to spend more time to "fix" randomisation

This is time you would inevitably lose anyway because your specs are poorly written. Consider the randomisation as battle testing for your specs. If they are written as they should be, then they would not fail based on the order. Read this very carefully: No spec file should depend on setup from another spec file to execute. They should be executable individually.

unreliable failures/successes - yes, you'll find the shared state issues, but the time from previous point is longer than this, not worth it.

With the --seed option you will be able to run the specs in an identical order, allowing you to track down failures caused by random specs.

slower TDD - because we have to deal with random output

How? The same specs will run, it'll only be the order that's changed. By changing the order of three operations, taking 1, 3 and 5 seconds long each, doesn't magically make it take any more/less than 9 seconds. Basic math, really.

But I would say Yes for CI server since all the points above don't make sense there.

I actually agree with this point. I want my shit to break on CI because CI is sanity-checking my code and preventing "Works on my machine!" style responses.

Most people use RSpec primarily as a TDD tool. Making random a default will make them less productive.

Simply untrue. Time will be saved by people learning to write specs properly and not spent tracking down failures in their specs because this one time they chose to run that spec file individually rather than with others.

As a result, I think it should be off by default.

Strongly disagree. There'd be much table flipping from this side if the "off-by-default" crew got their way.

There's no big deal enabling it.

Well, actually... GO TO LINE 1.

I hope this won't become a drama similar to returning boolean in rails :)

I hope also, but if the opposing team keeps coming up with such ridiculous arguments then I can see it heading that way for sure.