apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Extract a generic framework for running randomized tests. [LUCENE-3492] #4566

Closed asfimport closed 12 years ago

asfimport commented 13 years ago

The work on this issue is temporarily at GitHub (lots of experiments and tweaking): https://github.com/carrotsearch/randomizedtesting

Or directly: git clone git://github.com/carrotsearch/randomizedtesting.git


RandomizedRunner is a JUnit runner, so it is capable of running `@Test`-annotated test cases. It respects regular lifecycle hooks such as `@Before`, `@After`, `@BeforeClass` or `@AfterClass`, but it also adds the following:

Randomized, but repeatable execution and infrastructure for dealing with randomness:

Thread control:

Improved validation and lifecycle support:



Migrated from LUCENE-3492 by Dawid Weiss (@dweiss), resolved Feb 14 2012.

Attachments: Screen Shot 2011-10-06 at 12.58.02 PM.png

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Static fixtures couldn't be handled with a rule, so I've decided to rewrite the JUnit runner instead of subclassing it. Lots of frustration so far, but I like the result :)

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

I've implemented a runner that follows the basic algorithm given in #4563. Basically speaking, seeds for each test run are fixed derivations of a single master seed (used for the runner and all class-level fixtures) and don't rely on the order of invocations or other factors.
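To make the "fixed derivation" idea concrete, here is a minimal sketch. It is not the runner's actual derivation: the class `SeedDerivation`, the method `seedFor` and the choice of the MurmurHash3 64-bit finalizer as the mixing step are all assumptions made for illustration. The point is only that each test's seed is a function of the master seed and the test's name, so the order of invocations cannot change it.

```java
import java.util.Random;

/** Hypothetical illustration: per-test seeds derived from a single master seed. */
final class SeedDerivation {
    /** Bit scrambler (the MurmurHash3 64-bit finalizer), so similar inputs give unrelated seeds. */
    static long mix(long z) {
        z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
        z = (z ^ (z >>> 33)) * 0xc4ceb9fe1a85ec53L;
        return z ^ (z >>> 33);
    }

    /** The seed depends only on the master seed and the test name, never on call order. */
    static long seedFor(long masterSeed, String testName) {
        return mix(masterSeed ^ mix(testName.hashCode()));
    }

    public static void main(String[] args) {
        long master = 0xDEADBEEFL;
        // Ask for the seeds in two different orders; the results must match.
        long a1 = seedFor(master, "testA");
        long b1 = seedFor(master, "testB");
        long b2 = seedFor(master, "testB");
        long a2 = seedFor(master, "testA");
        System.out.println(a1 == a2 && b1 == b2); // prints "true"
        // The same seed reproduces the same random data.
        System.out.println(new Random(a1).nextInt(100) == new Random(a2).nextInt(100)); // prints "true"
    }
}
```

Because the derivation never looks at shared mutable state, running a single test in isolation sees the same seed as running it inside the full suite.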

There are plenty of ways to tweak and tune this by overriding the seed with a class-level or method-level @Seed. @Repeat gives you control over how many times a given test is executed and whether the seed is reused (constant for each iteration) or randomized (derived predictably from the start seed).
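The two repetition modes can be sketched in plain Java. This is only a model of the behaviour described above, not the runner's code: `RepeatSeeds`, `iterationSeeds` and the use of `java.util.Random` to chain seeds are assumptions.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical model of repeated execution with a constant or a derived seed per iteration. */
final class RepeatSeeds {
    static long[] iterationSeeds(long startSeed, int iterations, boolean constantSeed) {
        long[] seeds = new long[iterations];
        // A generator seeded with the start seed yields the same chain on every run.
        Random chain = new Random(startSeed);
        for (int i = 0; i < iterations; i++) {
            seeds[i] = constantSeed ? startSeed : chain.nextLong();
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Constant mode: every iteration sees the start seed.
        System.out.println(Arrays.toString(iterationSeeds(7L, 3, true)));
        // Randomized mode: seeds differ per iteration but repeat across runs.
        System.out.println(Arrays.toString(iterationSeeds(7L, 3, false)));
    }
}
```

In both modes a failure in iteration N can be replayed from the start seed alone.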

Most of all, everything fits quite nicely in Eclipse (and I hope other GUIs... didn't check Idea or Netbeans though) because each executed test run is nicely described in the runner (full seed), so that you can either click on it and re-run a single test or write down the seed and fix it at runtime.

Lots of TODOs in the code, will continue in the evening.

asfimport commented 13 years ago

Shai Erera (@shaie) (migrated from JIRA)

This is only for debugging from an IDE right? It does not replace tests.iter and tests.seed?

It looks very cool.

It also adds a risk that someone will accidentally commit tests with these annotations. So perhaps we should add pre-commit hooks, or a test that scans all test files and ensures those annotations do not exist?

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Hi Shai. This is definitely not only for debugging. For example, we use randomized testing inside CarrotSearch to test algorithmic/combinatorial code. Once you hit a bug, you simply copy the test case (or a call to a common test case method) and fix the seed to have a regression test for the future (so that you know you're not failing examples that previously failed). So, for example:

@Test @Seed("23095324")
public void runFixedRegression_1() { doSomethingWithRandoms(); }

@Test @Seed("239735923")
public void runFixedRegression_2() { doSomethingWithRandoms(); }

@Test
public void runRandomized() { doSomethingWithRandoms(); }

This is a scenario I really came to like. It's a bit like your tests write themselves for you :)
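The same pattern can be tried without any test framework. The sketch below is an assumption-laden stand-in: `RegressionBySeed`, `insertionSort` and `checkWithSeed` are hypothetical names, the code under test is a toy sort, and the seed values from the example are interpreted as plain decimal longs.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical stand-alone version of the "fix the seed, keep the test body" pattern. */
final class RegressionBySeed {
    /** Toy code under test: a simple insertion sort on a copy of the input. */
    static int[] insertionSort(int[] input) {
        int[] a = input.clone();
        for (int i = 1; i < a.length; i++) {
            int v = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > v) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = v;
        }
        return a;
    }

    /** Shared randomized body: all test data is a function of the seed. */
    static boolean checkWithSeed(long seed) {
        Random rnd = new Random(seed);
        int[] data = new int[rnd.nextInt(50)];
        for (int i = 0; i < data.length; i++) {
            data[i] = rnd.nextInt(1000) - 500;
        }
        int[] expected = data.clone();
        Arrays.sort(expected);
        return Arrays.equals(expected, insertionSort(data));
    }

    public static void main(String[] args) {
        // Fixed seeds: regression tests for inputs that mattered in the past.
        System.out.println(checkWithSeed(23095324L));
        System.out.println(checkWithSeed(239735923L));
        // Fresh seed: report it so that a failure can be turned into a fixed regression.
        long seed = System.nanoTime();
        System.out.println("seed=" + seed + " ok=" + checkWithSeed(seed));
    }
}
```

Reporting the seed on every randomized run is what makes the "copy the test and fix the seed" step possible.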

I left system properties for fixing seeds and enforcing the number of repetitions because they are currently in Lucene, although I personally don't like them that much (because they affect everything globally). I do understand they're useful for quick hacking without recompiling stuff or for remote executions, but I'd much rather have something like -Dseed.testClass[.method]=xxxx which would affect only a single class or method rather than everything. The same can be done for filtering which method/test case to execute. This is debatable of course and a matter of personal taste.
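A scoped lookup along those lines could look roughly like this. It is a sketch of the proposal only: the `seed.<class>[.<method>]` key format follows the suggestion above, the fallback to the global `tests.seed` property is an assumption, and `ScopedSeedLookup` and `resolveSeed` are hypothetical names.

```java
import java.util.Properties;

/** Hypothetical resolver: most specific seed property wins, global property is the fallback. */
final class ScopedSeedLookup {
    static String resolveSeed(Properties props, String className, String methodName) {
        String classKey = "seed." + className;
        String methodKey = classKey + "." + methodName;
        String value = props.getProperty(methodKey);   // e.g. -Dseed.TestFoo.testBar=...
        if (value == null) {
            value = props.getProperty(classKey);       // e.g. -Dseed.TestFoo=...
        }
        if (value == null) {
            value = props.getProperty("tests.seed");   // global setting, may be null
        }
        return value;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("tests.seed", "AAAA");
        props.setProperty("seed.TestFoo", "BBBB");
        props.setProperty("seed.TestFoo.testBar", "CCCC");
        System.out.println(resolveSeed(props, "TestFoo", "testBar")); // prints "CCCC"
        System.out.println(resolveSeed(props, "TestFoo", "testBaz")); // prints "BBBB"
        System.out.println(resolveSeed(props, "TestQux", "testBar")); // prints "AAAA"
    }
}
```

In a real run the `Properties` argument would be `System.getProperties()`; passing it in explicitly keeps the lookup itself free of global state.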

I should publish what I have tonight on github (I'm moving certain things out of our proprietary codebase and there are JUnit corner cases that slow things down).

asfimport commented 13 years ago

Shai Erera (@shaie) (migrated from JIRA)

Ok I get the point now.

But I still think we should have specific unit tests that reproduce specific scenarios, rather than using some monstrous tests that happened to stumble on a seed that revealed a bug. If, however, the scenario cannot be reproduced deterministically, then I agree that this framework is powerful and useful.

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Sure, absolutely. In our (mostly algorithmic, mind you) experience even small test cases can be randomized and then it is really duplicated effort to re-write them for a particular bug scenario (the tests are often simple, the data changes). But sure: the simpler the test, the better.

asfimport commented 13 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I agree too. One difficulty with using @Seed or something like it is that our seeds quickly become out of date, because we are often adding more randomization to our testing framework (e.g. additional craziness to randomindexwriter, searchers, analyzer, whatever).
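The staleness problem is easy to reproduce with `java.util.Random` alone. In this sketch the two "framework versions" and the names `docCountV1`, `docCountV2` and `useCompoundFile` are made up for illustration; the only claim is that one extra random draw shifts every draw after it.

```java
import java.util.Random;

/** Hypothetical demo: adding one randomized decision changes what a fixed seed produces. */
final class StaleSeedDemo {
    /** Version 1 of a test helper: the first draw decides the document count. */
    static int docCountV1(long seed) {
        Random rnd = new Random(seed);
        return rnd.nextInt(100);
    }

    /** Version 2 adds one randomized decision before the same draw. */
    static int docCountV2(long seed) {
        Random rnd = new Random(seed);
        boolean useCompoundFile = rnd.nextBoolean(); // new randomization consumes a draw
        return rnd.nextInt(100);                     // now reads a later point in the stream
    }

    public static void main(String[] args) {
        long seed = 23095324L;
        // Same seed, but the helper versions read different points of the random stream.
        System.out.println("v1 docCount=" + docCountV1(seed));
        System.out.println("v2 docCount=" + docCountV2(seed));
    }
}
```

A seed recorded against version 1 therefore no longer describes the same test data once version 2 is in place, even though each version on its own stays fully repeatable.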

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

That's why I mentioned I would like this to become generally useful, not only restricted to Lucene/Solr :) If we make it work for two projects (Carrot2 and Lucene) chances are the outcome will be flexible enough to use elsewhere.

I'm not saying you must fix the seeds using annotations – it's an option.

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Ok. I've published the project on github here: https://github.com/dweiss/randomizedtesting

The repo contains the runner, some tests and examples. Lots of TODOs (in TODO), so consider this a work-in-progress, but if anybody cares to take a look and shout if something is definitely not right – go ahead.

mvn verify on the topmost project compiles everything and runs the tests/examples. I don't see any functional deviations or differences in execution between Ant, Maven and my Eclipse GUI (mentioned by Robert), which is good.

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

A word of warning: this will be a longer comment. I still hope somebody will read it ;)

I've written a somewhat largish chunk of code that provides an infrastructure to run "randomized", but "repeatable" tests. I'd like to report on my impressions so far.

Robert was right that a custom runner provides more flexibility than a @Rule on top of the default JUnit runner (which changes depending on where you run it: Ant, Maven, Eclipse, etc.). I've spent a lot of time inspecting the current implementation inside JUnit and I came to the conclusion that it really is best to have a full reimplementation of the Runner interface. "Full" meaning not extending ParentRunner, but implementing the whole runner from scratch. This provides additional, uhm, unexpected benefits in that one can add new functionality that "regular" JUnit runners don't have and still be compatible with hosting environments such as Ant, Maven or Eclipse (because they, thank God, respect @RunWith).

Among the things I have implemented so far that are missing or different in JUnit are:

In short: I'm really happy with a custom Runner.

As for the infrastructure for writing randomized test cases:

Now… if you're still with me you're probably interested how this applies to Lucene. The wall I've hit is the sheer amount of code that any change to LTC affects. I realized it'd be large, but it's just gargantuan :)

The major issue is with static initializers and static public methods called from them that leave resources behind. I'm sorry, but nobody can convince me this isn't evil. I understand certain things are costly and require a one-time setup, but these should really be moved to @BeforeClass fixture hooks. If one really needs to do things once at JVM lifespan level a @BeforeClass with some logic to perform a single initialization can be a replacement for a static initializer (even if it's unclear to me when exactly such a fixture would be really needed). In short: the problem with static initializers is that they are executed outside the lifecycle control of the runner… I'd say most of the problems and current patchy solutions inside LTC (dealing with resource tracking for example) are somehow related to the fact that static initializers and static method calls are used throughout the codebase.
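As a rough illustration of the replacement suggested above, the guard below performs a costly setup at most once per JVM while still being triggered from inside the runner's lifecycle. It is a sketch with hypothetical names (`OneTimeSetup`, `ensureInitialized`, `setupRuns`); in a real test class the call would sit in a @BeforeClass method, which is left out here so that the example needs no JUnit dependency.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical once-per-JVM setup that is invoked from a class-level fixture hook. */
final class OneTimeSetup {
    private static boolean initialized = false;

    /** Counts how often the costly part actually ran (for demonstration only). */
    static final AtomicInteger setupRuns = new AtomicInteger();

    /**
     * Intended to be called from the @BeforeClass hook of every test class that needs
     * the shared resource. Synchronized, so concurrent callers wait until setup is done.
     */
    static synchronized void ensureInitialized() {
        if (!initialized) {
            setupRuns.incrementAndGet(); // stands in for the expensive initialization
            initialized = true;
        }
    }

    public static void main(String[] args) {
        ensureInitialized(); // as if called by the first test class
        ensureInitialized(); // as if called by a second test class
        System.out.println(setupRuns.get()); // prints "1"
    }
}
```

Unlike a static initializer, the setup now runs at a point the runner knows about, so resources it creates can be attributed to a fixture and tracked.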

I am currently wondering if it's feasible to provide a single patch that will make a drop-in replacement of LTC. It may be the case that adding another skeleton class based on the "new" infrastructure and rewriting tests one by one to use it may be a more sensitive/ sensible way to go.

The runner (alone) is currently at github if you care to take a look. I think Barcelona may be a good place to talk about this face to face and decide what to do with it. I'm leaning towards having parallel base classes and porting existing tests in chunks.

asfimport commented 13 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I am currently wondering if it's feasible to provide a single patch that will make a drop-in replacement of LTC. It may be the case that adding another skeleton class based on the "new" infrastructure and rewriting tests one by one to use it may be a more sensitive/ sensible way to go.

Could we add this stuff, cut over our runner to it (our runner doesn't actually do that much?) and then migrate our base class functionality piece by piece to the runner code (nuking it from LuceneTestCase)?

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

Yeah, I was thinking about that too and I've actually started, replaced the runner successfully but then there is no simple "piece by piece" with all the static method calls tangled together... And RandomizedRunner requires a different seed format etc... I'll give it another shot though.

asfimport commented 13 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

In response to my question whether the idea of randomized testing is new, Yuriy Pasichnyk passed me the info about Haskell's QuickCheck project. Indeed, the idea is pretty much the same (with differences concerning implementation details, not the concept itself).

http://en.wikipedia.org/wiki/QuickCheck

There is a Java port of this too, if you check out Wikipedia. The implementation follows a different direction compared to what I implemented, but there are also pieces that are nearly 1:1 identical copies. Good to know – this means I wasn't completely wrong in my goals.

asfimport commented 12 years ago

Dawid Weiss (@dweiss) (migrated from JIRA)

I consider this issue done. I've pulled all the super-cool features from Lucene/Solr and put them into a separate, stand-alone and reusable project called randomizedtesting. We have switched our infrastructure at Carrot2 to use it and I'm very happy with it.

http://labs.carrotsearch.com/randomizedtesting.html

I will file another issue concerned with progressively moving from LuceneTestRunner/Case and tests infrastructure to RandomizedRunner (and siblings).