apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0
2.6k stars 1.02k forks source link

Allow Junit4 tests in our environment. [LUCENE-2037] #3112

Closed asfimport closed 14 years ago

asfimport commented 14 years ago

Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should have to be rewritten. We should start this for the 3.1 release so we can get a clean 3.0 out smoothly.

It's probably worthwhile to convert a small set of tests as an exemplar.


Migrated from LUCENE-2037 by Erick Erickson (@ErickErickson), resolved Dec 05 2009 Environment:

Development

Attachments: junit-4.7.jar, LUCENE-2037_remove_testwatchman.patch, LUCENE-2037_revised_2.patch, LUCENE-2037.patch (versions: 3), LUCENE-2037-getName.patch, LUCENE-2037-LegacyChecker.patch

asfimport commented 14 years ago

Chris M. Hostetter (@hossman) (migrated from JIRA)

It's all yours Erick.

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

LuceneTestCaseJ4 should replace LuceneTestCase. There's a bit of overkill here to emulate the override of runBare in LuceneTestCase, but I thought it was worth it to work out the mechanisms.

We'll need to put the lucene 4.7 jar in the right place.

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

One thing that would also be good: We have LocalizedTestCase, which has the possibility to run each test for all available Locales (it overrides currently runBare() and iterates while setting Locale.setDefault()). As this test should only be ran for specific methods, how about adding a annotation in addition to @Test (with Retention("method") like @TestLocalized.

What to do with BaseTokenStreamTestCase? In 2.9 it had also overridden runBare(), but not anymore (because we only have the new TS API anymore), but this is also a typical example when we want to rerun tests multiple times. One on our plan is that this test now runs all analyzer test for different default versions (iterate over Version enum constants). We need then something like `@TestAllVersions` or something like that. If we jump to JUnit4, we should use the new features for a more elegant solution of these multiple-run tests.

One note: It would be good to not reformat the whole tests with an Eclipse cleanup, just change the lines you modified, not reformat everything or organize imports and so on. Its hard to find out what has changed.

asfimport commented 14 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Is there some way to use Junit4 parameterized tests to do this LocalizedTestCase-type thing, so we don't have to override runBare()?

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Well, it all depend on how you feel about 10 seconds as far as LocalizedTestCase is concerned.

JUnit4 is really not built to run some tests in a class with the @Parameterized notation and some not, it runs all the tests in the class with all the parameters. In the case of TestQueryParser, which is the only test class I saw that made use of the "include some tests but not others' in LocalizedTestCase, I hacked in running all the tests with all the locales available (152 in my case). Which pushes the number of tests in that one class up over 4,000 FWIW.

Running that test case went from around 5 seconds to around 15 seconds on my 2 year old Macbook Pro, from inside IntelliJ.

I don't think it's worth trying to refactor that class into two classes, one that has all the tests run with all the locales and one that has the rest of the tests run only with the default locale (which is how I read the code in LocalizedTestcase) for 10 seconds worth of time savings. One could emulate the old process of excluding some tests by returning immediately from those tests that weren't intended to be run with all locales if the current locale wasn't the default, but I don't see that as worth the effort, although I could be convinced otherwise if people feel strongly.

I'll provide a patch for this if there are no objections later this week, perhaps I'll get a chance to look at BaseTokenStreamTestCase before then.

This will make LocalizedTestCase obsolete and I'll remove it in the patch.

asfimport commented 14 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I don't think it's worth trying to refactor that class into two classes, one that has all the tests run with all the locales and one that has the rest of the tests run only with the default locale (which is how I read the code in LocalizedTestcase) for 10 seconds worth of time savings. One could emulate the old process of excluding some tests by returning immediately from those tests that weren't intended to be run with all locales if the current locale wasn't the default, but I don't see that as worth the effort, although I could be convinced otherwise if people feel strongly.

Hi, the current behavior is a little silly. I thought it would be user-friendly to run the default locale first, then all the other locales (including a duplicate run of the default locale). This way if you broke a test, you would see failures in your native language/format rather than some very strange language.

On the other hand, by using LocalizedTestCase, you are saying 'hey i am saying this thing has locale-sensitive behavior and it better work', so in my opinion, getting rid of this 'first run tests under default locale' doesn't harm a thing. instead, just run under all locales, or maybe sort them so the default one is first, or something else?

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

I was thinking more about TestQueryParser. One of the features of the current setup is that you specify which tests in a class you want to have run under all locales. Tests not in that list are run only under the default locale.Always assuming I'm reading things right...

I don't see a clean way to emulate that part of the behavior without either refactoring or introducing a test in the tests we don't want to run under all locales and aborting early.

But I think we're finding different ways to agree here. I'm interpreting your comments that running all the tests in the class is OK at least for now...

But I did notice last night that a number of tests in contrib reference LocalizedTestCase (I have two separate projects, core and contrib so it wasn't obvious until I ran the ant task). I'll look into those tonight or tomorrow night.

Erick

asfimport commented 14 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I was thinking more about TestQueryParser. One of the features of the current setup is that you specify which tests in a class you want to have run under all locales. Tests not in that list are run only under the default locale.Always assuming I'm reading things right..

Erick, oh I am sorry, I completely misread your comment!

This feature that allows only a subset of tests to be run "localized" where the others will be run "normally" is just a performance optimization. all tests should (have not checked since 2.9) pass under all locales.

But I did notice last night that a number of tests in contrib reference LocalizedTestCase (I have two separate projects, core and contrib so it wasn't obvious until I ran the ant task). I'll look into those tonight or tomorrow night.

Yes, I do feel we should keep LocalizedTestCase. It is handy, we might use it in more places to prevent test failures in other locales for new code. But this "allowing only a subset of tests to be test in a localized way" is only a performance optimization. If there is some test class that is really slow because all the tests are run under every locale, perhaps the locale-sensitive tests should be refactored into a separate test class that extends LocalizedTestCase. The ones that are not locale-sensitive could be in a test class that extends LuceneTestCase

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

I already made a propsal for LocalizedTestCase: Just mark all test to run under all Locales with a Annotation like @AllLocales.

asfimport commented 14 years ago

Robert Muir (@rmuir) (migrated from JIRA)

I already made a propsal for LocalizedTestCase: Just mark all test to run under all Locales with a Annotation like @AllLocales.

looks like Parameterized supports limiting params on a per-method basis via annotation?: http://junit.org/apidocs/org/junit/runners/Parameterized.Parameters.html

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

I think you're mis-reading this. This is the annotation for the static method that return a list of parameters, not for a method that is an actual test.....

The thing that causes the framework to gather the list and run test for each element on the list is the @RunWith annotation on the class AFAIK.

Or I'm misreading it....

Erick

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Frankly, I don't see how that would work without getting into the guts of the @RunWith(value = Parameterized.class) Junit4 annotation. As I understand it, that annotation on the class causes the framework to make a call to the static method that provides a list of parameters (annotated with @Parameters). The framework then takes the returned list and, *for each element in the list* calls a constructor with that element and runs all the tests in the class.

So annotating a test with @AllLocales would somehow have to get in there and change what the framework does. No doubt it's do-able, but until I see more than 10 seconds difference in running the tests I'm not sure it's worth it. Nor would I advocate altering the behavior of the framework for back-compat, I'd far rather refactor the tests into those that run for all locales and those that don't.

I suppose one could to the inverse, that is create an annotation @DefaultLocaleOnly that aborts early if the locale isn't the default, but again I think the first approach I'd advocate would be to work within the framework until it was too painful....

FWIW Erick

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Yes, I do feel we should keep LocalizedTestCase. It is handy, we might use it in more places to prevent test failures in other locales for new code.

Light went off when walking around. I think I can just change the LocalizedTestCase class and put the @RunWith() and @Parameters there. Which makes waaaaaay more sense than what I was doing which was putting those in every subclass of the current LocalizedTestCase. Doh! I'll take a peek tonight. Although last night I was thinking "Gee, this is repetitive"....

There are only two classes in core that use LocalizedTestCase, but there are several in contrib too. They'll all require the @Test annotation if I munge LocalizedTesCase, but that should be the only change necessary then, assuming we're content to run all the locales past all the test cases in all derived classes.....

Hmmmm, why was subclassing invented again? Something about putting common behavior in one place or some nonsense like that <G>.

Erick

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

We have a second place (I did not implement this for 3.0, because no time), we need to sometimes test something for all Version enum constants. So @Parameters would return EnumSet.allOf(Version.class) and the ctor of this TestCase would get the version, and store it per instance. Tests would then run for each possible version to test compatibility.

Problem, it interferes with LoalizedTestCase, because e.g. QueryParser needs a test in all versions (n), but also in all Locales (m). Produces n x m test runs arrrgh. Any idea? On the other hand, TestHighlighter only needs the version iteration to test fully.

asfimport commented 14 years ago

Robert Muir (@rmuir) (migrated from JIRA)

Uwe, can you elaborate more on how the Version testing would work?

do you mean the case where you have unchanged behavior across all versions, and you want to run the test against all of them, but then some tests only apply to specific versions?

Problem, it interferes with LoalizedTestCase, because e.g. QueryParser needs a test in all versions , but also in all Locales (m). Produces n x m test runs arrrgh. Any idea?

at least the old tokenstream api is done, imagine if you wanted to test version, locales, and tokenstream api :)

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Well, last night I changed LocalizedTestCase to do the @RunWith and @Parameterized thing and it works just fine with a minimal change to subclasses, mainly adding @Test and a c'tor with a Locale parameter. Total, it adds probably a minute to the test run.

About the cross product of versions and locales. The @Parameterized thingy returns a list of Object[], where the elements of the list are matched against a c'tor. So if each object[] in your list has, say, an (int, float, int), then as long as you have a matching c'tor with a signature that takes an (int, float, int) you're good to go. So to handle the mXn case you mentioned, if your @Parameters method returned a list of object[], one object[] for each <Locale, Version> pair, you'd get all your Locales run against all your versions.

Whether we want this to happen or not is another question. It's a worthwhile question whether we really need to run all the possible locales or if there's a subset of locales that would serve.

It's kind of ironic that I have a patch waiting to be applied that cuts down on the time it takes to run the unit tests and another patch that adds to the time it takes. Two steps forward, one step back and a jink sideways just for fun.....

Best Erick

asfimport commented 14 years ago

Robert Muir (@rmuir) (migrated from JIRA)

It's a worthwhile question whether we really need to run all the possible locales or if there's a subset of locales that would serve.

I won't rant too much on this, except to say that before this localizedtestcase, various parts failed under say, only Korean, or only Thai locale, it was always a corner case.

I think its important that someone from say Korea, can download lucene source code, and run 'ant test'. how else are they supposed to contribute if this does not work?

asfimport commented 14 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Can we, for this issue, focus on not altering what's tested, ie, simply cutover to juni4? And then open new issues for, eg, parameterizing tests across different Versions? If we do that, then Erick if you post your latest patch (that cuts over LocalizedTestCase -> new junit4 capabilities), then is that committable?

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Hold off on this patch until I get a chance to submit a new one, we're straightening out #2919 interdependencies between patches.....

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

See JIRA comments

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Darn it! I'll get the comments right sometime and not have to retype them after making an attachment....

Anyway, this patch allows us to use Junit4 constructs as well as Junit3 constructs. It includes a sibling class to LuceneTestCase called LuceneTestCaseJ4 that provides the functionality we used to get from LuceneTestCase.

When creating Junit4-style tests, preferentially import from org.junit rather than from junit.framework.

Junit-3.8.2.jar may (should?) be removed from the distro, all tests run just fine under Junit-4.7,jar, which is attached to this issue. I wrote a little script that compares the results of running the tests and we run exactly the same number of TestSuites and each runs exactly the same number of tests, so I'm pretty confident about this one. I may be wrong, but I'm not uncertain. Single data-points aren't worth much, but on my Macbook Pro, running under Junit4 took about a minute longer than Junit3 (about 23 1/2 minutes). Which could have been the result of my Time Machine running for all I know....

All the tests in test...search.function have been converted to use LuceneTestCaseJ4 as an exemplar. I've deprecated LuceneTestCase to prompt people. When you derive from LuceneTestCaseJ4, you must use the @Before, @After and @Test annotations to get the functionality you expect, as must all subclasses. So one gotcha people will surely run across is deriving form J4 and failing to put @Test....

Converting all the tests was my way of working through the derivation issues. I don't particularly see the value in doing a massive conversion just for the heck of it. Unless someone has a real urge. More along the lines of "I'm in this test anyway, lets upgrade it and add new ones".

What about new tests? Should we encourage new patches to use Junit4 rather than Junit3? If so, how?

I've noticed the convention of putting underscores in front of some tests to keep them from running. The Junit4 convention is the @Ignore annotation, which will cause the @Ignored tests to be reported (something like 1300 successful, 0 failures, 23 ignored), which is a nice way to keep these from getting lost in the shuffle.

When this gets applied, I can put up the patch for LocalizedTestCase and we can give that a whirl....

asfimport commented 14 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Anyone have any concerns upgrading to Junit4? I plan to commit in a few days...

asfimport commented 14 years ago

Karthik K (migrated from JIRA)

+1 w.r.t JUnit 4 .

Unrelated to this - but there is another patch - #3141 to port the existing tests to Java 5 generics . May be - somebody can have a look at it before it becomes out of sync with the trunk altogether.

asfimport commented 14 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

but there is another patch - #3141 to port the existing tests to Java 5 generics

Ahh thanks for the reminder – I can take this one as well, but, there will be conflicts b/w the two patches, I think. Should we do the generics first (simpler change, but touches many files), and then the junit4 upgrade?

asfimport commented 14 years ago

Karthik K (migrated from JIRA)

Ahh thanks for the reminder - I can take this one as well, but, there will be conflicts b/w the two patches, I think. Should we do the generics first (simpler change, but touches many files), and then the junit4 upgrade?

Personally I favor the generics patch can go in first since as mentioned - it is a large patchset but very simple in nature. I would be happy to rewire this patch ( 2037) once that goes into the tree, just in case.

asfimport commented 14 years ago

Karthik K (migrated from JIRA)

Revised patch in sync with trunk in place.

TODO:

patches have been removed due to conflicts.

Should be able to revisit them separately again soon.

asfimport commented 14 years ago

Karthik K (migrated from JIRA)

Submitting revised patch to fix test failure due to the necessity of no-arg ctor in TestCustomQuery .

asfimport commented 14 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Thanks Kay Kay. Erick, once you've had a chance to review/iterate, I plan to commit... then we can make use of junit4 features in our tests.

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Had enough time this morning to reconcile this with Kay Kay's changes,

All tests pass.

Junit 3.X no longer necessary, running with Junit 4.7 jar runs junit 3 style tests as well as annotated Junit4 style tests.

It's preferable (but not necessary) to import from org.junit rather than junit.framework.

asfimport commented 14 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

OK patch looks good, thanks Erick & Kay Kay! I'll commit shortly.

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Removed unnecessary derivation from TestWatchman.

Corrected minor typo in comment.

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

I just committed this.

One quetsion: In JUnit3, the call to getName() always created a correct testName (because JUnit3 took care about the current test running). If I inject one bug into a random test using newRandom() that is not using the ctor with name param, the additional error message about the random seed simply prints "" as test name. In the past this worked. Ideally it should print ot the current @Test method name as before.

How to do this? I would like to have this and remove the getName() stuff from the class.

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

Hi Erik,

I have [after I understood the code :-) ] rewritten the TestWatchman code to now only work inside LuceneTestCaseJ4, no reflection use and added a startup(), that simply sets the correct test name in the enclosing class. I see no usage in making the InterceptTestCase public, as it is only used inside LuceneTestCaseJ4. If a subclass or localized one needs this watchman, too, you can siply create a second one. I also made the field final.

Now the tests for random failures behave like before. I would like to commit this soon. This change also affects the uncaught exception handler.

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

Committed revision: 916685

asfimport commented 14 years ago

Erick Erickson (@ErickErickson) (migrated from JIRA)

Uwe:

You were asking about getName in LuceneTestCaseJ4. It appears that you've taken care of this, is there still anything to do? There's no longer a c'tor that takes the test name.

But I did some poking around and came up with the following from "someplace on the web".

The only two place I could find that used getName were TestFieldScoreQuery and TestOrdValues. This bit of code works if you put it in these classes.

private String testName() { return getClass().getName()"." name.getMethodName(); // was getName() from LuceneTestCaseJ4... }

@Rule public final TestName name = new TestName();

See: http://kentbeck.github.com/junit/javadoc/4.7/org/junit/rules/TestName.html Note that this site is better than anything I could find at junit.org....

Once I found that, I thought "gee, if I put that in the base class, it would be available to everyone". Which is exactly what you made LuceneTestCaseJ4.getName() do <G>. But at least I found Kent Beck's version of the docs, which is a plus...

So I guess there's nothing to do as far as getName is concerned.... If there is, let me know....

Erick

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

Erick,

thats already fixed in trunk with my last commit, as you noticed! It exactly does what also rules.TestName does :) - I found this class later too and realized that it does the same – only that lucene has the method in the base class for better migration experience :-).

Yesterday I also wrote an extra test assertion, that verifies, that the ported test class has all methods starting with "test" annotated with @Test. Robert and me wants to maybe apply this patch during the migration phase when developers are not yet using Junit4 so long and forget to add @Test. The path is very rough and maybe optimized (if @BeforeClass could be used, but cannot as static).

The string-ctors are not used in lucene, as the testName in Lucene should be automatically from the current method. The additional ctors in Lucene's tests were only very very very old junit3 relicts (later versions of junit3 also do not need it anymore, they set the test name after instantiating).

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

Here the patch for the additional assertion to test if all ported test classes have all @Test added to all methods starting with "test".

asfimport commented 14 years ago

Uwe Schindler (@uschindler) (migrated from JIRA)

Committed the assertion patch revision: 922887