Open vladak opened 5 years ago
like in https://travis-ci.org/oracle/opengrok/jobs/499311368 but I have also seen the failures in AppVeyor builds.
From my experience it fails because of this race condition which is bugging us for a long time: https://github.com/oracle/opengrok/issues/2030
The job you provided is still running and I see no logs :/
Sorry, just restarted the build because this was a release build.
Dne st 27. 2. 2019 17:01 uživatel Adam Hornáček notifications@github.com napsal:
From my experience it fails because of this race condition which is bugging us for a long time: #2030 https://github.com/oracle/opengrok/issues/2030
The link you provided is still running and I see no logs :/
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/oracle/opengrok/issues/2704#issuecomment-467920134, or mute the thread https://github.com/notifications/unsubscribe-auth/ACzGDGslvbm4dxSosDInUKTR1Lt2TtaTks5vRqvCgaJpZM4bU2fm .
Getting more failure in SuggesterControllerTest
on Windows:
testGetPopularityDataDifferentField
: https://travis-ci.org/oracle/opengrok/jobs/637835530testGetPopularityDataAll
: https://travis-ci.org/oracle/opengrok/jobs/639408758The before
method of the test has timeout set to 15 seconds:
await().atMost(15, TimeUnit.SECONDS).until(() ->
getSuggesterProjectDataSize() == env.getProjectList().size());
which makes me wonder whether this is enough and how/if Awaitility provides any feedback/log if the timeout expired and the condition is not met.
Also, for both the failed cases listed above the setup of the test reported:
Jan 16, 2020 9:09:10 AM org.opengrok.suggest.SuggesterProjectData initSearchCountMap
SEVERE: Could not create ChronicleMap, most popular completion disabled, if you are using JDK9+ make sure to specify: --add-exports java.base/jdk.internal.ref=ALL-UNNAMED --add-exports java.base/jdk.internal.misc=ALL-UNNAMED --add-exports java.base/sun.nio.ch=ALL-UNNAMED
java.lang.IllegalArgumentException: Average key size must be a positive, finite number
at net.openhft.chronicle.map.ChronicleMapBuilder.checkAverageSize(ChronicleMapBuilder.java:314)
at net.openhft.chronicle.map.ChronicleMapBuilder.averageKeySize(ChronicleMapBuilder.java:531)
at org.opengrok.suggest.popular.impl.chronicle.ChronicleMapAdapter.<init>(ChronicleMapAdapter.java:52)
at org.opengrok.suggest.SuggesterProjectData.initSearchCountMap(SuggesterProjectData.java:300)
at org.opengrok.suggest.SuggesterProjectData.init(SuggesterProjectData.java:158)
at org.opengrok.suggest.Suggester.lambda$getInitRunnable$1(Suggester.java:185)
at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
For completeness, this was on Windows with OpenJDK 8.
w.r.t. suggester timeouts: both failed cases listed above had:
INFO: Suggester successfully initialized (took 4 seconds)
so it seems the 15 seconds is enough.
The question is whether the failure to initialize Chronicle map can lead to the observed test case failures, i.e. if there is a implicit dependency. Given that the failed test cases do test popularity, this seems plausible.
added a catch for IllegalArgumentException
in initSearchCountMap()
and for the failed case it reported the value of conf.getAverageKeySize()
like so:
SEVERE: Could not create ChronicleMap due to invalid key size: NaN
Added a "debug" print to save()
and it verifies @ahornace's hypothesis that the data in the map.cfg
file already contains the NaN
:
Mar 06, 2020 4:02:37 PM org.opengrok.suggest.popular.impl.chronicle.ChronicleMapConfiguration save
INFO: Saving chronicle map configuration in directory C:\Users\travis\AppData\Local\Temp\data9188628379176297630\suggester\json for field 'refs': entries: 0 , average key size: NaN
Mar 06, 2020 4:02:37 PM org.opengrok.suggest.SuggesterProjectData initSearchCountMap
SEVERE: Could not create ChronicleMap from directory C:\Users\travis\AppData\Local\Temp\data9188628379176297630\suggester\json for field refs due to invalid key size: entries: 0 , average key size: NaN
java.lang.IllegalArgumentException: Average key size must be a positive, finite number
at net.openhft.chronicle.map.ChronicleMapBuilder.checkAverageSize(ChronicleMapBuilder.java:314)
at net.openhft.chronicle.map.ChronicleMapBuilder.averageKeySize(ChronicleMapBuilder.java:531)
at org.opengrok.suggest.popular.impl.chronicle.ChronicleMapAdapter.<init>(ChronicleMapAdapter.java:52)
at org.opengrok.suggest.SuggesterProjectData.initSearchCountMap(SuggesterProjectData.java:300)
at org.opengrok.suggest.SuggesterProjectData.init(SuggesterProjectData.java:158)
at org.opengrok.suggest.Suggester.lambda$getInitRunnable$1(Suggester.java:185)
at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(ForkJoinTask.java:1386)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Both numbers on the right side of the expression here https://github.com/oracle/opengrok/blob/1a0db5e6d30cea2cbce9fec33d2acfd09e64a492/suggester/src/main/java/org/opengrok/suggest/SuggesterProjectData.java#L264 are 0 for the failed case, thus giving NaN
for the double
value (that will be represented as Double
in the averageLengths
map).
The workaround could be not to put the entry to the averageLengths
map if the numeric value is NaN
. The getAverageLength()
used in initSearchCountMap()
will return default (non zero) value in such case.
Interestingly the setup/tests seems to fail only when failing to build average lengths map for field refs
for the testdata/sources/json
directory.
Actually, the ChronicleMapAdapter()
init actually fails on Linux builds too, it just does not lead to test failure. Either the init failure is really just a red herring or is interpreted differently on Windows builds. Anyhow, this should be fixed as suggester above - if there is no index data for particular field it should be ignored w.r.t. averge key lengths.
The popularity tests fail because the term count cannot be updated using SuggesterServiceImpl.getInstance().increaseSearchCount()
due to rebuild being in progress.
It seems that the trouble is that SuggesterControllerTest#before()
waits merely for suggester to be initialized (Suggester#projectData
to be filled) while lots of the test cases are sensitive to suggester rebuild (rather than init).
In one of the failed cases the log goes like this:
Mar 23, 2020 4:11:25 PM org.opengrok.suggest.Suggester shutdownAndAwaitTermination
INFO: Suggester successfully initialized (took 5 seconds)
Mar 23, 2020 4:11:25 PM org.opengrok.web.api.v1.suggester.provider.service.impl.SuggesterServiceImpl scheduleRebuild
INFO: Suggester rebuild not scheduled
Mar 23, 2020 4:11:26 PM org.opengrok.suggest.Suggester rebuild
INFO: Rebuilding the following suggesters: [bug16465, archives, data, scala, clojure, document, pascal, bug19195, issue10, sql, rust, csharp, java, fortran, json, perl, tcl, swift, haskell, python, c, kotlin, lisp, rfe2575, bug16070, javascript, script, bug3658, php, scopes, bug14924, bug3430]
Mar 23, 2020 4:11:26 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [localhost:60342]
Mar 23, 2020 4:11:26 PM org.glassfish.jersey.test.grizzly.GrizzlyTestContainerFactory$GrizzlyTestContainer <init>
INFO: Creating GrizzlyTestContainer configured at the base URI http://localhost:62271/
Mar 23, 2020 4:11:26 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [localhost:62271]
Mar 23, 2020 4:11:26 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer-1] Started.
Mar 23, 2020 4:11:26 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [localhost:62271]
Mar 23, 2020 4:11:27 PM org.glassfish.jersey.test.grizzly.GrizzlyTestContainerFactory$GrizzlyTestContainer <init>
INFO: Creating GrizzlyTestContainer configured at the base URI http://localhost:57989/
Mar 23, 2020 4:11:27 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [localhost:57989]
Mar 23, 2020 4:11:27 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer-2] Started.
Mar 23, 2020 4:11:27 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [localhost:57989]
Mar 23, 2020 4:11:27 PM org.glassfish.jersey.test.grizzly.GrizzlyTestContainerFactory$GrizzlyTestContainer <init>
INFO: Creating GrizzlyTestContainer configured at the base URI http://localhost:56937/
Mar 23, 2020 4:11:28 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [localhost:56937]
Mar 23, 2020 4:11:28 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer-3] Started.
Mar 23, 2020 4:11:28 PM org.glassfish.grizzly.http.server.NetworkListener shutdownNow
INFO: Stopped listener bound to [localhost:56937]
Mar 23, 2020 4:11:28 PM org.glassfish.jersey.test.grizzly.GrizzlyTestContainerFactory$GrizzlyTestContainer <init>
INFO: Creating GrizzlyTestContainer configured at the base URI http://localhost:54678/
Mar 23, 2020 4:11:28 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [localhost:54678]
Mar 23, 2020 4:11:28 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer-4] Started.
[ERROR] Tests run: 37, Failures: 1, Errors: 0, Skipped: 1, Time elapsed: 77.04 s <<< FAILURE! - in org.opengrok.web.api.v1.controller.SuggesterControllerTest
[ERROR] testGetSuggestionsSimplePath Time elapsed: 0.297 s <<< FAILURE!
java.lang.AssertionError:
Expected: iterable containing ["header"]
but: No item matched: "header"
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.opengrok.web.api.v1.controller.SuggesterControllerTest.testGetSuggestionsSimplePath(SuggesterControllerTest.java:220)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.ja
...
... lots of messages about Grizzly HTTP server start and shutdown
Mar 23, 2020 4:11:35 PM org.opengrok.suggest.Suggester shutdownAndAwaitTermination
INFO: Suggesters for [bug16465, archives, data, scala, clojure, document, pascal, bug19195, issue10, sql, rust, csharp, java, fortran, json, perl, tcl, swift, haskell, python, c, kotlin, lisp, rfe2575, bug16070, javascript, script, bug3658, php, scopes, bug14924, bug3430] were successfully rebuilt (took 9 seconds)
We can see this sequence of events:
When TimeUnit.SECONDS.sleep(15);
was added to @BeforeClass
the failures can no longer be seen.
I found out that SuggesterControllerTest
enforces the order of test case execution by using @FixMethodOrder(MethodSorters.NAME_ASCENDING)
. There are 2 test cases that trigger scheduler rebuild - ZtestRebuild
(rebuilds all projects, takes bunch of seconds), ZtestRebuildProject
(rebuilds just the c
project, nearly instantaneous).
When the SuggesterControllerTest
is run in Travis or in IDEA locally on my Linux workstation (with en_US.UTF-8
locale) the Z
tests actually get run first. I commented out the Z
tests and the tests still pass. So, not sure why this ordering is in place, @ahornace ?
Anyhow, either ordering is not good - if the Z
tests get run first they might interfere with the tests in SuggesterControllerTest
(as demonstrated in this issue), if they get to run last they might interfere with tests in another test class since the suggester rebuild thread is asynchronous.