eXist-db / exist-xqts-runner

W3C XQTS driver for eXist-db
GNU Lesser General Public License v3.0

Confirm results are reproducible #19

Open joewiz opened 3 years ago

joewiz commented 3 years ago

On yesterday's Community Call we discussed evidence indicating users obtained different results despite using identical versions of the exist-xqts-runner and command line flags. I propose we use the latest directions derived from https://github.com/eXist-db/exist-xqts-runner/pull/17 and gather results from as many users as possible:

  1. git clone https://github.com/exist-db/exist-xqts-runner.git (ensure a fresh clone of the master branch, no local modifications)
  2. cd exist-xqts-runner
  3. sbt assembly
  4. target/scala-2.13/exist-xqts-runner-assembly-1.0.0.jar -x HEAD
  5. open target/junit/html/index.html and copy and paste the "Summary" table into your reply to this issue. GitHub is smart enough to transform your HTML into GFM, no fiddling needed. For example, here are my results:

| Tests | Failures | Errors | Skipped | Success rate | Time |
|---|---|---|---|---|---|
| 31557 | 5167 | 551 | 1260 | 81.88% | 439.974 |

adamretter commented 3 years ago

My laptop:

Results:

| Iteration # | Tests | Failures | Errors | Skipped | Success rate | Time |
|---|---|---|---|---|---|---|
| 1 | 31557 | 5157 | 551 | 1260 | 81.91% | 800.824 |
| 2 | 31557 | 5156 | 551 | 1260 | 81.92% | 1556.271 |
| 3 | 31557 | 5157 | 551 | 1260 | 81.91% | 791.146 |

EB Ubuntu Dev Env:

Results:

| Iteration # | Tests | Failures | Errors | Skipped | Success rate | Time |
|---|---|---|---|---|---|---|
| 1 | 31557 | 5171 | 551 | 1260 | 81.87% | 626.455 |
| 2 | 31557 | 5170 | 551 | 1260 | 81.87% | 810.823 |
| 3 | 31557 | 5172 | 551 | 1260 | 81.86% | 616.676 |

NOTE: The difference between the test runs on my two machines and that of @joewiz appears to be only in the number of Failures; all other numbers are consistent. I think the next things to test are:

  1. If we each run several times on the same machine, do we get the same results?
  2. We need to compare individual failures
    1. Where the test failures vary on the same machine, as the XQTS is highly concurrent I suspect a concurrency issue (most likely in eXist-db as opposed to XQTS).
    2. Where the test failures vary between different machines, this could be the same concurrency issue, or if the results are dramatically different then I would suspect a machine specific environment issue...
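
The comparison of individual failures in step 2 can be sketched as a set operation. This is only a minimal sketch in Python, assuming the names of the failing tests from each run have already been collected into sets; the function name `unstable_failures` and the sample test names are hypothetical and not part of exist-xqts-runner:

```python
from typing import Dict, Set


def unstable_failures(runs: Dict[str, Set[str]]) -> Set[str]:
    """Return tests that failed in at least one run but not in every run."""
    if not runs:
        return set()
    failed_somewhere = set().union(*runs.values())
    failed_everywhere = set.intersection(*runs.values())
    return failed_somewhere - failed_everywhere


# Hypothetical failing-test names for three runs on one machine.
runs = {
    "run1": {"test-a", "test-b", "test-c"},
    "run2": {"test-a", "test-b"},
    "run3": {"test-a", "test-b", "test-d"},
}
print(sorted(unstable_failures(runs)))
```

Tests that fail in every run drop out, so what remains is the short list of candidates worth inspecting for a concurrency or environment issue.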

duncdrum commented 3 years ago

Desktop

Results

| Iteration | Tests | Failures | Errors | Skipped | Success rate | Time |
|---|---|---|---|---|---|---|
| 1 | 31557 | 5159 | 551 | 1260 | 81.91% | 523.675 |
| 2 | 31557 | 5160 | 551 | 1260 | 81.90% | 661.144 |
| 3 | 31557 | 5159 | 551 | 1260 | 81.91% | 516.675 |

marmoure commented 3 years ago

Desktop. Windows: 10 Pro 19043.906, java: 1.8.0_281, sbt: 1.3.3 / 1.3.3

| Iteration | Tests | Failures | Errors | Skipped | Success rate | Time |
|---|---|---|---|---|---|---|
| 1 | 31557 | 5161 | 551 | 1260 | 81.91% | 699.980 |
| 2 | 31557 | 5159 | 551 | 1260 | 81.91% | 688.543 |
| 3 | 31557 | 5159 | 551 | 1260 | 81.91% | 715.330 |

joewiz commented 3 years ago

My iMac

Results

| Iteration | Tests | Failures | Errors | Skipped | Success rate | Time |
|---|---|---|---|---|---|---|
| 1 | 31557 | 5166 | 551 | 1260 | 81.88% | 454.876 |
| 2 | 31557 | 5168 | 551 | 1260 | 81.88% | 470.561 |
| 3 | 31557 | 5166 | 551 | 1260 | 81.88% | 495.145 |

This matches Adam's observation that only the number of Failures varies (here, between 5166 and 5168), while all other non-Time values remain constant.

The variation in Failures for all of the reported results in this issue is always 0, 1, or 2....
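
As a quick check of that claim, here is a small Python sketch that computes the spread of Failures per machine from the three-iteration tables posted above. The machine labels are shorthand introduced for this sketch; the numbers are copied from the tables in this issue:

```python
# Failure counts copied from the three-iteration tables posted in this issue.
failures = {
    "adamretter laptop": [5157, 5156, 5157],
    "adamretter EB Ubuntu": [5171, 5170, 5172],
    "duncdrum desktop": [5159, 5160, 5159],
    "marmoure desktop": [5161, 5159, 5159],
    "joewiz iMac": [5166, 5168, 5166],
}

# Spread = highest minus lowest failure count seen on one machine.
spread = {machine: max(counts) - min(counts) for machine, counts in failures.items()}

for machine, value in spread.items():
    print(f"{machine}: {value}")

# For contrast, the spread across all machines taken together.
all_counts = [count for counts in failures.values() for count in counts]
print("across machines:", max(all_counts) - min(all_counts))
```

The per-machine spread never exceeds 2, while the spread across machines is considerably larger, which is why the two cases are worth investigating separately.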

joewiz commented 3 years ago

In https://github.com/eXist-db/exist/pull/3966#issuecomment-890448844 I performed a similar batch of 3 runs of exist-xqts-runner. The results varied between those runs, similar to the +/- 0-3 differences we saw here.

In test 1 of the PR, there were 5,162 failures, but in test 2, a few minutes later, there were 3 fewer: 5,159 failures. The 3 differences all occurred within the tests for regular expressions in the fn:matches function:

re00062

Test:

   <test-case name="re00062">
      <description>Test regex syntax</description>
      <created by="Michael Kay" on="2011-07-04"/>
      <test>(every $s in tokenize('', ',') satisfies matches($s, '^(?:[^\p{IsBasicLatin}]*)$')) and (every $s in tokenize('a', ',') satisfies not(matches($s, '^(?:[^\p{IsBasicLatin}]*)$')))</test>
      <result>
         <assert-true/>
      </result>
   </test-case>

Failure in 1st test that passed in 2nd test:

Expected: 'AssertTrue', but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression "^(?:[^\p{IsBasicLatin}]*)$": invalid block name (BasicLatin) [at line 1, column 135])

Stacktrace:

junit.framework.AssertionFailedError: Expected: &apos;AssertTrue&apos;, but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&amp;O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression &quot;^(?:[^\p{IsBasicLatin}]*)$&quot;: invalid block name (BasicLatin) [at line 1, column 135])
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
    at scala.collection.immutable.List.foreach(List.scala:333)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
    at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
    at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
    at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
    at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
    at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
    at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
    at cats.effect.IO.unsafeRunAsync(IO.scala:274)
    at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
    at cats.effect.IO.unsafeRunTimed(IO.scala:342)
    at cats.effect.IO.unsafeRunSync(IO.scala:256)
    at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
    at akka.actor.Actor.aroundReceive(Actor.scala:537)
    at akka.actor.Actor.aroundReceive$(Actor.scala:535)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
    at akka.actor.ActorCell.invoke(ActorCell.scala:548)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
    at akka.dispatch.Mailbox.run(Mailbox.scala:231)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

re00225

Test:

   <test-case name="re00225">
      <description>Test regex syntax</description>
      <created by="Michael Kay" on="2011-07-04"/>
      <test>(every $s in tokenize('&#1536;&#1791;,&#1536;&#1537;&#1538;&#1539;&#1540;&#1541;&#1542;&#1543;&#1544;&#1545;&#1546;&#1547;&#1548;&#1549;&#1550;&#1551;&#1552;&#1553;&#1554;&#1555;&#1556;&#1557;&#1558;&#1559;&#1560;&#1561;&#1562;&#1563;&#1564;&#1565;&#1566;&#1567;&#1568;&#1569;&#1570;&#1571;&#1572;&#1573;&#1574;&#1575;&#1576;&#1577;&#1578;&#1579;&#1580;&#1581;&#1582;&#1583;&#1584;&#1585;&#1586;&#1587;&#1588;&#1589;&#1590;&#1591;&#1592;&#1593;&#1594;&#1595;&#1596;&#1597;&#1598;&#1599;&#1600;&#1601;&#1602;&#1603;&#1604;&#1605;&#1606;&#1607;&#1608;&#1609;&#1610;&#1611;&#1612;&#1613;&#1614;&#1615;&#1616;&#1617;&#1618;&#1619;&#1620;&#1621;&#1622;&#1623;&#1624;&#1625;&#1626;&#1627;&#1628;&#1629;&#1630;&#1631;&#1632;&#1633;&#1634;&#1635;&#1636;&#1637;&#1638;&#1639;&#1640;&#1641;&#1642;&#1643;&#1644;&#1645;&#1646;&#1647;&#1648;&#1649;&#1650;&#1651;&#1652;&#1653;&#1654;&#1655;&#1656;&#1657;&#1658;&#1659;&#1660;&#1661;&#1662;&#1663;&#1664;&#1665;&#1666;&#1667;&#1668;&#1669;&#1670;&#1671;&#1672;&#1673;&#1674;&#1675;&#1676;&#1677;&#1678;&#1679;&#1680;&#1681;&#1682;&#1683;&#1684;&#1685;&#1686;&#1687;&#1688;&#1689;&#1690;&#1691;&#1692;&#1693;&#1694;&#1695;&#1696;&#1697;&#1698;&#1699;&#1700;&#1701;&#1702;&#1703;&#1704;&#1705;&#1706;&#1707;&#1708;&#1709;&#1710;&#1711;&#1712;&#1713;&#1714;&#1715;&#1716;&#1717;&#1718;&#1719;&#1720;&#1721;&#1722;&#1723;&#1724;&#1725;&#1726;&#1727;&#1728;&#1729;&#1730;&#1731;&#1732;&#1733;&#1734;&#1735;&#1736;&#1737;&#1738;&#1739;&#1740;&#1741;&#1742;&#1743;&#1744;&#1745;&#1746;&#1747;&#1748;&#1749;&#1750;&#1751;&#1752;&#1753;&#1754;&#1755;&#1756;&#1757;&#1758;&#1759;&#1760;&#1761;&#1762;&#1763;&#1764;&#1765;&#1766;&#1767;&#1768;&#1769;&#1770;&#1771;&#1772;&#1773;&#1774;&#1775;&#1776;&#1777;&#1778;&#1779;&#1780;&#1781;&#1782;&#1783;&#1784;&#1785;&#1786;&#1787;&#1788;&#1789;&#1790;&#1791;', ',') satisfies matches($s, '^(?:\p{IsArabic}+)$')) and (every $s in tokenize('', ',') satisfies not(matches($s, '^(?:\p{IsArabic}+)$')))</test>
      <result>
         <assert-true/>
      </result>
   </test-case>

Failure in 1st test that passed in 2nd test:

Expected: 'AssertTrue', but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 16 in regular expression "^(?:\p{IsArabic}+)$": invalid block name (Arabic) [at line 1, column 301])

Stacktrace:

junit.framework.AssertionFailedError: Expected: &apos;AssertTrue&apos;, but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&amp;O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 16 in regular expression &quot;^(?:\p{IsArabic}+)$&quot;: invalid block name (Arabic) [at line 1, column 301])
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
    at scala.collection.immutable.List.foreach(List.scala:333)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
    at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
    at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
    at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
    at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
    at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
    at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
    at cats.effect.IO.unsafeRunAsync(IO.scala:274)
    at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
    at cats.effect.IO.unsafeRunTimed(IO.scala:342)
    at cats.effect.IO.unsafeRunSync(IO.scala:256)
    at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
    at akka.actor.Actor.aroundReceive(Actor.scala:537)
    at akka.actor.Actor.aroundReceive$(Actor.scala:535)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
    at akka.actor.ActorCell.invoke(ActorCell.scala:548)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
    at akka.dispatch.Mailbox.run(Mailbox.scala:231)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

re00061

Test:

   <test-case name="re00061">
      <description>Test regex syntax</description>
      <created by="Michael Kay" on="2011-07-04"/>
      <test>(every $s in tokenize('&#256;', ',') satisfies matches($s, '^(?:[^\p{IsBasicLatin}]+)$')) and (every $s in tokenize('', ',') satisfies not(matches($s, '^(?:[^\p{IsBasicLatin}]+)$')))</test>
      <result>
         <assert-true/>
      </result>
   </test-case>

Failure in 1st test that passed in 2nd test:

Expected: 'AssertTrue', but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression "^(?:[^\p{IsBasicLatin}]+)$": invalid block name (BasicLatin) [at line 1, column 43])

Stacktrace:

junit.framework.AssertionFailedError: Expected: &apos;AssertTrue&apos;, but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&amp;O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression &quot;^(?:[^\p{IsBasicLatin}]+)$&quot;: invalid block name (BasicLatin) [at line 1, column 43])
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
    at scala.collection.immutable.List.foreach(List.scala:333)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
    at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
    at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
    at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
    at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
    at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
    at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
    at cats.effect.IO.unsafeRunAsync(IO.scala:274)
    at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
    at cats.effect.IO.unsafeRunTimed(IO.scala:342)
    at cats.effect.IO.unsafeRunSync(IO.scala:256)
    at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
    at akka.actor.Actor.aroundReceive(Actor.scala:537)
    at akka.actor.Actor.aroundReceive$(Actor.scala:535)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
    at akka.actor.ActorCell.invoke(ActorCell.scala:548)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
    at akka.dispatch.Mailbox.run(Mailbox.scala:231)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

I can't speculate why two runs of exist-xqts-runner run a few minutes apart would produce invalid block name errors in one run but not the next. But these variations do not appear to be related to the PR I was investigating.

Tests 2 vs. 3 differed only by 1 test, and this one was in a different location:

group-015

Test:

   <test-case name="group-015">
      <description>No value comparisons are available to compare the grouping keys.</description>
      <created by="Josh Spiegel" on="2012-10-02"/>
      <modified by="Michael Kay" on="2017-03-17" change="avoid assert-xml for non-XML results"/>
      <test>
          for $x in (true(), "true", xs:QName("true"))
          group by $x
          return $x
      </test>
      <result>
        <assert-permutation>true(), "true", xs:QName("true")</assert-permutation>
      </result>
   </test-case>

The failure in test 3 that passed in test 2:

assert-permutation: expected='true(), "true", xs:QName("true")', actual='Q{}true, "true", true()'

Stacktrace:

junit.framework.AssertionFailedError: assert-permutation: expected=&apos;true(), &quot;true&quot;, xs:QName(&quot;true&quot;)&apos;, actual=&apos;Q{}true, &quot;true&quot;, true()&apos;
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
    at scala.collection.immutable.List.foreach(List.scala:333)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
    at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
    at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
    at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
    at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
    at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
    at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
    at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
    at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
    at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
    at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
    at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
    at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
    at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
    at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
    at cats.effect.IO.unsafeRunAsync(IO.scala:274)
    at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
    at cats.effect.IO.unsafeRunTimed(IO.scala:342)
    at cats.effect.IO.unsafeRunSync(IO.scala:256)
    at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
    at akka.actor.Actor.aroundReceive(Actor.scala:537)
    at akka.actor.Actor.aroundReceive$(Actor.scala:535)
    at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
    at akka.actor.ActorCell.invoke(ActorCell.scala:548)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
    at akka.dispatch.Mailbox.run(Mailbox.scala:231)
    at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

Comparing test 1 to test 3, the differences were exactly the same 4 tests described above.

This would explain the consistent range of variation of 0-3 in the results that we all reported:

Note that we didn't see a variation of 4, which would be the case where a single run failed both the 1 GroupByClause test and the 3 matches.re.xml tests. Perhaps we'd see this if we performed more test runs, or perhaps the two groups of failing tests don't occur in the same run, i.e., they're inter-related?

This is just a running theory. Perhaps there are other tests that fail besides these, and only additional runs and comparisons would reveal them.

To check which testsuites were responsible for the difference between 2 test runs, save the target/junit/data/TESTS-TestSuites.xml files from each run, and then run this query, setting $tss1 and $tss2 to the paths of the two files:

xquery version "3.1";

let $tss1 := doc("/db/apps/exist-xqts-results/data/5.4.0-SNAPSHOT-with-Juri-PR/test01/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
let $tss2 := doc("/db/apps/exist-xqts-results/data/5.4.0-SNAPSHOT-with-Juri-PR/test03/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
return
    array {
        for $ts1 in $tss1
        let $ts1-failures := $ts1/@failures
        let $ts2 := $tss2[@package eq $ts1/@package and @name eq $ts1/@name]
        let $ts2-failures := $ts2/@failures
        return
            if ($ts1-failures ne $ts2-failures) then
                map {
                    "package": $ts1/@package/string(),
                    "name": $ts1/@name/string(),
                    "ts1-failures": $ts1/@failures cast as xs:integer,
                    "ts2-failures": $ts2/@failures cast as xs:integer
                }
            else
                ()
    }

This returns a result like:

[
    {
        "package": "XQTS_HEAD.fn-matches",
        "name": "re",
        "ts1-failures": 7,
        "ts2-failures": 4
    },
    {
        "package": "XQTS_HEAD",
        "name": "prod-GroupByClause",
        "ts1-failures": 15,
        "ts2-failures": 16
    }
]
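
For anyone without an eXist instance to hand, the same testsuite-level comparison can be sketched outside the database. The following Python is only a sketch under assumptions: it relies on the `package`, `name` and `failures` attributes that the query above reads, the function names are made up for this example, and the two inline documents are tiny hand-written stand-ins for real TESTS-TestSuites.xml files (the prod-UnaryLookup count is hypothetical):

```python
import xml.etree.ElementTree as ET


def suite_failures(xml_text):
    """Map (package, name) of each testsuite to its failure count."""
    root = ET.fromstring(xml_text)  # root element is assumed to be <testsuites>
    return {
        (ts.get("package"), ts.get("name")): int(ts.get("failures", "0"))
        for ts in root.iter("testsuite")
    }


def diff_suites(xml_1, xml_2):
    """List the testsuites whose failure counts differ between two runs."""
    run_1, run_2 = suite_failures(xml_1), suite_failures(xml_2)
    return [
        {
            "package": package,
            "name": name,
            "ts1-failures": run_1[(package, name)],
            "ts2-failures": run_2[(package, name)],
        }
        for (package, name) in sorted(run_1.keys() & run_2.keys())
        if run_1[(package, name)] != run_2[(package, name)]
    ]


# Hand-written stand-ins for two TESTS-TestSuites.xml files.
RUN_1 = """<testsuites>
  <testsuite package="XQTS_HEAD.fn-matches" name="re" failures="7"/>
  <testsuite package="XQTS_HEAD" name="prod-GroupByClause" failures="15"/>
  <testsuite package="XQTS_HEAD" name="prod-UnaryLookup" failures="2"/>
</testsuites>"""
RUN_2 = RUN_1.replace('failures="7"', 'failures="4"').replace(
    'failures="15"', 'failures="16"'
)

for row in diff_suites(RUN_1, RUN_2):
    print(row)
```

To use it on real results, read each file's text (or swap `ET.fromstring` for `ET.parse(path).getroot()`) and pass both runs to `diff_suites`. Unlike the query, the rows come back sorted by package and name rather than in document order.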

To derive a table like the one I posted in the PR comment linked above, which listed the tests that returned different results in 2 test runs, I uploaded the entire junit directories to eXist and ran the following query:

xquery version "3.1";

declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method "html5";
declare option output:media-type "text/html";

declare function local:compare-testcase($testcase-1, $testcase-2) {
    element tr {
        element td { $testcase-1/../@package/string() },
        element td { $testcase-1/../@name/string() },
        element td { $testcase-1/@name/string() },
        element td { ($testcase-1/*/name(), "pass")[. ne ""][1] },
        element td { ($testcase-2/*/name(), "pass")[. ne ""][1] }
    }
};

declare function local:compare-testcases($testcases-1, $testcases-2) {
    for $tc1 in $testcases-1
    let $name := $tc1/@name
    let $tc2 := $testcases-2[@name eq $name]
    order by $name
    return
        if (
                (empty($tc1/node()) and empty($tc2/node()))
                or 
                ($tc1/error and $tc2/error)
                or 
                ($tc1/failure and $tc2/failure)
                or 
                ($tc1/skipped and $tc2/skipped)
            ) then
            ()
        else
            local:compare-testcase($tc1, $tc2)
};

declare function local:compare-testsuites($testsuites-1, $testsuites-2) {
    element table {
        element thead {
            element tr {
                element th { "testsuite package" },
                element th { "testsuite name" },
                element th { "testcase name" },
                element th { "test 1" },
                element th { "test 2" }
            }
        },
        element tbody {
            for $ts1 in $testsuites-1
            let $package := $ts1/@package
            let $name := $ts1/@name
            let $ts2 := $testsuites-2[@package eq $package and @name eq $name]
            order by $package, $name
            return 
                if ($ts1/@errors eq "0" and $ts2/@errors eq "0") then
                    ()
                else
                    local:compare-testcases($ts1/testcase, $ts2/testcase)
        }
    }
};

let $data-collection := "/db/apps/exist-xqts-results/data"
let $testsuites-1 := 
    doc($data-collection || "/5.4.0-SNAPSHOT-before-Juri-PR/test01/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
let $testsuites-2 := 
    doc($data-collection || "/5.4.0-SNAPSHOT-with-Juri-PR/test01/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
return
    local:compare-testsuites($testsuites-1, $testsuites-2)

... returns a table like this:

| testsuite package | testsuite name | testcase name | test 1 | test 2 |
|---|---|---|---|---|
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-011 | failure | error |
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-015 | failure | pass |
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-016 | failure | error |
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-017 | failure | error |
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-023 | failure | pass |
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-025 | failure | pass |
| XQTS_HEAD | prod-UnaryLookup | UnaryLookup-046 | failure | pass |
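
A testcase-level comparison can also be sketched without eXist. This Python sketch assumes, as the query does, that a `testcase` element with no child is a pass and that otherwise its first child (`failure`, `error` or `skipped`) gives the outcome. The function names and the inline sample documents are hypothetical; the sample includes two of the rows above plus one invented passing test:

```python
import xml.etree.ElementTree as ET


def status_by_testcase(xml_text):
    """Map (package, suite, testcase) to 'pass', 'failure', 'error' or 'skipped'."""
    statuses = {}
    for ts in ET.fromstring(xml_text).iter("testsuite"):
        for tc in ts.iter("testcase"):
            # A testcase with no child element is treated as a pass.
            outcome = tc[0].tag if len(tc) else "pass"
            statuses[(ts.get("package"), ts.get("name"), tc.get("name"))] = outcome
    return statuses


def changed_testcases(xml_1, xml_2):
    """Return (key, status_1, status_2) for testcases whose status differs."""
    run_1, run_2 = status_by_testcase(xml_1), status_by_testcase(xml_2)
    return [
        (key, run_1[key], run_2[key])
        for key in sorted(run_1.keys() & run_2.keys())
        if run_1[key] != run_2[key]
    ]


# Hand-written stand-ins for two TESTS-TestSuites.xml files.
RUN_1 = """<testsuites>
  <testsuite package="XQTS_HEAD" name="prod-UnaryLookup">
    <testcase name="UnaryLookup-001"/>
    <testcase name="UnaryLookup-011"><failure/></testcase>
    <testcase name="UnaryLookup-015"><failure/></testcase>
  </testsuite>
</testsuites>"""
RUN_2 = """<testsuites>
  <testsuite package="XQTS_HEAD" name="prod-UnaryLookup">
    <testcase name="UnaryLookup-001"/>
    <testcase name="UnaryLookup-011"><error/></testcase>
    <testcase name="UnaryLookup-015"/>
  </testsuite>
</testsuites>"""

for (package, suite, testcase), before, after in changed_testcases(RUN_1, RUN_2):
    print(package, suite, testcase, before, after)
```

This sketch compares every testsuite present in both runs; it does not reproduce the query's check on the `errors` attribute.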

I hope these results and queries help us nail down the sources of unexpected variation in the results of exist-xqts-runner.

dizzzz commented 3 years ago

Impressive research and analysis!

dizzzz commented 3 years ago

Order variations in results sound like... usage of a HashMap somewhere.
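
To illustrate the general point about ordering (and nothing more specific than that): a hash-based collection makes no promise about iteration order, so a check that depends on order can pass in one run and fail in the next, while an order-insensitive check stays stable. A minimal Python sketch, with a Python set standing in for a Java HashMap and placeholder items; none of this is taken from eXist-db's code:

```python
from collections import Counter

# Placeholder items; any three distinct strings would do.
expected = ["alpha", "beta", "gamma"]

# A set, like a hash map's key set, does not guarantee iteration order;
# for strings the order can even change between Python processes.
actual = list(set(expected))

# Order-sensitive check: may or may not hold, depending on hash order.
print("same order:", actual == expected)

# Order-insensitive (permutation) check: holds regardless of hash order.
print("same items:", Counter(actual) == Counter(expected))
```

Whether something like this is behind the group-015 variation is an open question; the sketch only shows why hash ordering is a plausible suspect for results that differ from run to run.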