Open epag opened 3 weeks ago
Original Redmine Comment Author Name: James (James) Original Date: 2024-01-11T11:26:38Z
Bizarre one this, seemingly unrelated to the associated commit and not failing locally.
2024-01-10T19:39:04.134+0000 ERROR Scenario303 testScenario(wres.systests.Scenario303)
junit.framework.AssertionFailedError: Comparison with benchmarks failed with code 32. expected:<0> but was:<32>
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.failNotEquals(Assert.java:329)
at junit.framework.Assert.assertEquals(Assert.java:78)
at junit.framework.Assert.assertEquals(Assert.java:234)
at junit.framework.TestCase.assertEquals(TestCase.java:377)
at wres.systests.ScenarioHelper.assertOutputsMatchBenchmarks(ScenarioHelper.java:220)
at wres.systests.Scenario303.testScenario(Scenario303.java:82)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:108)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:57)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:39)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:52)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
at jdk.proxy2/jdk.proxy2.$Proxy30.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker$2.run(TestWorker.java:176)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.executeAndMaintainThreadName(TestWorker.java:129)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:100)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.execute(TestWorker.java:60)
at org.gradle.process.internal.worker.child.ActionExecutionWorker.execute(ActionExecutionWorker.java:56)
at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:113)
at org.gradle.process.internal.worker.child.SystemApplicationClassLoaderWorker.call(SystemApplicationClassLoaderWorker.java:65)
at worker.org.gradle.process.internal.worker.GradleWorkerMain.run(GradleWorkerMain.java:69)
at worker.org.gradle.process.internal.worker.GradleWorkerMain.main(GradleWorkerMain.java:74)
Will need to get the actual output to see how it differs.
Original Redmine Comment Author Name: James (James) Original Date: 2024-01-11T11:27:16Z
Original Redmine Comment Author Name: James (James) Original Date: 2024-01-11T11:28:34Z
Hank, could you grab this and post here?
2024-01-10T19:39:04.099+0000 WARN ScenarioHelper The metric CSV file differs from /wres_share/releases/systests-20231130-964129c/scenario303/benchmarks/LGNN5_LGNN5_HEFS_MEAN_ERROR.csv (result code 32) for file with name LGNN5_LGNN5_HEFS_MEAN_ERROR.csv
Original Redmine Comment Author Name: Hank (Hank) Original Date: 2024-01-11T12:14:11Z
I'll do so as soon as I can. Still catching up on emails,
Hank
Original Redmine Comment Author Name: Hank (Hank) Original Date: 2024-01-11T12:26:39Z
First challenge: getting the complete file name of the CSV file being compared with the benchmark. There are quite a few in the @outputs@ folder that match "LGNN5_LGNN5_HEFS_MEAN_ERROR.csv". Looking at the log,
Hank
Original Redmine Comment Author Name: Hank (Hank) Original Date: 2024-01-11T12:37:05Z
Found the evaluation identifier in the first @INFO@ line shown below:
2024-01-10T19:39:04.086+0000 INFO EvaluationUtilities The messager for evaluation fZ_hvDuK39Tl1PH0saDf1zr_woU has been closed.
2024-01-10T19:39:04.093+0000 INFO Scenario303 Checking expected file names against actual file names that exist for 3 files...
2024-01-10T19:39:04.094+0000 INFO Scenario303 Finished checking file names. The actual file names match the expected file names.
2024-01-10T19:39:04.095+0000 INFO ScenarioHelper Asserting that outputs match benchmarks for scenario303...
2024-01-10T19:39:04.099+0000 WARN ScenarioHelper The metric CSV file differs from /wres_share/releases/systests-20231130-964129c/scenario303/benchmarks/LGNN5_LGNN5_HEFS_MEAN_ERROR.csv (result code 32) for file with name LGNN5_LGNN5_HEFS_MEAN_ERROR.csv
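For anyone repeating this lookup, the identifier can be pulled out of that first @INFO@ line mechanically. The following is only a sketch: the regular expression is inferred from the single log line quoted here rather than from any documented log format, the @wres_evaluation_@ directory prefix is taken from the output path in this comment, and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EvaluationIdFinder {
    // Assumption: the wording matches the one log line quoted in this ticket.
    private static final Pattern CLOSED_LINE =
            Pattern.compile("The messager for evaluation (\\S+) has been closed");

    /** Returns the evaluation identifier if the log line announces a closed messager. */
    static Optional<String> findEvaluationId(String logLine) {
        Matcher matcher = CLOSED_LINE.matcher(logLine);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** Builds the output directory name; the prefix is inferred from the path in this ticket. */
    static String outputDirectoryName(String evaluationId) {
        return "wres_evaluation_" + evaluationId;
    }

    public static void main(String[] args) {
        String line = "2024-01-10T19:39:04.086+0000 INFO EvaluationUtilities "
                + "The messager for evaluation fZ_hvDuK39Tl1PH0saDf1zr_woU has been closed.";
        findEvaluationId(line)
                .map(EvaluationIdFinder::outputDirectoryName)
                .ifPresent(System.out::println);
    }
}
```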
That identifier allowed me to find the output in @/wres_share/releases/systests-20231130-964129c/outputs/wres_evaluation_fZ_hvDuK39Tl1PH0saDf1zr_woU@. Here is the difference:
[Hank@nwcal-wres-ti01 wres_evaluation_fZ_hvDuK39Tl1PH0saDf1zr_woU]$ diff LGNN5_LGNN5_HEFS_MEAN_ERROR.csv /wres_share/releases/systests-20231130-964129c/scenario303/benchmarks/LGNN5_LGNN5_HEFS_MEAN_ERROR.csv
8c8
< LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,151200,151200,0.005737,-0.496730,-1.090937,-2.134435
---
> LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,151200,151200,0.005737,-0.496730,-1.090938,-2.134435
It's the last decimal place of the second to last statistic: -1.090937 vs. -1.090938. The entire output file is below (since it's small).
Thanks,
Hank
=============================================================
FEATURE DESCRIPTION,EARLIEST ISSUE TIME,LATEST ISSUE TIME,EARLIEST VALID TIME,LATEST VALID TIME,EARLIEST LEAD TIME IN SECONDS [UNKNOWN OVER PAST 21600 SECONDS],LATEST LEAD TIME IN SECONDS [UNKNOWN OVER PAST 21600 SECONDS],MEAN ERROR All data,MEAN ERROR > 0.0 MM [Pr = 0.75],MEAN ERROR > 0.23 MM [Pr = 0.9],MEAN ERROR > 1.24 MM [Pr = 0.95]
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,21600,21600,-0.127101,-1.107134,-2.155417,-3.792708
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,43200,43200,-0.083818,-0.505783,-1.155243,-5.946354
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,64800,64800,-0.025635,-0.671685,-1.282309,-2.216042
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,86400,86400,-0.299960,-4.298375,-5.573420,-7.661227
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,108000,108000,-0.123269,-1.124710,-2.010443,-3.610833
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,129600,129600,-0.056109,-0.463392,-1.192157,-5.733490
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,151200,151200,0.005737,-0.496730,-1.090937,-2.134435
LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,172800,172800,-0.230831,-3.963556,-5.807118,-7.664792
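To tell a last-digit difference like this one apart from a real regression, a field-by-field comparison with a small numeric tolerance can help. The sketch below is not how @ScenarioHelper@ compares outputs (that code is not shown in this ticket); the class name, method names and the tolerance of one unit in the sixth decimal place are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

final class TolerantCsvDiff {
    /** Returns the zero-based column indices at which two CSV rows differ by more than the tolerance. */
    static List<Integer> differingColumns(String actual, String expected, BigDecimal tolerance) {
        String[] a = actual.split(",", -1);
        String[] e = expected.split(",", -1);
        List<Integer> diffs = new ArrayList<>();
        int columns = Math.max(a.length, e.length);
        for (int i = 0; i < columns; i++) {
            // A column missing from either row always counts as a difference.
            if (i >= a.length || i >= e.length || !fieldsMatch(a[i], e[i], tolerance)) {
                diffs.add(i);
            }
        }
        return diffs;
    }

    private static boolean fieldsMatch(String a, String e, BigDecimal tolerance) {
        if (a.equals(e)) {
            return true;
        }
        try {
            // BigDecimal keeps the decimal text exact, so 0.000001 means exactly one unit at 6 d.p.
            BigDecimal difference = new BigDecimal(a).subtract(new BigDecimal(e)).abs();
            return difference.compareTo(tolerance) <= 0;
        } catch (NumberFormatException ex) {
            // Non-numeric fields (feature names, timestamps) must match exactly.
            return false;
        }
    }

    public static void main(String[] args) {
        String prefix = "LGNN5,-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,"
                + "-1000000000-01-01T00:00:00Z,+1000000000-12-31T23:59:59.999999999Z,151200,151200,";
        String actual = prefix + "0.005737,-0.496730,-1.090937,-2.134435";
        String benchmark = prefix + "0.005737,-0.496730,-1.090938,-2.134435";
        System.out.println(differingColumns(actual, benchmark, BigDecimal.ZERO));           // [9]
        System.out.println(differingColumns(actual, benchmark, new BigDecimal("0.000001"))); // []
    }
}
```

With a zero tolerance the two rows from the diff above differ only in column 9 (the second to last statistic); with a tolerance of 0.000001 they compare as equal.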
Original Redmine Comment Author Name: James (James) Original Date: 2024-02-16T15:25:00Z
Curious one, but it will need to wait.
Original Redmine Comment Author Name: Evan (Evan) Original Date: 2024-03-19T14:19:07Z
6.21 is going to be just a docker deploy, so moving all 6.21 tickets to 6.22.
Original Redmine Comment Author Name: Hank (Hank) Original Date: 2024-05-10T17:06:01Z
What's the status of this? Move to 6.23 or the backlog?
Hank
Original Redmine Comment Author Name: James (James) Original Date: 2024-05-10T17:09:19Z
The status is that it's unfixed. We could drop the d.p. on these benchmarks:
decimal_format: '#0.000000'
However, I wouldn't really expect a difference at 6 d.p. Hey ho.
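As a rough illustration of why the last digit can flip, here is a sketch using @java.text.DecimalFormat@ with the pattern quoted above. The two input values are hypothetical; they are not the actual statistics from scenario303, whose unrounded values are not recorded in this ticket. Whether WRES applies the format through this class is also an assumption.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class DecimalFormatBoundary {
    /** Formats a value with the given pattern, using Locale.US so '.' is always the separator. */
    static String format(String pattern, double value) {
        return new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US)).format(value);
    }

    public static void main(String[] args) {
        // Hypothetical values that differ by 2e-8 but sit on either side of a rounding boundary.
        double before = -1.09093749;
        double after = -1.09093751;

        System.out.println(format("#0.000000", before)); // -1.090937
        System.out.println(format("#0.000000", after));  // -1.090938

        // Dropping to five decimal places hides the difference for this particular pair.
        System.out.println(format("#0.00000", before));  // -1.09094
        System.out.println(format("#0.00000", after));   // -1.09094
    }
}
```

Reducing the number of decimal places makes rounding boundaries sparser, so a flip becomes less likely, but it does not remove them: a pair of values straddling a five-decimal boundary would still format differently.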
Original Redmine Comment Author Name: Evan (Evan) Original Date: 2024-05-21T16:21:06Z
Moving this to 6.24; 6.23 is going to be a docker-only deploy.
Original Redmine Comment Author Name: Evan (Evan) Original Date: 2024-07-02T13:18:10Z
No work done on this in this sprint
The issue described in this ticket cropped up when updating dependencies to research #68 and support #313 (6.26). Just noting that here,
Hank
Author Name: James (James) Original Redmine Issue: 125130, https://vlab.noaa.gov/redmine/issues/125130 Original Date: 2024-01-11
Given a system test of scenario303 at nwcal
When the test completes
Then I expect it to succeed and not fail with an exception on a benchmark comparison