dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.94k stars 4.64k forks source link

ExplicitConversion_FromSingle failing due to NaN != NaN #103347

Open stephentoub opened 3 months ago

stephentoub commented 3 months ago

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=768837&view=results Build error leg or test failing: System.Tests.HalfTests.ExplicitConversion_FromSingle Pull request:

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": "",
  "ErrorPattern": "FAIL.*System.Tests.HalfTests.ExplicitConversion_FromSingle",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: :mag_right: https://dev.azure.com/dnceng-public/public/_build/results?buildId=768837 Error message validated: [FAIL.*System.Tests.HalfTests.ExplicitConversion_FromSingle] Result validation: :white_check_mark: Known issue matched with the provided build. Validation performed at: 8/8/2024 7:39:18 PM UTC

[14:40:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[14:40:21] info: Assert.Equal() Failure: Values differ
[14:40:21] info: Expected:   NaN
[14:40:21] info: Actual:     NaN
[14:40:21] info:    at System.AssertExtensions.Equal(Half expected, Half actual)
[14:40:21] info:    at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[14:40:21] info:    at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[14:40:21] info:    at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
[14:40:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[14:40:21] info: Assert.Equal() Failure: Values differ
[14:40:21] info: Expected:   NaN
[14:40:21] info: Actual:     NaN
[14:40:21] info:    at System.AssertExtensions.Equal(Half expected, Half actual)
[14:40:21] info:    at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[14:40:21] info:    at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[14:40:21] info:    at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)

Report

Build Definition Test Pull Request
806302 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106875
805750 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107724
804271 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107657
802902 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107579
802129 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107434
802053 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107351
801076 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107504
800068 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107368
800023 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#105427
799978 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107464
799428 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107113
798667 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107368
798481 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107395
797874 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107039
795917 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107268
795414 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107241
793973 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107142
792785 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#107149
790678 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#104487
788461 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106965
787239 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106913
787068 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106846
786693 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution
786522 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution
783018 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106629
780436 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106329
779387 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106575
778066 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106048
776528 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106378
775981 dotnet/runtime WasmTestOnChrome-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106408
774776 dotnet/runtime WasmTestOnV8-ST-System.Runtime.Tests.WorkItemExecution dotnet/runtime#106368

Summary

24-Hour Hit Count 7-Day Hit Count 1-Month Count
2 11 31
dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @dotnet/area-infrastructure-libraries See info in area-owners.md if you want to be subscribed.

dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.

ericstj commented 1 month ago

This issue is matching to tests jobs which log test names. Need to further constrain the match.

mdh1418 commented 1 month ago

Just hit this in #106040 for WasmTestOnChrome-ST-System.Runtime.Tests config: net9.0-browser-Release-wasm-Mono_Release-WasmTestOnChrome

[18:27:21] info: [STRT] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: Assert.Equal() Failure: Values differ
[18:27:21] info: Expected:   NaN
[18:27:21] info: Actual:     NaN
[18:27:21] info:    at System.AssertExtensions.Equal(Half expected, Half actual)
[18:27:21] info:    at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[18:27:21] info:    at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[18:27:21] info:    at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
[18:27:21] info: [STRT] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: [FAIL] System.Tests.HalfTests.ExplicitConversion_FromSingle(f: NaN, expected: NaN)
[18:27:21] info: Assert.Equal() Failure: Values differ
[18:27:21] info: Expected:   NaN
[18:27:21] info: Actual:     NaN
[18:27:21] info:    at System.AssertExtensions.Equal(Half expected, Half actual)
[18:27:21] info:    at System.Tests.HalfTests.ExplicitConversion_FromSingle(Single f, Half expected)
[18:27:21] info:    at System.Object.InvokeStub_HalfTests.ExplicitConversion_FromSingle(Object , Span`1 )
[18:27:21] info:    at System.Reflection.MethodBaseInvoker.InvokeWithFewArgs(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)

Not sure why it didn't match in build analysis

tannergooding commented 1 month ago

The test here should likely be filtered out on WASM for the time being, with a tracking issue against it.

WASM isn't technically doing anything wrong here, normalizing NaN is fully allowed by the IEEE 754 spec. However, it is undesirable and not recommended for most cases and should typically be avoided if possible.

WASM also has instructions that should guarantee the underlying bits are preserved, so it should be possible for it to be preserved and match the behavior of other targets (both for RyuJIT and Mono).

akoeplinger commented 1 month ago

Hm the weird thing here is that this seems to be an intermittent failure. I've checked and we definitely have passing runs of the test on the exact same config.

tannergooding commented 1 month ago

Is there perhaps a difference in the WASM version or configuration options for the failing platform as compared to others?

Maybe some machine specific issue that's only triggering in some scenarios?

akoeplinger commented 1 month ago

the tests are running in containers on the same helix queue so even the VM/machine should be the same...

Passing run: https://helixre107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-pull-106013-merge-6f4bade3cf974edcb8/WasmTestOnChrome-ST-System.Runtime.Tests/1/console.3fb2056a.log?helixlogtype=result

Failing run: https://helixre107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-pull-106040-merge-0b77eabb49f848cab5/WasmTestOnChrome-ST-System.Runtime.Tests/1/console.5edeb661.log?helixlogtype=result

The only difference is the order of the tests, but I'm having a hard time imagining how that could be relevant

akoeplinger commented 1 month ago

hm one interesting thing is that the failing run reports two tests that failed i.e. five total, but the passing run only ran four NaN tests... that seems just a log quirk, the testResult.xml only contains four NaN tests and two of them failing.

akoeplinger commented 1 month ago

According to Kusto test data this started happening on or before 2024-06-06. I also found that it's happening on Windows-based containers too so it's not related to the underlying OS.

Given it's only happening intermittently and doesn't appear to be a blocker for 9.0 I'm moving it to 10.0

lambdageek commented 2 weeks ago

Couple of things could be responsible which might be explained by the test order:

  1. jiterp might be kicking in
  2. browser JIT might be kicking in

But I'd like to push back against hte idea that bitwise comparison of two NaNs is a good idea. there are many NaN bitpatterns; all are semantically indistinguishable. Reasonable code should not depend on the bit pattern staying identical after computation

tannergooding commented 2 weeks ago

But I'd like to push back against hte idea that bitwise comparison of two NaNs is a good idea. there are many NaN bitpatterns; all are semantically indistinguishable. Reasonable code should not depend on the bit pattern staying identical after computation

This is a core thing frequently depended upon for SIMD, NaN boxing, etc. It also follows the IEEE 754 "recommended" guidelines, even if it's not strictly required by the IEEE 754 spec.

The WASM 1.0 and 2.0 specs similarly:

However, the WASM specs do notably explicitly allow non-determinism for fneg, fabs, and fcopysign; explicitly specifying that the sign and payload of the result are non-deterministic

So while a given implementation is technically allowed to do "other things" (such as always canonicalizing NaN), its highly irregular to do so and that's why we have tests that validate such conversions are preserving bits and propagating payloads as expected. -- It's also notably cheaper to propagate and to preserve NaN as is then it is to canonicalize (particularly for things like fbits), so there's really no reason for a browser to deviate here