dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.25k stars 4.73k forks source link

New ConfigurationBindingGeneratorTests fail on s390x (endian issue?) #105311

Open uweigand opened 3 months ago

uweigand commented 3 months ago

Since https://github.com/dotnet/runtime/pull/104180 was merged, the following test cases are failing on s390x:

    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind_T_BinderOptions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Collections [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind_Instance_BinderOptions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.BindConfiguration [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Get_T_BinderOptions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind_Instance [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Get_T [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.GetValue_T_Key_DefaultValue [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind_CanParseTargetConfigType_FromMethodParam [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Get_TypeOf [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind_T [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.UnsupportedTypes [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.GetValue_TypeOf_Key [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.BindConfigurationWithConfigureActions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Configure_T_name_BinderOptions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.GetValue_T_Key [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.MinimalGenerationIfNoBindableMembers [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind_Key_Instance [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Configure_T [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.DefaultConstructorParameters [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.GetValue_TypeOf_Key_DefaultValue [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Get_TypeOf_BinderOptions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Primitives [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Configure_T_name [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Configure_T_BinderOptions [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Get_PrimitivesOnly [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.GetValue [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Get [FAIL]
    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind [FAIL]

The symptom is in all cases of the type:

    Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind [FAIL]
      Line 38 does not match.
      Expected Line:
              [InterceptsLocation(1, "/TzDbopkyui/vWzNJfmpq0IBAABzcmMtMC5jcw==")] // src-0.cs(12,14)
      Actual Line:
              [InterceptsLocation(1, "6sAEYxbTvjjW269lKa/ayUIBAABzcmMtMC5jcw==")] // src-0.cs(12,14)
      Stack Trace:
        /home/uweigand/runtime/src/libraries/Microsoft.Extensions.Configuration.Binder/tests/SourceGenerationTests/GeneratorTests.Helpers.cs(175,0): at Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.VerifyAgainstBaselineUsingFile(String filename, String testSourceCode, ExtensionClassType extType, ExpectedDiagnostics expectedDiags)
        /home/uweigand/runtime/src/libraries/Microsoft.Extensions.Configuration.Binder/tests/SourceGenerationTests/GeneratorTests.Baselines.cs(18,0): at Microsoft.Extensions.SourceGeneration.Configuration.Binder.Tests.ConfigurationBindingGeneratorTests.Bind()

It looks like the difference is in the InterceptableLocationData, which is provided by roslyn's Microsoft.CodeAnalysis.CSharp.InterceptableLocation1.Data. Specifically, the differences are in the checksum field, which is computed via the SourceText:GetContentHash() routine. From looking this this routine, it seems to simply re-interpret a string (char sequence) as byte array, see https://github.com/dotnet/roslyn/blob/aea9e82da403c397265f7fd0fefee5ebbb886179/src/Compilers/Core/Portable/Text/SourceText.cs#L645

hash.Append(MemoryMarshal.AsBytes(charBuffer.AsSpan(0, charsToCopy)));

This seems to imply that the hash value of the same source code will be different if computed on a big-endian vs. a little-endian machine (because the byte representation of each char will have its two bytes swapped).

Now my question would be: is the content hash supposed to be the same on big- and little-endian machines? If yes, that code in roslyn needs to be fixed. If no, then the ConfigurationBindingGeneratorTests will have to take that endian difference into account one way or the other.

CC @steveharter @tarekgh @directhex @steveisok FIY @giritrivedi @saitama951 @omajid @tmds

dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @dotnet/area-extensions-configuration See info in area-owners.md if you want to be subscribed.

Vishwanatha-HD commented 3 months ago

Attaching the complete log file of the error which got reproduced.. NET_runtime_Binder_Test_errors.txt

tarekgh commented 3 months ago

@steveharter is this something you can help looking at?

steveharter commented 3 months ago

@RikkiGibson PTAL

steveharter commented 3 months ago

This is similar to the question raised in https://github.com/dotnet/runtime/issues/101079#issuecomment-2221225812 where line endings can also vary based on platform.

RikkiGibson commented 3 months ago

I think we expect that if the file content is the same, these hashes should be the same.

@CyrusNajmabadi is it expected that the result of GetContentHash() will vary depending on endianness?

CyrusNajmabadi commented 3 months ago

It should be based entirely on the character content of the source files. It would be good if both were dumped out (literally every character and its integral value), and the two were compared.

If they have the same characters, then GetContentHash should be the same.

If they have different characters, then this is by design.

Are we sure this isn't an issue around \r vs \r\n?

uweigand commented 3 months ago

It should be based entirely on the character content of the source files. It would be good if both were dumped out (literally every character and its integral value), and the two were compared.

If they have the same characters, then GetContentHash should be the same.

I believe they have the same sequence of characters but not the same sequence of bytes, because each character has two bytes, and the order of those two bytes within a single character is endian-dependent. The current implementation of GetContentHash only returns the same value on the same sequence of bytes, not characters.

steveharter commented 3 months ago

@uweigand I think it makes sense to disable these tests for now as any possible Roslyn change will take a while to come over - what CI pipelines do I need to run to verify? (is it "runtime-extra-platforms"?)

directhex commented 3 months ago

Is there a way to ensure hashes like this are platform-independent without introducing a big byte swap performance overhead on big endian?

uweigand commented 3 months ago

@uweigand I think it makes sense to disable these tests for now as any possible Roslyn change will take a while to come over - what CI pipelines do I need to run to verify? (is it "runtime-extra-platforms"?)

The runtime-community pipeline runs on s390x, however this is unfortunately not operational at the moment. @directhex is working on bringing this back up again. For the time being, I'm happy to test any suggested fixes on our machines.

steveharter commented 3 months ago

@uweigand I put up a PR to disable these tests on https://github.com/dotnet/runtime/pull/105485. Can you give it a try to make sure all of the tests pass now? Thanks

uweigand commented 3 months ago

@uweigand I put up a PR to disable these tests on #105485. Can you give it a try to make sure all of the tests pass now? Thanks

Yes, this makes all tests pass again. Thanks!

steveharter commented 3 months ago

It should be based entirely on the character content of the source files. It would be good if both were dumped out (literally every character and its integral value), and the two were compared.

If they have the same characters, then GetContentHash should be the same.

@CyrusNajmabadi my take is that Roslyn may want to calculate the hash on character content, not byte contents. Also, if that is done, and we take a perf hit, does it make sense at that time to normalize line endings so that files saved under Windows\Linux\OSX all have the same hash?

RikkiGibson commented 3 months ago

At this point the line-endings issue is something we are comfortable with it behaving as it currently does. The file content is different, so the hash is different.

The endianness issue may be a real bug from our POV though. The content is the same character-for-character, it is just laid out differently in memory due to an architectural difference. It would be good if we could adjust the way the content hash is calculated on big-endian, so that it matches little-endian, without sacrificing too much perf.

I will also say that while the intended design is to allow [InterceptsLocation] attributes to be written out to disk and consumed on another machine etc. it is not actually a mainline case for us. So, it is prone to hit bugs like these.

It might make sense to adjust the creation+consumption of these baselines to basically be insensitive to exactly what attribute arguments are used, and to add (or rely on pre-existing) functional verification that the compilation is successful and the semantics match what you expect.

@jaredpar for visibility.

CyrusNajmabadi commented 3 months ago

does it make sense at that time to normalize line endings so that files saved under Windows\Linux\OSX all have the same hash?

Under no circumstances would we ever do this. Line ending is content and affects the runtime behavior of a program. Two programs with different line endings should not have the same hash as that would indicate they were identical.

steveharter commented 2 months ago

@CyrusNajmabadi @RikkiGibson is there a tracking Roslyn issue?

Do you expect this to be fixed in the v9.0 timeframe or should I move this to 10.0?

RikkiGibson commented 2 months ago

No, feel free to file an issue on Roslyn for the endianness behavior. I am guessing a fix in 9.0 is unlikely at this point.

steveharter commented 2 months ago

Roslyn issue: https://github.com/dotnet/roslyn/issues/74682

I am guessing a fix in 9.0 is unlikely at this point.

I am moving this to v10.