dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.42k stars 4.75k forks source link

Assertion failed 'regArgTab[argNum].slot == 2' #83167

Closed MichalStrehovsky closed 1 year ago

MichalStrehovsky commented 1 year ago

Seen in https://dev.azure.com/dnceng-public/public/_build/results?buildId=197177&view=logs&j=2f6a7d26-0d60-5ade-d191-981fe0847989&t=950a5a68-ae8a-5bb4-8cad-fb293bb99dc4&l=6041

(linux.arm64.Checked leg of runtime-extra-platforms)

  ILC: /__w/1/s/src/coreclr/jit/codegencommon.cpp:3957
  ILC: Assertion failed 'regArgTab[argNum].slot == 2' in 'System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Plane]:GetHashCode(System.Numerics.Plane):int:this' during 'Generate code' (IL size 24; hash 0x7039dd06; FullOpts)

  Aborted (core dumped)
/__w/1/s/artifacts/bin/coreclr/linux.arm64.Checked/build/Microsoft.NETCore.Native.targets(266,5): error MSB3073: The command ""/__w/1/s/artifacts/bin/coreclr/linux.arm64.Checked/x64/ilc/ilc" @"/__w/1/s/artifacts/obj/System.Numerics.Vectors.Tests/Release/net8.0/native/System.Numerics.Vectors.Tests.ilc.rsp"" exited with code 134. [/__w/1/s/src/libraries/System.Numerics.Vectors/tests/System.Numerics.Vectors.Tests.csproj]

Looks to be related to hardware intrinsics. Cc @tannergooding

ghost commented 1 year ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak See info in area-owners.md if you want to be subscribed.

Issue Details
Seen in https://dev.azure.com/dnceng-public/public/_build/results?buildId=197177&view=logs&j=2f6a7d26-0d60-5ade-d191-981fe0847989&t=950a5a68-ae8a-5bb4-8cad-fb293bb99dc4&l=6041 (linux.arm64.Checked leg of runtime-extra-platforms) ``` ILC: /__w/1/s/src/coreclr/jit/codegencommon.cpp:3957 ILC: Assertion failed 'regArgTab[argNum].slot == 2' in 'System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Plane]:GetHashCode(System.Numerics.Plane):int:this' during 'Generate code' (IL size 24; hash 0x7039dd06; FullOpts) Aborted (core dumped) /__w/1/s/artifacts/bin/coreclr/linux.arm64.Checked/build/Microsoft.NETCore.Native.targets(266,5): error MSB3073: The command ""/__w/1/s/artifacts/bin/coreclr/linux.arm64.Checked/x64/ilc/ilc" @"/__w/1/s/artifacts/obj/System.Numerics.Vectors.Tests/Release/net8.0/native/System.Numerics.Vectors.Tests.ilc.rsp"" exited with code 134. [/__w/1/s/src/libraries/System.Numerics.Vectors/tests/System.Numerics.Vectors.Tests.csproj] ``` Looks to be related to hardware intrinsics. Cc @tannergooding
Author: MichalStrehovsky
Assignees: -
Labels: `area-CodeGen-coreclr`
Milestone: -
tannergooding commented 1 year ago

Will take a look. It's possible its related to the big ZMM support PR that went in yesterday

tannergooding commented 1 year ago

Sorry for the delay. I believe this was fixed by one of @kunalspathak PRs fixing various issues that had cropped up on Arm64.

It is not repro'ing anymore and was not ZMM related given it was Arm64

MichalStrehovsky commented 1 year ago

There is a test exclusion for this in the repo.

MichalStrehovsky commented 1 year ago

@tannergooding this is still broken in main, see CI results in #86366:

  Generating native code
  ILC: /__w/1/s/src/coreclr/jit/codegencommon.cpp:3930
  ILC: Assertion failed 'regArgTab[argNum].slot == 2' in 'System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Plane]:GetHashCode(System.Numerics.Plane):int:this' during 'Generate code' (IL size 24; hash 0x7039dd06; FullOpts)
kunalspathak commented 1 year ago

@MichalStrehovsky - I think we should keep the issue open and mark it with test-disabled or equivalent.

MichalStrehovsky commented 1 year ago

Yeah, github autoclosed it because one of the commits in the PR still referenced this issue.

tannergooding commented 1 year ago

@MichalStrehovsky, there are other failures blocking this from repro'ing: https://github.com/dotnet/runtime/pull/90184

tannergooding commented 1 year ago

For reference, the failure:

EXEC : error : Failed to load assembly 'SQLitePCLRaw.core' [/__w/1/s/src/libraries/System.Private.CoreLib/tests/IntrinsicsInSystemPrivatecoreLibAnalyzer.Tests/IntrinsicsInSystemPrivateCoreLib.Tests.csproj]
##[error]EXEC(0,0): error : (NETCORE_ENGINEERING_TELEMETRY=Build) Failed to load assembly 'SQLitePCLRaw.core'
  Internal.TypeSystem.TypeSystemException+FileNotFoundException: Failed to load assembly 'SQLitePCLRaw.core'
     at Internal.TypeSystem.ThrowHelper.ThrowFileNotFoundException(ExceptionStringID id, String fileName) in /_/src/coreclr/tools/Common/TypeSystem/Common/ThrowHelper.cs:line 35
     at Internal.TypeSystem.ResolutionFailure.Throw() in /_/src/coreclr/tools/Common/TypeSystem/Common/ResolutionFailure.cs:line 105
     at Internal.TypeSystem.Ecma.EcmaModule.GetObject(EntityHandle handle, NotFoundBehavior notFoundBehavior) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaModule.cs:line 432
     at Internal.TypeSystem.Ecma.EcmaSignatureParser.ResolveHandle(EntityHandle handle) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs:line 61
     at Internal.TypeSystem.Ecma.EcmaSignatureParser.ParseTypeImpl(SignatureTypeCode typeCode) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs:line 143
     at Internal.TypeSystem.Ecma.EcmaSignatureParser.ParseType(SignatureTypeCode typeCode) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs:line 98
     at Internal.TypeSystem.Ecma.EcmaSignatureParser.ParseType() in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs:line 309
     at Internal.TypeSystem.Ecma.EcmaSignatureParser.ParseFieldSignature() in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs:line 462
     at Internal.TypeSystem.Ecma.EcmaSignatureParser.ParseFieldSignature(EmbeddedSignatureData[]& embeddedSigData) in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaSignatureParser.cs:line 471
     at Internal.TypeSystem.Ecma.EcmaField.InitializeFieldType() in /_/src/coreclr/tools/Common/TypeSystem/Ecma/EcmaField.cs:line 104
     at Internal.TypeSystem.MetadataFieldLayoutAlgorithm.ComputeInstanceLayout(DefType defType, InstanceLayoutKind layoutKind) in /_/src/coreclr/tools/Common/TypeSystem/Common/MetadataFieldLayoutAlgorithm.cs:line 37
     at Internal.TypeSystem.DefType.ComputeInstanceLayout(InstanceLayoutKind layoutKind) in /_/src/coreclr/tools/Common/TypeSystem/Common/DefType.FieldLayout.cs:line 456
     at ILCompiler.CompilerTypeSystemContext.EnsureLoadableTypeUncached(TypeDesc type) in /_/src/coreclr/tools/Common/Compiler/CompilerTypeSystemContext.Validation.cs:line 317
     at ILCompiler.CompilerTypeSystemContext.EnsureLoadableType(TypeDesc type) in /_/src/coreclr/tools/Common/Compiler/CompilerTypeSystemContext.Validation.cs:line 74
     at ILCompiler.DependencyAnalysis.NodeFactory.CreateConstructedTypeNode(TypeDesc type) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs:line 553
     at Internal.TypeSystem.LockFreeReaderHashtable`2.CreateValueAndEnsureValueIsInTable(TKey key) in /_/src/coreclr/tools/Common/TypeSystem/Common/Utilities/LockFreeReaderHashtable.cs:line 562
     at ILCompiler.DependencyAnalysis.ReflectionInvokeMapNode.AddSignatureDependency(DependencyList& dependencies, NodeFactory factory, TypeDesc type, String reason) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/ReflectionInvokeMapNode.cs:line 111
     at ILCompiler.DependencyAnalysis.ReflectedFieldNode.GetStaticDependencies(NodeFactory factory) in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/ReflectedFieldNode.cs:line 96
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependenciesImpl(DependencyNodeCore`1 node) in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 182
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.GetStaticDependencies(DependencyNodeCore`1 node) in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 222
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ProcessMarkStack() in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 257
     at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2.ComputeMarkedNodes() in /_/src/coreclr/tools/aot/ILCompiler.DependencyAnalysisFramework/DependencyAnalyzer.cs:line 308
     at ILCompiler.ILScanner.ILCompiler.IILScanner.Scan() in /_/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ILScanner.cs:line 140
     at ILCompiler.Program.<Run>g__RunScanner|4_0(<>c__DisplayClass4_0&) in /_/src/coreclr/tools/aot/ILCompiler/Program.cs:line 468
     at ILCompiler.Program.Run() in /_/src/coreclr/tools/aot/ILCompiler/Program.cs:line 448
     at ILCompiler.ILCompilerRootCommand.<>c__DisplayClass221_0.<.ctor>b__0(ParseResult result) in /_/src/coreclr/tools/aot/ILCompiler/ILCompilerRootCommand.cs:line 277
tannergooding commented 1 year ago

Is this a known issue? CC. @agocke

MichalStrehovsky commented 1 year ago

@MichalStrehovsky, there are other failures blocking this from repro'ing: #90184

The explanation is at https://github.com/dotnet/runtime/pull/90184#issuecomment-1698521105.

AndyAyersMS commented 1 year ago

Still repros, so will dig in:

https://dev.azure.com/dnceng-public/public/_build/results?buildId=413235&view=logs&jobId=2f6a7d26-0d60-5ade-d191-981fe0847989&j=2f6a7d26-0d60-5ade-d191-981fe0847989&t=950a5a68-ae8a-5bb4-8cad-fb293bb99dc4

AndyAyersMS commented 1 year ago

Haven't been able to repro this locally. Must be doing something wrong.

MichalStrehovsky commented 1 year ago

Haven't been able to repro this locally. Must be doing something wrong.

Here's the workflow I'd use for this (IIRC this was arm64 linux specific so need to build on linux):

(The UseNativeAotCoreLib=true might just be a cargo cult I do)

If it reproes and you'd like to move investigation to Windows, edit System.Numerics.Vectors.Tests.csproj and add <ItemGroup><IlcArg Include="--make-repro-path:/some/path/to/a/directory"></ItemGroup>. This will create a zip under /some/path/to/a/directory - copy it to your Windows machine, unpack and run ILC with the RSP file in the ZIP.

AndyAyersMS commented 1 year ago

Thanks, I'll give this a try on my volterra + wsl2.

MichalStrehovsky commented 1 year ago

The -arch arm64 will make it take the crossbuild paths, so volterra shouldn't be needed, just a linux machine (the build machines we use in the lab are not arm64 machines either, only the Helix executors)

AndyAyersMS commented 1 year ago

Reproed just fine on the volterra so I'm set for now.

  Generating native code
  ILC: /home/andya/repos/runtime0/src/coreclr/jit/codegencommon.cpp:3944
  ILC: Assertion failed 'regArgTab[argNum].slot == 2' in 'System.Collections.Generic.GenericEqualityComparer`1[System.Numerics.Plane]:GetHashCode(System.Numerics.Plane):int:this' during 'Generate code' (IL size 24; hash 0x7039dd06; FullOpts)
AndyAyersMS commented 1 year ago

This method gets 1 arg apparently passed in 4 float registers:

;  V01 arg1           simd16  HFA(float)  multireg-arg <System.Numerics.Plane>
Arg #1    passed in register(s) d0,d1,d2,d3

I think the jit gets confused during prolog generation and somehow fails to handle it as an HFA and so falls into a code path expecting the arg to be passed in 2 registers, not 4. The slot num when we assert is 3 which looks plausible for a 4 element HFA.

(regArgElem [9]) regArgTab = {
  [0] = {
    varNum = 1
    type = TYP_FLOAT
    trashBy = 0
    slot = '\x01'
    stackArg = false
    writeThru = false
    processed = false
    circular = false
  }
  [1] = {
    varNum = 1
    type = TYP_FLOAT
    trashBy = 0
    slot = '\x02'
    stackArg = false
    writeThru = false
    processed = false
    circular = false
  }
  [2] = {
    varNum = 1
    type = TYP_FLOAT
    trashBy = 0
    slot = '\x03'
    stackArg = false
    writeThru = false
    processed = false
    circular = false
  }
  [3] = {
    varNum = 1
    type = TYP_FLOAT
    trashBy = 0
    slot = '\x04'
    stackArg = false
    writeThru = false
    processed = false
    circular = false

This method is pretty convoluted so will have to spend more time debugging.

AndyAyersMS commented 1 year ago
;  V01 arg1         [V01,T01] (  4,  4   )  simd16  ->   d3         HFA(float)  multireg-arg ld-addr-op single-def <System.Numerics.Plane>

LSRA assigns reg d3 for V01 and the arg comes in in d0-d3. So it self-conflicts in an odd way that the prolog codegen can't handle.

@dotnet/jit-contrib any ideas here?

BruceForstall commented 1 year ago

The register naming here is very confusing in our dumps. The incoming args are 32-bit floats, so we should say they are passed in s0,s1,s2,s3 (not d0/d1/d2/d3). And the register assignment is of the entire value, so q3 (not d3).

Or if we don't know the sizes maybe we should say v0,v1,v2,v3 and v3 instead?

any ideas here?

Maybe this is a new self-conflict that we haven't seen before due to change in HFA or struct handling or ABI or struct promotion?

AndyAyersMS commented 1 year ago

I don't think there is an actual liveness conflict here, though in similar cases there could be. But still not sure I understand all that is going on in this code.

We get the args in s0-s3 and they all need to end up in say v3. Currently the code doesn't detect any circularity (which I think is right) and when it goes to handle s0 it sees that s3 is live, so it defers ("Cannot trash a currently live register argument."). For s1 it doesn't do any conflict check and defers since it expects this part of the hfa to be handled when the first part is. For s2 it should do likewise but asserts that there is just one other part (the code is probably just expecting the "simd passed in 2 registers" behavior here).

I think the right logic is more like: if the dest reg is one of the hfa regs, and the hfa reg corresponding to the dest reg is already in the right spot in the dest reg, just insert the other hfa regs. Else spill or insert the "misplaced" hfa reg, then insert the other regs. So basically, a finer-grained check to see if the spot this hfa arg reg needs to occupy in the dest is live, not just if any part of the dest reg is live.

I have a hacky version where I comment out the assert block and then update the "Cannot trash a currently live register argument" part, so that if it sees a conflict, it then tries to screen dependence for the HFA cases specially. On this repro case it produces plausible looking code but it doesn't handle the case where say the HFA was passed in s1-s4 but ends up in v3 where s3 needs to be moved within v3 or copied elsewhere first.

G_M8953_IG01:
IN0008: 000000      stp     fp, lr, [sp, #-0x20]!
IN0009: 000004      mov     fp, sp
IN000a: 000008      mov     v3.s[0], v0.s[0]
IN000b: 00000C      mov     v3.s[1], v1.s[0]
IN000c: 000010      mov     v3.s[2], v2.s[0]
G_M8953_IG02:
IN0001: 000014      mov     v0.16b, v3.16b

I also tried coming up with a non-AOT repro but no luck so far.

I'll grab an SPMI collection and link it here so if anyone else wants to poke at this it is a bit simpler to do.

BruceForstall commented 1 year ago

I think the right logic is more like: ...

Sounds right. But does the JIT have "insert into the right place" logic already? Could it do something simple and spill the entire HFA to stack, then reload v3/q3 from the stack location?

AndyAyersMS commented 1 year ago

But does the JIT have "insert into the right place" logic already?

Maybe not for slot 0, but it does have something for the other slots -- need to check if the opcode we use for slot 0 is a partial or total write; we need it to be partial since some bits in the destination are live (just not the whole thing). Currently my hack adds some special casing for slot 0 to make sure we insert.

Also I am not confident we get the sizes right for the inserts; we always use EA_4BYTE which works fine for float -- perhaps we never see HFAs of doubles or other sizes? But seems like we can.

Could it do something simple and spill the entire HFA to stack, then reload v3/q3 from the stack location?

I assume it could, yeah. I am not sure where best to integrate that into what's there now.

AndyAyersMS commented 1 year ago

Here's the SPMI collection: fail.mc.txt