dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.54k stars 4.54k forks source link

Test failure profiler/eventpipe/eventpipe/eventpipe.sh #66174

Open VincentBu opened 2 years ago

VincentBu commented 2 years ago

Run: runtime-coreclr outerloop 20220303.2

Failed test:

CoreCLR OSX x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

Error message:

cmdLine:/private/tmp/helix/working/C5030A0F/w/B5FB0987/e/profiler/eventpipe/eventpipe/eventpipe.sh Timed Out (timeout in milliseconds: 600000 from variable __TestTimeout, start: 3/3/2022 12:33:40 PM, end: 3/3/2022 12:43:41 PM)

Return code:      -100
Raw output file:      /tmp/helix/working/C5030A0F/w/B5FB0987/uploads/Reports/profiler.eventpipe/eventpipe/eventpipe.output.txt
Raw output:
BEGIN EXECUTION
/tmp/helix/working/C5030A0F/p/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false eventpipe.dll ''
Profiler path: /private/tmp/helix/working/C5030A0F/w/B5FB0987/e/profiler/eventpipe/eventpipe/libProfiler.dylib
Profiler.dll!DllGetClassObject
Profiler.dll!Profiler::Initialize
This method being jitted should trigger events firing...

cmdLine:/private/tmp/helix/working/C5030A0F/w/B5FB0987/e/profiler/eventpipe/eventpipe/eventpipe.sh Timed Out (timeout in milliseconds: 600000 from variable __TestTimeout, start: 3/3/2022 12:33:40 PM, end: 3/3/2022 12:43:41 PM)
Test Harness Exitcode is : -100
To run the test:

set CORE_ROOT=/tmp/helix/working/C5030A0F/p
/private/tmp/helix/working/C5030A0F/w/B5FB0987/e/profiler/eventpipe/eventpipe/eventpipe.sh
Expected: True
Actual:   False

Stack trace
   at profiler_eventpipe._eventpipe_eventpipe_._eventpipe_eventpipe_sh()
BruceForstall commented 2 years ago

This timeout is pervasive: Kusto data shows that it happens in many pipelines, but mostly on osx/x64.

@dotnet/area-system-diagnostics-tracing

fanyang-mono commented 2 years ago

This happened again for rolling build #20220308.1 on CoreCLR Pri0 Runtime Tests Run OSX x64 checked. The link to the log is https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-a1a9de6e17b849f196/profiler.eventpipe/1/console.1158da49.log?sv=2019-07-07&se=2022-03-28T09%3A24%3A17Z&sr=c&sp=rl&sig=Xqtq4mBksJ5vn2SeQpc00q9oqntl2GhaHPQ5vDmaJAM%3D

agocke commented 2 years ago

This is now blocking the runtime builds as well @tommcdon could someone take a look asap?

tommcdon commented 2 years ago

@agocke taking a look

BruceForstall commented 2 years ago

Here's where it started failing for the nightly scheduled jitstress run:

https://dev.azure.com/dnceng/public/_build/results?buildId=1644509&view=ms.vss-test-web.build-test-results-tab

hoyosjs commented 2 years ago

There's a few families of errors in here. Most of them were old and targeted to specific PR's

There's this:

Process terminated. Infinite recursion during resource lookup within System.Private.CoreLib.  This may be a bug in System.Private.CoreLib, or potentially in certain extensibility points such as assembly resolve events or CultureInfo names.  Resource name: Arg_NullReferenceException
   at System.Environment.FailFast(System.String)
   at System.SR.InternalGetResourceString(System.String)
   at System.SR.GetResourceString(System.String)
   at System.NullReferenceException..ctor()
   at System.Char.IsWhiteSpaceLatin1(Char)
   at System.Resources.ResourceManager.IsDefaultType(System.String, System.String)
   at System.Resources.ManifestBasedResourceGroveler.CanUseDefaultResourceClasses(System.String, System.String)
   at System.Resources.ManifestBasedResourceGroveler.CreateResourceSet(System.IO.Stream, System.Reflection.Assembly)
   at System.Resources.ManifestBasedResourceGroveler.GrovelForResourceSet(System.Globalization.CultureInfo, System.Collections.Generic.Dictionary`2<System.String,System.Resources.ResourceSet>, Boolean, Boolean)
   at System.Resources.ResourceManager.InternalGetResourceSet(System.Globalization.CultureInfo, Boolean, Boolean)
   at System.Resources.ResourceManager.GetString(System.String, System.Globalization.CultureInfo)
   at System.SR.InternalGetResourceString(System.String)
   at System.SR.GetResourceString(System.String)
   at System.NullReferenceException..ctor()
   at System.Char.IsWhiteSpaceLatin1(Char)
   at System.MemoryExtensions.Trim(System.ReadOnlySpan`1<Char>)
   at System.Guid.TryParseGuid(System.ReadOnlySpan`1<Char>, GuidResult ByRef)
   at System.Guid..ctor(System.String)
   at EventPipeTests.EventPipe..cctor()
   at EventPipeTests.EventPipe.Main(System.String[])

These type of simd asserts that have been reported elsewhere:

Assert failure(PID 12180 [0x00002f94], Thread: 12222 [0x2fbe]): Assertion failed 'IsBaselineSimdIsaSupportedDebugOnly()' in 'Microsoft.Diagnostics.Tracing.EventPipeEventSource:CheckForWellKnownEventFields(Microsoft.Diagnostics.Tracing.EventPipeEventMetaDataHeader):PayloadFetchClassInfo:this' during 'Morph - Inlining' (IL size 492; hash 0x78e35964; FullOpts)

All the ones that come from the profiler test itself are timeouts in OSX, regardless of jit modes.

BruceForstall commented 2 years ago

@hoyosjs Why did you remove the "blocking-outerloop" label? Unless this test has been fixed or disabled, it is, indeed, blocking clean outerloop test runs.

VincentBu commented 2 years ago

Failed again in: runtime-coreclr jitstress-isas-x86 20220312.1

Failed test:

CoreCLR OSX x64 Checked jitstress_isas_nohwintrinsic_nosimd @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

CoreCLR OSX x64 Checked jitstress_isas_x86_nosse @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

CoreCLR OSX x64 Checked jitstress_isas_nohwintrinsic @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

CoreCLR OSX x64 Checked jitstress_isas_x86_nolzcnt @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

CoreCLR OSX x64 Checked jitstress_isas_x86_nohwintrinsic @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

CoreCLR OSX x64 Checked jitstress_isas_x86_noaes @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

CoreCLR OSX x64 Checked jitstress_isas_x86_nofma @ OSX.1200.Amd64.Open

- profiler/eventpipe/eventpipe/eventpipe.sh

Error message:

cmdLine:/private/tmp/helix/working/B50109A6/w/B95809E4/e/profiler/eventpipe/eventpipe/eventpipe.sh Timed Out (timeout in milliseconds: 1800000 from variable __TestTimeout, start: 3/12/2022 1:34:43 PM, end: 3/12/2022 2:04:43 PM)

Return code:      -100
Raw output file:      /tmp/helix/working/B50109A6/w/B95809E4/uploads/Reports/profiler.eventpipe/eventpipe/eventpipe.output.txt
Raw output:
BEGIN EXECUTION
/tmp/helix/working/B50109A6/p/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false eventpipe.dll ''
Profiler path: /private/tmp/helix/working/B50109A6/w/B95809E4/e/profiler/eventpipe/eventpipe/libProfiler.dylib
Profiler.dll!DllGetClassObject
Profiler.dll!Profiler::Initialize
This method being jitted should trigger events firing...

cmdLine:/private/tmp/helix/working/B50109A6/w/B95809E4/e/profiler/eventpipe/eventpipe/eventpipe.sh Timed Out (timeout in milliseconds: 1800000 from variable __TestTimeout, start: 3/12/2022 1:34:43 PM, end: 3/12/2022 2:04:43 PM)
Test Harness Exitcode is : -100
To run the test:

set CORE_ROOT=/tmp/helix/working/B50109A6/p
/private/tmp/helix/working/B50109A6/w/B95809E4/e/profiler/eventpipe/eventpipe/eventpipe.sh
Expected: True
Actual:   False

Stack trace
   at profiler_eventpipe._eventpipe_eventpipe_._eventpipe_eventpipe_sh()
fanyang-mono commented 2 years ago

Should the test be disabled for now? I saw this being hit on rolling build quite often.

BruceForstall commented 2 years ago

Note: this test was disabled in #66606. Re-enable when fixing/closing this issue.

tommcdon commented 2 years ago

@davmason PTAL