dotnet / diagnostics

This repository contains the source code for various .NET Core runtime diagnostic tools and documents.
MIT License
1.18k stars 354 forks source link

dotnet trace convert causes a crash on s390x #4506

Open omajid opened 1 year ago

omajid commented 1 year ago

Description

With .NET 8 RC1 on s390x, running dotnet trace convert causes a crash

Reproduction Steps

$ dotnet tool update -g dotnet-trace
...
Welcome to .NET 8.0!
---------------------
SDK Version: 8.0.100-rc.1.23455.1
...
$ export PATH=$HOME/.dotnet/tools:$PATH
$ dotnet-trace collect -o dotnettrace.nettrace -- dotnet new console --output dotnettrace
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile 
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile 

Launching: dotnet new console --output dotnettrace 
Process        : /usr/lib64/dotnet/dotnet
Output File    : dotnettrace.nettrace

Trace completed.
$ dotnet-trace convert dotnettrace.nettrace --format Speedscope
Writing:    dotnettrace.speedscope.json

=================================================================
    Native Crash Reporting
=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries 
used by your application.
=================================================================

=================================================================
    Native stacktrace:
=================================================================
    0x3ffa1e6ad54 - /usr/lib64/dotnet/shared/Microsoft.NETCore.App/8.0.0-rc.1.23419.4/libcoreclr.so : 
    0x3ffa1e18bf0 - /usr/lib64/dotnet/shared/Microsoft.NETCore.App/8.0.0-rc.1.23419.4/libcoreclr.so : 
    0x3ffa1d46872 - /usr/lib64/dotnet/shared/Microsoft.NETCore.App/8.0.0-rc.1.23419.4/libcoreclr.so : 
    0x3ffc6cd2490 - linux-vdso64.so.1 : __kernel_rt_sigreturn
    0x3ffa1bf0058 - Unknown

=================================================================
    External Debugger Dump:
=================================================================
(lldb) command source -s 0 '/tmp/mono-gdb-commands.37673'
Executing commands in '/tmp/mono-gdb-commands.37673'.
(lldb) process attach --pid 37673
Process 37673 stopped
* thread dotnet/runtime#1, name = 'dotnet-trace', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa23dc35a libc.so.6`wait4 + 106
libc.so.6`wait4:
->  0x3ffa23dc35a <+106>: lghi   %r3, -4096
    0x3ffa23dc35e <+110>: clgrjh %r2, %r3, 4396473500566   ; <+166>
    0x3ffa23dc364 <+116>: lr     %r11, %r2
    0x3ffa23dc366 <+118>: lgr    %r2, %r1
  thread dotnet/runtime#2, name = 'SGen worker', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa239ab26 libc.so.6`__GI___futex_abstimed_wait_cancelable64 + 198
libc.so.6`__GI___futex_abstimed_wait_cancelable64:
->  0x3ffa239ab26 <+198>: lgr    %r6, %r2
    0x3ffa239ab2a <+202>: lgr    %r2, %r1
    0x3ffa239ab2e <+206>: brasl  %r14, 4396473229440       ; __GI___pthread_disable_asynccancel
    0x3ffa239ab34 <+212>: j      4396473232082             ; <+114>
  thread dotnet/runtime#3, name = '.NET EventPipe', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa240a5a2 libc.so.6`__poll + 98
libc.so.6`__poll:
->  0x3ffa240a5a2 <+98>:  lghi   %r3, -4096
    0x3ffa240a5a6 <+102>: clgrjh %r2, %r3, 4396473689566   ; <+158>
    0x3ffa240a5ac <+108>: lr     %r11, %r2
    0x3ffa240a5ae <+110>: lgr    %r2, %r1
  thread dotnet/runtime#4, name = 'Finalizer', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa239ab26 libc.so.6`__GI___futex_abstimed_wait_cancelable64 + 198
libc.so.6`__GI___futex_abstimed_wait_cancelable64:
->  0x3ffa239ab26 <+198>: lgr    %r6, %r2
    0x3ffa239ab2a <+202>: lgr    %r2, %r1
    0x3ffa239ab2e <+206>: brasl  %r14, 4396473229440       ; __GI___pthread_disable_asynccancel
    0x3ffa239ab34 <+212>: j      4396473232082             ; <+114>
  thread dotnet/runtime#5, name = '.NET SigHandler', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa2405e24 libc.so.6`read + 100
libc.so.6`read:
->  0x3ffa2405e24 <+100>: lgr    %r11, %r2
    0x3ffa2405e28 <+104>: lghi   %r2, -4096
    0x3ffa2405e2c <+108>: clgrjh %r11, %r2, 4396473671266  ; <+162>
    0x3ffa2405e32 <+114>: lgr    %r2, %r1
Executable module set to "/root/.dotnet/tools/dotnet-trace".
Architecture set to: s390x-ibm-linux.
(lldb) thread list
Process 37673 stopped
* thread dotnet/runtime#1: tid = 37673, 0x000003ffa23dc35a libc.so.6`wait4 + 106, name = 'dotnet-trace', stop reason = signal SIGSTOP
  thread dotnet/runtime#2: tid = 37674, 0x000003ffa239ab26 libc.so.6`__GI___futex_abstimed_wait_cancelable64 + 198, name = 'SGen worker', stop reason = signal SIGSTOP
  thread dotnet/runtime#3: tid = 37675, 0x000003ffa240a5a2 libc.so.6`__poll + 98, name = '.NET EventPipe', stop reason = signal SIGSTOP
  thread dotnet/runtime#4: tid = 37676, 0x000003ffa239ab26 libc.so.6`__GI___futex_abstimed_wait_cancelable64 + 198, name = 'Finalizer', stop reason = signal SIGSTOP
  thread dotnet/runtime#5: tid = 37677, 0x000003ffa2405e24 libc.so.6`read + 100, name = '.NET SigHandler', stop reason = signal SIGSTOP
(lldb) thread backtrace all
* thread dotnet/runtime#1, name = 'dotnet-trace', stop reason = signal SIGSTOP
  * frame #0: 0x000003ffa23dc35a libc.so.6`wait4 + 106
    frame dotnet/runtime#1: 0x000003ffa1e6ae84 libcoreclr.so`mono_dump_native_crash_info + 388
    frame dotnet/runtime#2: 0x000003ffa1e18bf0 libcoreclr.so`mono_handle_native_crash + 320
    frame dotnet/runtime#3: 0x000003ffa1d46872 libcoreclr.so`mono_sigsegv_signal_handler_debug + 1058
    frame dotnet/runtime#4: 0x000003ffc6cd2490 [vdso]`__kernel_rt_sigreturn
  thread dotnet/runtime#2, name = 'SGen worker', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa239ab26 libc.so.6`__GI___futex_abstimed_wait_cancelable64 + 198
    frame dotnet/runtime#1: 0x000003ffa239d774 libc.so.6`pthread_cond_wait@@GLIBC_2.3.2 + 580
    frame dotnet/runtime#2: 0x000003ffa2107672 libcoreclr.so`thread_func + 514
    frame dotnet/runtime#3: 0x000003ffa239e2e6 libc.so.6`start_thread + 326
    frame dotnet/runtime#4: 0x000003ffa232fdbe libc.so.6`thread_start + 16
  thread dotnet/runtime#3, name = '.NET EventPipe', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa240a5a2 libc.so.6`__poll + 98
    frame dotnet/runtime#1: 0x000003ffa1f42e8a libcoreclr.so`ds_ipc_poll + 442
    frame dotnet/runtime#2: 0x000003ffa1f3ed74 libcoreclr.so`ds_ipc_stream_factory_get_next_available_stream + 740
    frame dotnet/runtime#3: 0x000003ffa1f3cfbc libcoreclr.so`server_thread + 188
    frame dotnet/runtime#4: 0x000003ffa1f41fd8 libcoreclr.so`ep_rt_thread_mono_start_func + 56
    frame dotnet/runtime#5: 0x000003ffa239e2e6 libc.so.6`start_thread + 326
    frame dotnet/runtime#6: 0x000003ffa232fdbe libc.so.6`thread_start + 16
  thread dotnet/runtime#4, name = 'Finalizer', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa239ab26 libc.so.6`__GI___futex_abstimed_wait_cancelable64 + 198
    frame dotnet/runtime#1: 0x000003ffa23a63d8 libc.so.6`do_futex_wait.constprop.0 + 40
    frame dotnet/runtime#2: 0x000003ffa23a6478 libc.so.6`__new_sem_wait_slow64.constprop.0 + 152
    frame dotnet/runtime#3: 0x000003ffa20815a2 libcoreclr.so`finalizer_thread + 418
    frame dotnet/runtime#4: 0x000003ffa204b212 libcoreclr.so`start_wrapper + 466
    frame dotnet/runtime#5: 0x000003ffa239e2e6 libc.so.6`start_thread + 326
    frame dotnet/runtime#6: 0x000003ffa232fdbe libc.so.6`thread_start + 16
  thread dotnet/runtime#5, name = '.NET SigHandler', stop reason = signal SIGSTOP
    frame #0: 0x000003ffa2405e24 libc.so.6`read + 100
    frame dotnet/runtime#1: 0x000003ff9f792148 libSystem.Native.so`SignalHandlerLoop + 104
    frame dotnet/runtime#2: 0x000003ffa239e2e6 libc.so.6`start_thread + 326
    frame dotnet/runtime#3: 0x000003ffa232fdbe libc.so.6`thread_start + 16
(lldb) detach
Process 37673 detached
(lldb) quit

=================================================================
    Basic Fault Address Reporting
=================================================================
Memory around native instruction pointer (0x3ffa1bf0058):0x3ffa1bf0048  00 00 15 20 c0 74 00 00 00 ed e3 80 f0 a0 00 04  ... .t..........
0x3ffa1bf0058  e3 20 80 00 00 14 e3 20 b0 00 00 50 e3 20 80 04  . ..... ...P. ..
0x3ffa1bf0068  00 14 c0 01 7f ff ff ff b9 e4 00 22 e3 20 b0 04  ...........". ..
0x3ffa1bf0078  00 50 e3 20 80 04 00 14 c0 01 80 00 00 00 b9 e4  .P. ............

=================================================================
    Managed Stacktrace:
=================================================================
      at Microsoft.Diagnostics.Tracing.EventPipeEventHeader:ReadFromFormatV4 <0x00058>
      at Microsoft.Diagnostics.Tracing.EventPipe.EventCache:ProcessEventBlock <0x00426>
      at Microsoft.Diagnostics.Tracing.EventPipeEventBlock:ReadBlockContents <0x000fe>
      at Microsoft.Diagnostics.Tracing.EventPipeBlock:FromStream <0x0016c>
      at FastSerialization.Deserializer:ReadObjectDefinition <0x00426>
      at FastSerialization.Deserializer:ReadObject <0x003e8>
      at Microsoft.Diagnostics.Tracing.EventPipeEventSource:Process <0x00102>
      at Microsoft.Diagnostics.Tracing.Etlx.TraceLog:CopyRawEvents <0x006fc>
      at <>c__DisplayClass120_0:<FastSerialization.IFastSerializable.ToStream>b__0 <0x0009c>
      at FastSerialization.DeferedRegion:Write <0x00072>
      at Microsoft.Diagnostics.Tracing.Etlx.TraceLog:FastSerialization.IFastSerializable.ToStream <0x002ca>
      at FastSerialization.Serializer:WriteObjectData <0x00080>
      at FastSerialization.Serializer:WriteObjectRef <0x0032c>
      at FastSerialization.Serializer:Write <0x00054>
      at FastSerialization.Serializer:.ctor <0x001f0>
      at FastSerialization.Serializer:.ctor <0x0008c>
      at Microsoft.Diagnostics.Tracing.Etlx.TraceLog:CreateFromEventPipeEventSources <0x00372>
      at Microsoft.Diagnostics.Tracing.Etlx.TraceLog:CreateFromEventPipeDataFile <0x000c2>
      at Microsoft.Diagnostics.Tools.Trace.TraceFileFormatConverter:Convert <0x000b8>
      at Microsoft.Diagnostics.Tools.Trace.TraceFileFormatConverter:ConvertToFormat <0x00172>
      at Microsoft.Diagnostics.Tools.Trace.ConvertCommandHandler:ConvertFile <0x004d4>
      at <Module>:runtime_invoke_int_object_object_int_object <0x000ea>
      at <unknown> <0xffffffff>
      at System.Reflection.RuntimeMethodInfo:InternalInvoke <0x0006e>
      at System.Reflection.MethodBaseInvoker:InterpretedInvoke_Method <0x000a8>
      at System.Reflection.MethodBaseInvoker:InvokeDirectByRefWithFewArgs <0x00254>
      at System.Reflection.MethodBaseInvoker:InvokeWithFewArgs <0x007f8>
      at System.Reflection.RuntimeMethodInfo:Invoke <0x00458>
      at System.Reflection.MethodBase:Invoke <0x0007a>
      at System.Delegate:DynamicInvokeImpl <0x00514>
      at System.MulticastDelegate:DynamicInvokeImpl <0x00060>
      at System.Delegate:DynamicInvoke <0x0005c>
      at <InvokeAsync>d__12:MoveNext <0x002fe>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at System.CommandLine.Invocation.ModelBindingCommandHandler:InvokeAsync <0x00112>
      at <<BuildInvocationChain>b__0>d:MoveNext <0x001c8>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at <>c__DisplayClass4_0:<BuildInvocationChain>b__0 <0x0011e>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseParseErrorReporting>b__21_0>d:MoveNext <0x00162>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at <>c:<UseParseErrorReporting>b__21_0 <0x00118>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseHelp>b__0>d:MoveNext <0x000da>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at <>c__DisplayClass16_0:<UseHelp>b__0 <0x00162>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseVersionOption>b__0>d:MoveNext <0x00164>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at <>c__DisplayClass25_0:<UseVersionOption>b__0 <0x00162>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseTypoCorrections>b__0>d:MoveNext <0x00218>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at <>c__DisplayClass23_0:<UseTypoCorrections>b__0 <0x00162>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseSuggestDirective>b__22_0>d:MoveNext <0x00232>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at <>c:<UseSuggestDirective>b__22_0 <0x00118>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseParseDirective>b__20_0>d:MoveNext <0x0016e>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at <>c:<UseParseDirective>b__20_0 <0x00118>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseDebugDirective>b__11_0>d:MoveNext <0x00418>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at <>c:<UseDebugDirective>b__11_0 <0x00118>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<RegisterWithDotnetSuggest>b__10_0>d:MoveNext <0x003fa>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at <>c:<RegisterWithDotnetSuggest>b__10_0 <0x00118>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <>c:<UseEnvironmentVariableDirective>b__12_0 <0x00386>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<UseExceptionHandler>b__0>d:MoveNext <0x000b2>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at <>c__DisplayClass14_0:<UseExceptionHandler>b__0 <0x00162>
      at <>c__DisplayClass4_2:<BuildInvocationChain>b__3 <0x0006c>
      at <<CancelOnProcessTermination>b__5_0>d:MoveNext <0x002ae>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at <>c:<CancelOnProcessTermination>b__5_0 <0x0011e>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <>c__DisplayClass4_1:<BuildInvocationChain>b__2 <0x001ac>
      at <InvokeAsync>d__2:MoveNext <0x00262>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at System.CommandLine.Invocation.InvocationPipeline:InvokeAsync <0x00118>
      at <InvokeAsync>d__0:MoveNext <0x000ba>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000f0>
      at System.CommandLine.Parsing.ParseResultExtensions:InvokeAsync <0x00112>
      at <InvokeAsync>d__3:MoveNext <0x000a4>
      at System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start <0x000fc>
      at System.CommandLine.Parsing.ParserExtensions:InvokeAsync <0x00162>
      at Microsoft.Diagnostics.Tools.Trace.Program:Main <0x00256>
      at Microsoft.Diagnostics.Tools.Trace.Program:<Main> <0x0004e>
      at <Module>:runtime_invoke_int_object <0x000da>
=================================================================

Expected behavior

Either a clear error message or tracing works

Actual behavior

Native crash with stacktrace

Regression?

No response

Known Workarounds

No response

Configuration

No response

Other information

No response

omajid commented 1 year ago

cc @uweigand @Vishwanatha-HD

omajid commented 1 year ago

The same commands work fine on ppc64le.

uweigand commented 1 year ago

@stefan-sf-ibm could this be yet more endian issues with EventPipe ? You looked into this before ...

stefan-sf-ibm commented 1 year ago

This is a known problem with the TraceEvent library which is still little-endian only. As for now, I suggest to convert the trace on a little-endian machine in order to further evaluate it.