dotnet / diagnostics

This repository contains the source code for various .NET Core runtime diagnostic tools and documents.

SOS DumpLog crashing lldb #1837

Open jkotas opened 3 years ago

jkotas commented 3 years ago

Repro:

Run dumplog on a crashdump from https://github.com/dotnet/runtime/issues/46100#issue-768267242

Result:

/repro/helix/shared/Microsoft.NETCore.App/6.0.0 $ lldb -c core.1000.22 ./dotnet
Added Microsoft public symbol server
(lldb) target create "./dotnet" --core "core.1000.22"
Core file '/repro/helix/shared/Microsoft.NETCore.App/6.0.0/core.1000.22' (x86_64) was loaded.
(lldb) dumplog log.txt
Stack dump:
0.      Program arguments: lldb -c core.1000.22 ./dotnet
1.      HandleCommand(command = "dumplog log.txt")
Segmentation fault

In case the helix blob store gets reclaimed, I have saved the dump and binaries at \\jkotas9\drops\dumplogcrash

jkotas commented 3 years ago

It is crashing here:

* thread #1, name = 'lldb', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
  * frame #0: 0x00007fe56420e1ad
    frame #1: 0x00007fe56414fd89
    frame #2: 0x00007fe65915eb4a libsos.so`IsMethodDesc(unsigned long) + 122
    frame #3: 0x00007fe65912a6b8 libsos.so`formatOutput(IDebugDataSpaces*, _IO_FILE*, char*, unsigned int, double, unsigned long, void**) + 1784
    frame #4: 0x00007fe65912b2f4 libsos.so`StressLog::Dump(unsigned long, char const*, IDebugDataSpaces*) + 2404
    frame #5: 0x00007fe6591446d9 libsos.so`DumpLog + 585

mikem8361 commented 3 years ago

@jkotas I can't get to the dump. I need libsos.so symbols to diagnose this problem.

jkotas commented 3 years ago

Can you try now? \\jkotas9\drops\dumplogcrash\core.1000.22

mikem8361 commented 3 years ago

This may take a while. I'll need to repro it on an Alpine (Linux musl) docker image. Off the top of my head, it looks like a fault in the DAC, since calling into the DAC is all IsMethodDesc does.

jkotas commented 3 years ago

Here are the steps to repro if it helps (you may need to modify c:\repro to where you downloaded the payload):

docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined  -v c:\repro:/repro mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.12-helix-20200602002622-e06dc59

sudo apk add lldb-dev
curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin
export PATH=$PATH:/home/helixbot/.dotnet:/home/helixbot/.dotnet/tools
export DOTNET_ROOT=/home/helixbot/.dotnet
dotnet tool install --global dotnet-sos
dotnet-sos install

cd /repro
lldb dotnet -c core.1000.22

dumplog log.txt

You can use docker container list + docker exec -it <CONTAINER ID> /bin/bash to create a second console, so that you can attach lldb to the crashing lldb.

mikem8361 commented 3 years ago

I repro'ed your issue and have more details but not a solution yet:

* thread #1, name = 'lldb', stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
  * frame #0: 0x00007f22c64401ad libmscordaccore.so`LCGMethodResolver::GetManagedResolver() [inlined] __TPtrBase::GetAddr(this=0x0000000000000000) const at daccess.h:884:16
    frame #1: 0x00007f22c64401ad libmscordaccore.so`LCGMethodResolver::GetManagedResolver() [inlined] __DPtr<Object>::__DPtr(rhs=0x0000000000000000) at daccess.h:1160
    frame #2: 0x00007f22c64401ad libmscordaccore.so`LCGMethodResolver::GetManagedResolver() [inlined] ObjectFromHandle(handle=0) at gchandleutilities.h:48
    frame #3: 0x00007f22c644017e libmscordaccore.so`LCGMethodResolver::GetManagedResolver(this=<unavailable>) at dynamicmethod.cpp:1456
    frame #4: 0x00007f22c6381d89 libmscordaccore.so`ClrDataAccess::GetMethodDescData(this=<unavailable>, methodDesc=139965684439328, ip=0, methodDescData=0x00007ffd221e5950, cRevertedRejitVersions=<unavailable>, rgRevertedRejitData=0x0000000000000000, pcNeededRevertedRejitData=<unavailable>) at request.cpp:1066:50
    frame #5: 0x00007f2302840b4a libsos.so`IsMethodDesc(unsigned long) + 122
    frame #6: 0x00007f230280c6b8 libsos.so`formatOutput(IDebugDataSpaces*, _IO_FILE*, char*, unsigned int, double, unsigned long, void**) + 1784
    frame #7: 0x00007f230280d2f4 libsos.so`StressLog::Dump(unsigned long, char const*, IDebugDataSpaces*) + 2404
    frame #8: 0x00007f23028266d9 libsos.so`DumpLog + 585
    frame #9: 0x00007f2302c22c3d libsosplugin.so`sosCommand::DoExecute(lldb::SBDebugger, char**, lldb::SBCommandReturnObject&) + 509
    frame #10: 0x00007f23076c5fc7 liblldb.so.10`___lldb_unnamed_symbol1463$$liblldb.so.10 + 227
    frame #11: 0x00007f23079e0600 liblldb.so.10`___lldb_unnamed_symbol16011$$liblldb.so.10 + 404
    frame #12: 0x00007f23079da4bf liblldb.so.10`___lldb_unnamed_symbol15768$$liblldb.so.10 + 2103
    frame #13: 0x00007f23079dcd6c liblldb.so.10`___lldb_unnamed_symbol15803$$liblldb.so.10 + 284
    frame #14: 0x00007f230796714e liblldb.so.10`___lldb_unnamed_symbol13067$$liblldb.so.10 + 334
    frame #15: 0x00007f2307952177 liblldb.so.10`___lldb_unnamed_symbol12663$$liblldb.so.10 + 81
    frame #16: 0x00007f23079dd67a liblldb.so.10`___lldb_unnamed_symbol15811$$liblldb.so.10 + 160
    frame #17: 0x00007f23076f4444 liblldb.so.10`lldb::SBDebugger::RunCommandInterpreter(bool, bool) + 256
    frame #18: 0x000055e653f07670 lldb`___lldb_unnamed_symbol16$$lldb + 2098
    frame #19: 0x000055e653f083f4 lldb`___lldb_unnamed_symbol25$$lldb + 1309

The fault happens in the DAC call to GetManagedResolver() when SOS validates the MethodDesc.

        // Set this above Dario since you know how to tell if dynamic
        if (methodDescData->bIsDynamic)
        {
            DynamicMethodDesc *pDynamicMethod = PTR_DynamicMethodDesc(TO_TADDR(methodDesc));
            if (pDynamicMethod)
            {
                LCGMethodResolver *pResolver = pDynamicMethod->GetLCGMethodResolver();
                if (pResolver)
                {
                    OBJECTREF value = pResolver->GetManagedResolver();    <<<<<<<<<<<<<<<<<<<<<<<
                    if (value)
                    {
                        FieldDesc *pField = (&g_CoreLib)->GetField(FIELD__DYNAMICRESOLVER__DYNAMIC_METHOD);
                        _ASSERTE(pField);
                        value = pField->GetRefValue(value);
                        if (value)
                        {
                            methodDescData->managedDynamicMethodObject = PTR_HOST_TO_TADDR(value);
                        }
                    }
                }
            }
        }

I don't know this code that well but I'll continue to investigate.
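
To make the failure mode concrete: the backtrace shows ObjectFromHandle(handle=0) being reached from LCGMethodResolver::GetManagedResolver(), i.e. a null handle is dereferenced. Below is a minimal, self-contained sketch of that pattern and of one possible guard. Everything in it (Object, ObjectHandle, Resolver, the Checked/Unchecked helper names) is a hypothetical stand-in, not the real CoreCLR or DAC types, and the thread does not confirm that this is how the issue was or should be fixed.

```cpp
// Hypothetical stand-ins for the runtime types; these are NOT the real
// CoreCLR definitions, only enough structure to show the null-handle problem.
struct Object {
    int id;
};

// Assumption: a handle is modeled as a pointer to a slot that holds an
// object reference.
using ObjectHandle = Object**;

// Mirrors the unguarded pattern from the backtrace: the handle is
// dereferenced unconditionally. Passing a null handle here is undefined
// behavior, which is what shows up as SIGSEGV at fault address 0x0.
inline Object* ObjectFromHandleUnchecked(ObjectHandle handle) {
    return *handle;
}

// Guarded variant: returns nullptr instead of dereferencing a null handle.
inline Object* ObjectFromHandleChecked(ObjectHandle handle) {
    return handle != nullptr ? *handle : nullptr;
}

// Hypothetical resolver whose managed-resolver handle may never have been set.
struct Resolver {
    ObjectHandle managedResolver = nullptr;

    Object* GetManagedResolver() const {
        // With the unchecked helper this would crash for an unset handle.
        return ObjectFromHandleChecked(managedResolver);
    }
};
```

With a guard like this, an unset handle yields a null result, which the caller quoted above would already tolerate because it checks `if (value)` before using the returned reference. Whether a null handle is a legitimate state for a dynamic method, or a sign of a bug elsewhere, is not something the thread settles.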