microsoft / clrmd

Microsoft.Diagnostics.Runtime is a set of APIs for introspecting processes and dumps.
MIT License

ClrMD is not working with .NET 5 Single File Applications on Linux #868

Open ppekrol opened 3 years ago

ppekrol commented 3 years ago

OS: Ubuntu 20.04 (works fine on Windows 10)
ClrMD version: 1.x and 2.x
.NET: 5.0.0

Code (1.x):

using (var dataTarget = DataTarget.AttachToProcess(processId, attachTimeout, AttachFlag.Passive))
{
    var clrInfo = dataTarget.ClrVersions[0];
}

Exception:

System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
   at System.SZArrayHelper.get_Item[T](Int32 index)

Code (2.x):

using (var dataTarget = DataTarget.AttachToProcess(processId, suspend: false))
{
    var clrInfo = dataTarget.ClrVersions[0];
}

From a brief investigation, I think the issue is in how ClrMD detects the runtime. It enumerates the loaded modules and checks their file names to identify the runtime, so I wrote some code that prints the list of modules. Here are the results:

Windows:

ModuleName: Raven.Server.exe. ModuleFileName: C:\workspaces\HR\ravendb_4\artifacts\windows-x64\Server\Raven.Server.exe
ModuleName: coreclr.dll. ModuleFileName: C:\Users\ppekr\AppData\Local\Temp\.net\Raven.Server\g2qch1ov.omw\coreclr.dll
ModuleName: clrjit.dll. ModuleFileName: C:\Users\ppekr\AppData\Local\Temp\.net\Raven.Server\g2qch1ov.omw\clrjit.dll
ModuleName: icu.dll. ModuleFileName: C:\Windows\SYSTEM32\icu.dll
ModuleName: winrnr.dll. ModuleFileName: C:\Windows\System32\winrnr.dll
ModuleName: wshbth.dll. ModuleFileName: C:\Windows\system32\wshbth.dll
ModuleName: pnrpnsp.dll. ModuleFileName: C:\Windows\system32\pnrpnsp.dll
ModuleName: napinsp.dll. ModuleFileName: C:\Windows\system32\napinsp.dll
ModuleName: librvnpal.win.x64.dll. ModuleFileName: C:\Users\ppekr\AppData\Local\Temp\.net\Raven.Server\g2qch1ov.omw\librvnpal.win.x64.dll
ModuleName: wshunix.dll. ModuleFileName: C:\Windows\system32\wshunix.dll
ModuleName: NLAapi.dll. ModuleFileName: C:\Windows\system32\NLAapi.dll
ModuleName: apphelp.dll. ModuleFileName: C:\Windows\SYSTEM32\apphelp.dll
ModuleName: kernel.appcore.dll. ModuleFileName: C:\Windows\SYSTEM32\kernel.appcore.dll
ModuleName: ntmarta.dll. ModuleFileName: C:\Windows\SYSTEM32\ntmarta.dll
ModuleName: IPHLPAPI.DLL. ModuleFileName: C:\Windows\SYSTEM32\IPHLPAPI.DLL
ModuleName: DNSAPI.dll. ModuleFileName: C:\Windows\SYSTEM32\DNSAPI.dll
ModuleName: mswsock.dll. ModuleFileName: C:\Windows\System32\mswsock.dll
ModuleName: BCrypt.dll. ModuleFileName: C:\Windows\System32\BCrypt.dll
ModuleName: win32u.dll. ModuleFileName: C:\Windows\System32\win32u.dll
ModuleName: KERNELBASE.dll. ModuleFileName: C:\Windows\System32\KERNELBASE.dll
ModuleName: gdi32full.dll. ModuleFileName: C:\Windows\System32\gdi32full.dll
ModuleName: msvcp_win.dll. ModuleFileName: C:\Windows\System32\msvcp_win.dll
ModuleName: bcryptPrimitives.dll. ModuleFileName: C:\Windows\System32\bcryptPrimitives.dll
ModuleName: ucrtbase.dll. ModuleFileName: C:\Windows\System32\ucrtbase.dll
ModuleName: shcore.dll. ModuleFileName: C:\Windows\System32\shcore.dll
ModuleName: NSI.dll. ModuleFileName: C:\Windows\System32\NSI.dll
ModuleName: msvcrt.dll. ModuleFileName: C:\Windows\System32\msvcrt.dll
ModuleName: RPCRT4.dll. ModuleFileName: C:\Windows\System32\RPCRT4.dll
ModuleName: ole32.dll. ModuleFileName: C:\Windows\System32\ole32.dll
ModuleName: combase.dll. ModuleFileName: C:\Windows\System32\combase.dll
ModuleName: SHELL32.dll. ModuleFileName: C:\Windows\System32\SHELL32.dll
ModuleName: KERNEL32.DLL. ModuleFileName: C:\Windows\System32\KERNEL32.DLL
ModuleName: ADVAPI32.dll. ModuleFileName: C:\Windows\System32\ADVAPI32.dll
ModuleName: shlwapi.dll. ModuleFileName: C:\Windows\System32\shlwapi.dll
ModuleName: sechost.dll. ModuleFileName: C:\Windows\System32\sechost.dll
ModuleName: IMM32.DLL. ModuleFileName: C:\Windows\System32\IMM32.DLL
ModuleName: ws2_32.dll. ModuleFileName: C:\Windows\System32\ws2_32.dll
ModuleName: USER32.dll. ModuleFileName: C:\Windows\System32\USER32.dll
ModuleName: OLEAUT32.dll. ModuleFileName: C:\Windows\System32\OLEAUT32.dll
ModuleName: GDI32.dll. ModuleFileName: C:\Windows\System32\GDI32.dll
ModuleName: ntdll.dll. ModuleFileName: C:\Windows\SYSTEM32\ntdll.dll

As you can see, the coreclr.dll module has a path, and it is in the temp directory (the self-extract feature of single-file applications).

Linux:

ModuleName: Raven.Server. ModuleFileName: /home/ppekrol/Workspaces/ravendb/artifacts/linux-x64/ServerPackage/RavenDB/Server/Raven.Server
ModuleName: Raven.voron. ModuleFileName: /home/ppekrol/Workspaces/ravendb/artifacts/linux-x64/ServerPackage/RavenDB/Server/RavenData/System/Raven.voron
ModuleName: scratch.0000000000.buffers. ModuleFileName: /home/ppekrol/Workspaces/ravendb/artifacts/linux-x64/ServerPackage/RavenDB/Server/RavenData/System/Temp/scratch.0000000000.buffers
ModuleName: compression.0000000000.buffers. ModuleFileName: /home/ppekrol/Workspaces/ravendb/artifacts/linux-x64/ServerPackage/RavenDB/Server/RavenData/System/Temp/compression.0000000000.buffers
ModuleName: libnss_dns-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libnss_dns-2.31.so
ModuleName: libnss_mdns4_minimal.so.2. ModuleFileName: /usr/lib/x86_64-linux-gnu/libnss_mdns4_minimal.so.2
ModuleName: librvnpal.linux.x64.so. ModuleFileName: /var/tmp/.net/ppekrol/Raven.Server/nlzfandb.ume/librvnpal.linux.x64.so
ModuleName: libnss_files-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libnss_files-2.31.so
ModuleName: metaZones.res. ModuleFileName: /usr/share/zoneinfo-icu/44/le/metaZones.res
ModuleName: timezoneTypes.res. ModuleFileName: /usr/share/zoneinfo-icu/44/le/timezoneTypes.res
ModuleName: zoneinfo64.res. ModuleFileName: /usr/share/zoneinfo-icu/44/le/zoneinfo64.res
ModuleName: libicui18n.so.66.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libicui18n.so.66.1
ModuleName: libicudata.so.66.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libicudata.so.66.1
ModuleName: libicuuc.so.66.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libicuuc.so.66.1
ModuleName: libcrypto.so.1.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
ModuleName: libssl.so.1.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libssl.so.1.1
ModuleName: liburcu-bp.so.6.1.0. ModuleFileName: /usr/lib/x86_64-linux-gnu/liburcu-bp.so.6.1.0
ModuleName: liblttng-ust-tracepoint.so.0.0.0. ModuleFileName: /usr/lib/x86_64-linux-gnu/liblttng-ust-tracepoint.so.0.0.0
ModuleName: libresolv-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libresolv-2.31.so
ModuleName: libkeyutils.so.1.8. ModuleFileName: /usr/lib/x86_64-linux-gnu/libkeyutils.so.1.8
ModuleName: libkrb5support.so.0.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1
ModuleName: libcom_err.so.2.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libcom_err.so.2.1
ModuleName: libk5crypto.so.3.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1
ModuleName: libkrb5.so.3.3. ModuleFileName: /usr/lib/x86_64-linux-gnu/libkrb5.so.3.3
ModuleName: libc-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libc-2.31.so
ModuleName: libgcc_s.so.1. ModuleFileName: /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
ModuleName: libm-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libm-2.31.so
ModuleName: libstdc++.so.6.0.28. ModuleFileName: /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28
ModuleName: librt-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/librt-2.31.so
ModuleName: libgssapi_krb5.so.2.2. ModuleFileName: /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2.2
ModuleName: libz.so.1.2.11. ModuleFileName: /usr/lib/x86_64-linux-gnu/libz.so.1.2.11
ModuleName: libdl-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libdl-2.31.so
ModuleName: libpthread-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
ModuleName: ld-2.31.so. ModuleFileName: /usr/lib/x86_64-linux-gnu/ld-2.31.so

There is no libcoreclr.so on that list, and it is essential for detecting the runtime. Any ideas on how I can work around this? Can it be fixed (preferably in 1.x)?
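To make the failure mode concrete, here is a minimal sketch of name-based runtime detection. It is a simplified illustration, not ClrMD's actual implementation: the `RuntimeModuleProbe` class, the `FindRuntimeModule` method, and the exact list of file names are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified illustration only; this is NOT ClrMD's actual detection code.
public static class RuntimeModuleProbe
{
    // File names under which CoreCLR normally appears as a separate module
    // (assumed list for this sketch).
    private static readonly string[] KnownRuntimeModules =
    {
        "coreclr.dll",      // Windows
        "libcoreclr.so",    // Linux
        "libcoreclr.dylib", // macOS
    };

    // Returns the first path whose file name matches a known runtime module,
    // or null when no module matches.
    public static string? FindRuntimeModule(IEnumerable<string> modulePaths)
    {
        return modulePaths.FirstOrDefault(path =>
            KnownRuntimeModules.Contains(GetFileName(path), StringComparer.OrdinalIgnoreCase));
    }

    // Splits on both '/' and '\' so Windows paths are handled on any OS.
    private static string GetFileName(string path)
    {
        int lastSeparator = path.LastIndexOfAny(new[] { '/', '\\' });
        return lastSeparator < 0 ? path : path.Substring(lastSeparator + 1);
    }
}
```

Given the Windows module list above, `FindRuntimeModule` would return the coreclr.dll path under Temp. Given the Linux list it would return null, because no entry is named libcoreclr.so. With detection of this kind, a null result corresponds to an empty ClrVersions collection, which is why indexing `ClrVersions[0]` throws.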

Interestingly, my custom native library (which was embedded) is on that list: librvnpal.linux.x64.so. I am not sure whether the runtime can be detected by path. We stumbled upon a similar issue with assembly locations and metadata reference creation when we moved to a single-file application, and we ended up with the following code: https://github.com/ravendb/ravendb/blob/4da999402a54591572e6ed74ce0fb1791f90bc83/src/Raven.Server/Documents/Indexes/Static/IndexCompiler.cs#L118-L124

Previously we were doing that directly via the file path: https://github.com/ravendb/ravendb/blob/0c1f169b255f9e5ab5e3fa8fc6f90c4dafee7d69/src/Raven.Server/Documents/Indexes/Static/IndexCompiler.cs#L100 (which is string.Empty in single-file applications).

The app that I want to attach to is compiled using the following command:

dotnet publish -c Release -r linux-x64 /p:PublishSingleFile=true /p:IncludeNativeLibrariesForSelfExtract=true

I tried setting IncludeNativeLibrariesForSelfExtract to false, but it only affects my libraries, not the framework ones, so I do not think it matters here.

Many thanks in advance for help!

nxtn commented 3 years ago

FYI CoreCLR is statically linked into single-file apps on Unix platforms.

https://github.com/dotnet/designs/blob/main/accepted/2020/single-file/design.md

ppekrol commented 3 years ago

Interesting. That would explain a lot. @NextTurn do you know if there is a workaround for that problem that we could apply in the meantime?

leculver commented 3 years ago

Single file applications are not something I plan to support in the near future.

The primary concern is ongoing maintenance costs of keeping it working vs how many people are currently using the feature. If more people start using it (and asking for support in ClrMD) I may reconsider.

mikem8361 commented 3 years ago

dotnet-dump/SOS will need it over the next few months. I may have some time to implement it over the holidays.

leculver commented 3 years ago

That works too. =)

leculver commented 3 years ago

@mikem8361 To be clear, the number 1 thing that needs to happen for single file is adding tests to ClrMD so it doesn't regress. I don't expect the actual implementation to be too difficult. The challenge is building a sensible single-file repro program that the current tests on Linux can build. If that problem is solved, I'm not too worried about maintaining it.

ppekrol commented 3 years ago

I've tried bypassing the issue by loading libcoreclr.so using:

// Requires: using System; using System.Runtime.InteropServices;
[DllImport("libdl.so")]
private static extern IntPtr dlopen(string filename, int flags);

I see that the module is now detected properly after attaching, but unfortunately it fails during runtime creation with:

Microsoft.Diagnostics.Runtime.ClrDiagnosticsException: Failure loading DAC: CreateDacInstance failed 0x80131c4e
   at Microsoft.Diagnostics.Runtime.DacLibrary..ctor(DataTarget dataTarget, String dacDll)
   at Microsoft.Diagnostics.Runtime.DataTarget.ConstructRuntime(ClrInfo clrInfo, String dac)
   at Microsoft.Diagnostics.Runtime.DataTarget.CreateRuntime(ClrInfo clrInfo, String dacFilename, Boolean ignoreMismatch)
   at Microsoft.Diagnostics.Runtime.ClrInfo.CreateRuntime(String dacFilename, Boolean ignoreMismatch)

So I do not think this is the way to go with a workaround (or I'm missing something).

Anyway, @leculver, even the simplest .NET 5 console app fails after publishing with dotnet publish -c Release -r linux-x64 /p:PublishSingleFile=true:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
  </PropertyGroup>

</Project>

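using System;
using System.Diagnostics;

public static class Program
{
    public static void Main(string[] args)
    {
        // Print the process ID so another process can attach to it.
        using (var process = Process.GetCurrentProcess())
        {
            Console.WriteLine($"Hello World: {process.Id}");
            Console.WriteLine("Press Enter to close...");
        }

        // Keep the process alive until Enter is pressed.
        Console.ReadLine();
    }
}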

and attaching to it using:

using (var dataTarget = DataTarget.AttachToProcess(processId, attachTimeout, AttachFlag.Passive))
{
    var clrInfo = dataTarget.ClrVersions[0];
}

Hope that it helps.

AndreGleichner commented 2 years ago

It now also fails with .NET 6 on Windows. Single-file apps in .NET 6 are statically linked to the CLR on Windows too.

ppekrol commented 2 years ago

Hi @leculver

Any plans to support this?

leculver commented 2 years ago

AFAIK this is actually due to an issue with create-dump, not with ClrMD. See this issue: https://github.com/dotnet/diagnostics/issues/3065.

When I last debugged this, there was not enough data being placed into the crash dump for any debugger to make sense of single-file dumps (ClrMD included). I'm happy to take a closer look if you find that's not the case.

leculver commented 2 years ago

I will double-check the Linux part of this issue. The linked bug is only for the Windows side of things.

ayende commented 2 years ago

FWIW - we don't care that much about looking at a dump file. We want to be able to attach to a running process as the primary concern here.

ppekrol commented 1 year ago

> I will double-check the Linux part of this issue. The linked bug is only for the Windows side of things.

Hi @leculver

Have you managed to take a peek? Is there a difference between reading a dump file and attaching to an existing process? As @ayende mentioned, we are more interested in attaching; we use this to display the stacks of the process.