Closed: peter-glotfelty closed this issue 1 year ago.
Tagging subscribers to this area: @tommcdon. See info in area-owners.md if you want to be subscribed.
Author: peter-glotfelty
Assignees: -
Labels: `area-Diagnostics-coreclr`, `untriaged`
Milestone: -
Is there any particular user you're using for this? Also, what's the CRI implementation and OCI runtime? Is it containerd/runc? Under regular Docker it works fine, so my first hunch is some access issue - potentially because the user isn't PTRACE enabled or because the seccomp profile disallows it.
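If it's the seccomp profile, one quick way to test that hunch (just a sketch, and it loosens security, so only for a throwaway test pod) is to run the pod unconfined:
# Pod-level securityContext in a test pod spec (sketch only)
securityContext:
  seccompProfile:
    type: Unconfined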
Hello @hoyosjs, the app runs as the default user in the mcr image which I assume is root. We're using containerd + runc:
$ /usr/bin/containerd --version
containerd github.com/containerd/containerd 1.5.11+azure-1 3df54a852345ae127d1fa3092b95168e4a88e2f8
$ /usr/bin/runc --version
runc version 1.0.3
commit: f46b6ba2c9314cfc8caae24a32ec5fe9ef1059fe
spec: 1.0.2-dev
go: go1.16.12
libseccomp: 2.5.1
Did a little more investigating of this issue:
Adding PTRACE to the container spec doesn't change anything as best I can tell:
securityContext:
  capabilities:
    add:
    - SYS_PTRACE
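For context, that sits at the container level of the pod spec, roughly like the sketch below (container and image names here are placeholders, not our actual spec):
containers:
- name: <application>
  image: <registry>/<application>:<tag>
  securityContext:
    capabilities:
      add:
      - SYS_PTRACE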
I notice that if dotnet is not running in a shell, it looks like it runs createdump twice (we don't get either though 😔):
[createdump] Gathering state for process 1 dotnet
[createdump] Crashing thread 00000001 signal 00000006
[createdump] Writing full dump to file /watson/cores/<process-name>-1
[createdump] Written 121438208 bytes (29648 pages) to core file
[createdump] Dump successfully written
[createdump] Gathering state for process 1 dotnet
[createdump] Crashing thread 00000001 signal 0000000b
[createdump] Writing full dump to file /watson/cores/<process-name>-1
[createdump] Written 121438208 bytes (29648 pages) to core file
[createdump] Dump successfully written
However, if it crashes under a start.sh script, that sequence is only printed once.
[createdump] Gathering state for process 7 dotnet
[createdump] Crashing thread 00000007 signal 00000006
[createdump] Writing full dump to file /watson/cores/<process-name>-7
[createdump] Written 121442304 bytes (29649 pages) to core file
[createdump] Dump successfully written
/app/start.sh: line 19: 7 Aborted (core dumped) dotnet <process>.dll
This might be a logging issue and nothing else, but it seems like it's maybe worth mentioning.
Are you exporting COMPlus_DbgMiniDumpName? Is /watson/cores/<process-name>-1 the actual core dump file path? Or was it edited to remove the actual process name?
Not sure why createdump is being run twice for the same process.
FYI, the /app/start.sh: line 19: 7 Aborted (core dumped) dotnet <process>.dll message comes from system core dumps being enabled.
This issue has been marked needs-author-action and may be missing some important information.
We are setting COMPlus_DbgMiniDumpName in the pod spec, and yes, I did format the above lines to remove the actual application names.
The full name includes the container and pod name as requested by the Azure Watson folks in this document:
/watson/cores/$(CONTAINER_NAME)_$(POD_NAME)()$(CONTAINER_NAME)-%d
Note: the empty () is intentional and suggested in the aforementioned document.
The reason I'm asking is that maybe the coredump isn't being written because of some invalid char in the name or path, but Linux has very few invalid file name chars, and an invalid name should just make the open fail. I didn't see where that doc recommended the () in the name. Could you try a simpler file name (no ())? You may want to use %e for the process id instead of %d (which is only supported for backwards compatibility).
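For example, a simpler variant of the existing setting might look like the sketch below (same directory, just without the (); the %d specifier is carried over from the current setup, so treat the exact format specifiers as an assumption to verify for your runtime version):
- name: COMPlus_DbgMiniDumpName
  value: "/watson/cores/$(CONTAINER_NAME)_$(POD_NAME)_$(CONTAINER_NAME)-%d"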
I did see this:
- name: COMPlus_DbgMiniDumpName
  value: "/cores/$(CONTAINER_POD_NAME_KEY)$(CONTAINER_NAME_SEPARATOR)<application name>-%d"
Oops, I didn't see that CONTAINER_NAME_SEPARATOR was defined as ().
We can definitely switch to %e as part of general housekeeping. I think we onboarded when most of our services were still on 3.1, and the guidance may have been different.
Just to check nothing weird was happening with the names, I changed () to __, and I still see the issue, so I don't think it's the naming convention.
If you are still on 3.1 (I assumed 6.0), then you should continue using %d.
Later .NET Core versions do have better diagnostic logging. The only thing I can come up with on this containerd/runc issue is that the createdump process doesn't have sufficient permissions to write the dump to the target directory, even though the dump open/writes don't fail. I'm grasping at straws here because I don't know this container stuff.
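One thing that might help rule that out is double-checking how /watson/cores gets into the container and whether the directory is writable from inside it. A sketch of what I'd expect the mount to look like (the volume name and hostPath below are assumptions on my part, not taken from your pod spec):
# container level (sketch)
volumeMounts:
- name: watson-cores
  mountPath: /watson/cores
# pod level (sketch)
volumes:
- name: watson-cores
  hostPath:
    path: /watson/cores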
We are on 6.0 now. (Sorry for the confusion)
I'm also not sure what else we can add. I added COMPlus_CreateDumpDiagnostics and created a few more dumps; I'll attach the logs below, but I don't think there's a smoking gun there either. Interesting-ish snippets:
// Broken Dumps
// .... lots of stuff before
[createdump] MODULE: 00007f283daa0000 dyn 0 inmem 0 file 0 pe 000056389bf21260 pdb 0000000000000000
[createdump] MODULE: timestamp bc25072e size 00058c00 869c333324504c78a0e7cd8cde34b6ac /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.7/System.Memory.dll
[createdump] MODULE: 00007f28b2f2d000 dyn 0 inmem 0 file 1 pe 000056389bf2e010 pdb 0000000000000000
[createdump] MODULE: timestamp 81f69e3c size 00008000 0ddc11c76170457a97cb32b070521b69 /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.7/System.Text.Encoding.Extensions.dll
[createdump] EnumerateManagedModules: Module enumeration FINISHED
[createdump] Unwind: thread 0001
[createdump] GetMemoryRegionFlags: FAILED
[createdump] Unwind: managed frames
[createdump] Unwind: found managed exception
[createdump] Unwind: exception object 0x7f28140096a8 exception hresult 80131500
[createdump] Unwind: exception type System.Exception
[createdump] GetMemoryRegionFlags: FAILED
[createdump] Unwind: thread 0008
[createdump] Unwind: thread 0009
[createdump] Unwind: thread 000a
[createdump] Unwind: thread 000b
[createdump] Unwind: thread 000c
[createdump] Unwind: managed frames
[createdump] Unwind: thread 000e
[createdump] Unwind: managed frames
[createdump] CombineMemoryRegions: STARTED
[createdump] CombineMemoryRegions: FINISHED
[createdump] Writing full dump to file /watson/cores/telemetry-test-driver_telemetry-test-driver-599b8d654b-g8fjh()telemetry-test-driver-1
[createdump] Writing memory region headers to core file
[createdump] Writing process information to core file
[createdump] Writing 20 auxv entries to core file
[createdump] Writing 141 NT_FILE entries to core file
[createdump] Writing 7 thread entries to core file
[createdump] Writing 169 memory regions to core file
[createdump] Written 121458688 bytes (29653 pages) to core file
[createdump] Dump successfully written
[createdump] Gathering state for process 1 dotnet // <---- Goes immediately into taking a second dump.
[createdump] Crashing thread 00000001 signal 0000000b
[createdump] Thread 0001 RIP 00007f28b6f12207 RSP 00007f28b73654f0
[createdump] Thread 0008 RIP 00007f28b6f3a3ff RSP 00007f28b6637da0
// .... lots more stuff
[createdump] MODULE: 00007f283daa0000 dyn 0 inmem 0 file 0 pe 000056389bf21260 pdb 0000000000000000
[createdump] MODULE: timestamp bc25072e size 00058c00 869c333324504c78a0e7cd8cde34b6ac /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.7/System.Memory.dll
[createdump] MODULE: 00007f28b2f2d000 dyn 0 inmem 0 file 1 pe 000056389bf2e010 pdb 0000000000000000
[createdump] MODULE: timestamp 81f69e3c size 00008000 0ddc11c76170457a97cb32b070521b69 /usr/share/dotnet/shared/Microsoft.NETCore.App/6.0.7/System.Text.Encoding.Extensions.dll
[createdump] EnumerateManagedModules: Module enumeration FINISHED
[createdump] Unwind: thread 0001
[createdump] Unwind: managed frames // <--- Skips "GetMemoryRegionFlags" the second time
[createdump] Unwind: found managed exception
[createdump] Unwind: exception object 0x7f28140096a8 exception hresult 80131500
[createdump] Unwind: exception type System.Exception
[createdump] Unwind: thread 0008
[createdump] Unwind: thread 0009
[createdump] Unwind: thread 000a
[createdump] Unwind: thread 000b
[createdump] Unwind: thread 000c
[createdump] Unwind: managed frames
[createdump] Unwind: thread 000e
[createdump] Unwind: managed frames
[createdump] CombineMemoryRegions: STARTED
[createdump] CombineMemoryRegions: FINISHED
[createdump] Writing full dump to file /watson/cores/telemetry-test-driver_telemetry-test-driver-599b8d654b-g8fjh()telemetry-test-driver-1
[createdump] Writing memory region headers to core file
[createdump] Writing process information to core file
[createdump] Writing 20 auxv entries to core file
[createdump] Writing 141 NT_FILE entries to core file
[createdump] Writing 7 thread entries to core file
[createdump] Writing 169 memory regions to core file
[createdump] Written 121458688 bytes (29653 pages) to core file
[createdump] Dump successfully written
createdumplogs-without-errors.txt createdumplogs-with-errors.txt
Hi @peter-glotfelty sorry for the delay on this issue. It seems there are two separate issues here:
- No dump is generated by createdump (presumably this is because the right env variables are not set?)
- When the container is running with a shell, createdump is executed twice. As far as we know there is no code in the runtime that could explain this behavior. Would it be possible to share the start.sh script?
Not quite, I only see this behavior when the container is running without a shell. Basically, it's another symptom we see in addition to your first bullet point so I suspect they are related.
Our startup script is pretty simple:
#!/bin/bash
_term() {
  kill "$child"
}
# We need to make sure that when k8s terminates the pod, we stop
# the child process
trap _term TERM
# Original Entrypoint.
dotnet TelemetryTestService.dll &
child="$!"
wait "$!"
@peter-glotfelty
I tried reproing this under AKS on an Ubuntu node with the following settings:
Container recipe:
FROM mcr.microsoft.com/dotnet/runtime:6.0-bullseye-slim
ARG source=./
WORKDIR /app
# This is just the result of `dotnet publish`
COPY $source .
ENTRYPOINT ["dotnet", "MyApplication.dll"]
Container spec:
containers:
- name: consoleapp
  image: hoyosjs.azurecr.io/dotnetdump/container-shell:entry
  env:
  - name: COMPlus_DbgEnableMiniDump
    value: "1"
And I get:
juhoyosa@TARDIS-DEV::publish> kubectl logs consoleapp-deployment-5b685b47cf-bqd26
Unhandled exception. System.Exception: Crashed on startup
at MyApplication.Program.Main(String[] args) in /home/mikem/builds/dockertest/Program.cs:line 10
[createdump] Gathering state for process 1 dotnet
[createdump] Crashing thread 00000001 signal 00000006
[createdump] Writing minidump with heap to file /tmp/coredump.1
[createdump] Written 62087168 bytes (15158 pages) to core file
[createdump] Target process is alive
[createdump] Dump successfully written
That's with a dotnet entrypoint, so something seems to be different. I couldn't use Mariner since I'd have to use a DaemonSet to work around https://github.com/dotnet/diagnostics/issues/3423. The AKS team is working on deploying the containerd fix now. Do you think it could be related to that?
No longer repro.
Description
We're an MSFT internal team using dotnet on Linux in AKS. We're onboarded to Azure Watson, and some of our teams are looking to migrate to distroless containers for our applications. We have noticed that collecting core dumps on crashes does not work correctly unless we start dotnet inside a shell script: running dotnet directly as the container entrypoint doesn't generate the correct core dumps, however starting it via start.sh does, where start.sh is a small wrapper that forwards signals and calls dotnet itself. We're expecting 2 core dumps to be taken.
However, if we don't run with a shell, we typically see 0 dumps when the app crashes (we do occasionally see 1 dump, which is weird).
Reproduction Steps
The app code seems to be irrelevant; anything that crashes the process seems to work:
Then a Dockerfile like this one:
We then deploy it to AKS in a pod with COMPlus_DbgEnableMiniDump=1 and COMPlus_DbgMiniDumpType=4, and the azure-watson agent running on the node.
Expected behavior
createdump takes a dump of the managed heap and the watson agent finds it and uploads it to the portal.
Actual behavior
Generally, no dump appears. Occasionally, a dump will show up without symbols. One thing that is a little notable is that these dumps are typically in a SIGSEGV bucket, whereas dumps that come from containers running from a shell are almost always in a SIGABRT bucket.
Regression?
n/a
Known Workarounds
As mentioned above, the issue only comes up if we are running dotnet as the entrypoint in our container. If the entrypoint is a bash script that starts dotnet, everything works as expected, and our core dumps are properly taken and uploaded. We haven't tried other shells or init executables.
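For anyone wanting to try the wrapper workaround without rebuilding the image, a hedged sketch of the same idea expressed as a pod-spec command override (names and paths are illustrative; in practice we bake start.sh into the image as the ENTRYPOINT):
containers:
- name: <application>
  image: <registry>/<application>:<tag>
  # Overrides the image ENTRYPOINT so dotnet runs as a child of bash instead of as PID 1
  command: ["/bin/bash", "/app/start.sh"]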
Configuration
We're using mcr.microsoft.com/dotnet/runtime:6.0-bullseye-slim and I believe we've seen this with the aspnet version as well.
Host OS: Ubuntu 18.04.6 LTS
Kernel version: 5.4.0-1078-azure
Arch: x64
Other information
No response