Necessary permissions in docker using a rootless user for dotnet-* tools #2102

DOMZE opened 3 years ago

DOMZE commented 3 years ago


I'm using dotnet-dump and dotnet-trace inside a Docker container (engine running in WSL2) with a rootless user.

dotnet-dump

In my Dockerfile, I added the CAP_SYS_PTRACE capability to createdump using setcap CAP_SYS_PTRACE=+eip $(find /usr/share -name createdump). I also start my container with --cap-add=SYS_PTRACE.

Once I did that, I was able to get past the error Core dump generation FAILED 0x80004005 due to PTrace(ATTACH, 1) FAILED Operation not permitted

However now I'm getting the error /usr/share/dotnet/shared/Microsoft.NETCore.App/5.0.4/createdump: error while loading shared libraries: cannot open shared object file: No such file or directory

Recursively chowning the directory /usr/share/dotnet/shared/Microsoft.NETCore.App to a group that my user belongs to also has no effect.

Is root absolutely necessary to create dumps? If so, is running a sidecar container my only option for generating dumps when my main application container runs under a rootless user?
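One way to see whether the capability actually reached the process is to decode the effective capability mask that the kernel exposes in /proc. The following is a minimal sketch, not part of the original report: the helper names has_cap and effective_caps are hypothetical, and it assumes a Linux /proc filesystem. CAP_SYS_PTRACE is capability number 19 in the kernel headers.

```shell
#!/bin/sh
# CAP_SYS_PTRACE is capability number 19 in <linux/capability.h>.
CAP_SYS_PTRACE_BIT=19

# has_cap HEXMASK BIT (hypothetical helper): succeeds if bit BIT is set in HEXMASK.
has_cap() {
    mask=$((0x$1))
    [ $(( (mask >> $2) & 1 )) -eq 1 ]
}

# effective_caps PID (hypothetical helper): prints the CapEff mask of a process.
effective_caps() {
    awk '/^CapEff:/ { print $2 }' "/proc/$1/status"
}

# Report on the current shell, but only where /proc is available.
if [ -r "/proc/$$/status" ]; then
    if has_cap "$(effective_caps $$)" "$CAP_SYS_PTRACE_BIT"; then
        echo "CAP_SYS_PTRACE is effective for this shell"
    else
        echo "CAP_SYS_PTRACE is NOT effective for this shell"
    fi
fi
```

Run inside the container as the rootless user (for example through docker exec), this shows whether --cap-add=SYS_PTRACE translated into an effective capability for that user's processes.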

dotnet-trace

Using the same rootless user, if I try to create a trace, I get the following error:

No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

[ERROR] System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Diagnostics.Tools.Trace.CollectCommandHandler.Collect(CancellationToken ct, IConsole console, Int32 processId, FileInfo output, UInt32 buffersize, String providers, String profile, TraceFileFormat format, TimeSpan duration, String clrevents, String clreventlevel, String name, String diagnosticPort) in /_/src/Tools/dotnet-trace/CommandLine/Commands/CollectCommand.cs:line 163

What permissions are necessary to capture a trace using a rootless user?
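For context, dotnet-trace talks to the runtime over the diagnostic IPC channel, which on Linux is a Unix domain socket named dotnet-diagnostic-<pid>-<key>-socket in the temp directory, and the tool is documented as needing to run as the same user as the target process (or as root). The sketch below only lists those sockets so their ownership can be compared with the current user. It does not explain the NullReferenceException above, and the helper name diag_sockets is hypothetical.

```shell
#!/bin/sh
# diag_sockets PID [DIR] (hypothetical helper): list the diagnostic IPC
# socket paths that belong to PID. DIR defaults to $TMPDIR, then /tmp.
diag_sockets() {
    dir=${2:-${TMPDIR:-/tmp}}
    find "$dir" -maxdepth 1 -name "dotnet-diagnostic-$1-*-socket" 2>/dev/null
}

# Show owner and mode of each socket found for PID 1, then the current user.
for s in $(diag_sockets 1); do
    ls -l "$s"
done
id
```

If no socket is listed for the target PID, or the socket is owned by a different user than the one running dotnet-trace, the tool has nothing it can connect to.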

Thank you!

mikem8361 commented 3 years ago

dotnet-dump should only need the SYS_PTRACE capability (--cap-add=SYS_PTRACE should be enough). I'm not sure what your setcap command does exactly. Can you send the output of ls -l (which should display the owner of each file) in /usr/share/dotnet/shared/Microsoft.NETCore.App/5.0.4/? For some reason, the loader can't find a shared module referenced by createdump.

DOMZE commented 3 years ago

The setcap command with CAP_SYS_PTRACE adds the capability for a process to trace arbitrary processes using ptrace. If I don't run setcap, I immediately get Core dump generation FAILED 0x80004005 due to PTrace(ATTACH, 1) FAILED Operation not permitted

Running either of these two commands fails with

/usr/share/dotnet/shared/Microsoft.NETCore.App/5.0.4/createdump: error while loading shared libraries: cannot open shared object file: No such file or directory

chown -R root:testuser /usr/share/dotnet/shared
chown -R testuser:testuser /usr/share/dotnet/shared

The output below is the output of the last command

Note: If I run the container as root (no USER directive), the dotnet-dump command works successfully (and I don't need setcap, since I'm root).

For information, I'm using image as final build stage image

mikem8361 commented 3 years ago

As far as I know, all you should need is --cap-add=SYS_PTRACE (or --privileged) when starting the docker container. You shouldn't need the setcap or to chown /usr/share/dotnet/shared to your testuser. I'm not an expert on docker, and maybe the 5.0 image is somehow different from what we are using for testing in the diagnostics repo's CI builds.

/cc: @shirhatti

DOMZE commented 3 years ago

@mikem8361 you can easily reproduce the problem using the following:

In a command line:

mkdir DotnetDiagToolsBug && cd DotnetDiagToolsBug
dotnet new mvc
touch Dockerfile


#See to understand how Visual Studio uses this Dockerfile to build your images for faster debugging.

FROM AS base

FROM AS build
COPY ["DotnetDiagToolsBug.csproj", "."]
RUN dotnet restore "./DotnetDiagToolsBug.csproj"
COPY . .
WORKDIR "/src/."
RUN dotnet build "DotnetDiagToolsBug.csproj" -c Release -o /app/build

# dotnet tools are currently available as part of SDK so we need to create them in an sdk image
# and copy them to our final runtime image
FROM AS tools-install
RUN dotnet tool install --tool-path /dotnetcore-tools dotnet-trace
RUN dotnet tool install --tool-path /dotnetcore-tools dotnet-dump
RUN dotnet tool install --tool-path /dotnetcore-tools dotnet-gcdump
RUN dotnet tool install --tool-path /dotnetcore-tools dotnet-counters

FROM build AS publish
RUN dotnet publish "DotnetDiagToolsBug.csproj" -c Release -o /app/publish

# add the testuser system group
RUN groupadd --system --gid 10101 testuser
# add the testuser system user, without a password and without a login shell with the testuser group created before
RUN adduser --system --disabled-password --shell /sbin/nologin --home /testuser --uid 10101 --ingroup testuser testuser
# set owner for our source and built app to the created user and group.
RUN chown -R testuser:testuser /app/publish

FROM base AS final
ARG DEBIAN_FRONTEND=noninteractive

# update system and install necessary packages
RUN apt-get update && apt-get install -y \
  libcap2-bin \
  && rm -rf /var/lib/apt/lists/*

COPY --from=publish /app/publish .
COPY --from=publish /etc/group /etc/passwd /etc/
COPY --from=tools-install /dotnetcore-tools /opt/dotnetcore-tools

# set the privileges for dotnet to run on privileged ports
# flags: e=activated , p=permitted
RUN dotnetBinary=$(which dotnet) && setcap cap_net_bind_service=+ep $(readlink -f $dotnetBinary)
ENV PATH "$PATH:/opt/dotnetcore-tools"

USER testuser
ENTRYPOINT ["dotnet", "DotnetDiagToolsBug.dll"]

Build and run the image:

docker build -t dotnettoolsbug .
docker run -p 5000:80 --name dotnettoolsbug --cap-add=SYS_PTRACE dotnettoolsbug

in another window:

docker exec -it dotnettoolsbug bash

in the container:

cd /tmp && dotnet-dump collect --process-id 1


Writing full to /tmp/core_20210323_194228
Writing dump failed (HRESULT: 0x80004005)

In the app logs output:

Gathering state for process 1 dotnet
ptrace(ATTACH, 1) FAILED Operation not permitted

hoyosjs commented 3 years ago

The ptrace attach fails because no ambient capabilities are transferred to the other user. Once you set the capability manually at the file level, the attach works, but you then hit loading issues caused by other capability (CAP) problems. Also, the setuid calls used by docker seem to mess up the state of the dotnet process. dotnet-dump won't work as a sidecar, because the dump is still generated by the target process. I could not find a way to work around these issues.

The dotnet-trace bug is a bit suspicious, as I don't know how something could be null here. I'm not sure why things are null, but I can't attach a debugger.
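The capability explanation above can be checked from inside the container by comparing the capability sets the kernel reports for the target process. This is a minimal sketch, assuming a Linux /proc filesystem. The helper names cap_sets and ambient_is_empty are hypothetical. For a process started as a non-root user, a capability granted with --cap-add typically shows up in the bounding set (CapBnd) while the effective (CapEff) and ambient (CapAmb) sets stay empty.

```shell
#!/bin/sh
# cap_sets PID (hypothetical helper): print the capability masks the kernel
# reports for a process (inheritable, permitted, effective, bounding, ambient).
cap_sets() {
    grep -E '^Cap(Inh|Prm|Eff|Bnd|Amb):' "/proc/$1/status"
}

# ambient_is_empty STATUS_FILE (hypothetical helper): succeeds if the CapAmb
# mask found in STATUS_FILE is present and equal to zero.
ambient_is_empty() {
    amb=$(awk '/^CapAmb:/ { print $2 }' "$1")
    [ -n "$amb" ] && [ "$((0x$amb))" -eq 0 ]
}

# Inspect the given PID (default: 1, the container entrypoint).
pid=${1:-1}
if [ -r "/proc/$pid/status" ]; then
    cap_sets "$pid"
    if ambient_is_empty "/proc/$pid/status"; then
        echo "PID $pid has no ambient capabilities"
    fi
fi
```

Comparing the output for a container run as root with one run under the USER directive should make the difference in CapEff and CapAmb visible.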

DOMZE commented 3 years ago

@hoyosjs what is your recommendation then?

hoyosjs commented 3 years ago

Sorry, this got buried in notifications @DOMZE. Currently I have no good recommendations. I will be experimenting with user namespaces, at which point a lot of the security concerns from using containers will be mitigated. As for getting full de-escalation from things like yum/dnf/apt installations and port concerns, I don't have any clear guidance. Maybe I'll try to reach out to see what causes this behavior.

bluetentacle commented 3 years ago

We too have long been troubled by this issue. dotnet-dump is effectively useless to us as a production troubleshooting tool, because we cannot run services as root in production.