grafana / pyroscope

Continuous Profiling Platform. Debug performance issues down to a single line of code
https://grafana.com/oss/pyroscope/
GNU Affero General Public License v3.0
9.87k stars 585 forks source link

.NET 6 profiling crash process #888

Closed ArtemTrofimushkin closed 1 year ago

ArtemTrofimushkin commented 2 years ago

Hi! Thanks for great project! We tries to use pyroscope for continuous profiling for our .net apps, running in K8S. Pyroscope used as side car container, for application container, both containers shares /tmp directory with .net sockets. Everything works fine for .net core 3.1, .net 5.0, but when we started to migrate on .net 6, we observed that application container crashes every ~ 5 mins (approx)

Here is example pod manifest to reproduce this behaviour:

apiVersion: v1
kind: Pod
metadata:
  name: pyroscope-reproduce-net-6-faillure
spec:
  containers:
  - env:
    - name: COMPlus_DbgEnableMiniDump
      value: "1"
    - name: COMPlus_CreateDumpDiagnostics
      value: "1"
    image: mcr.microsoft.com/dotnet/samples:aspnetapp
    livenessProbe:
      httpGet:
        path: /
        port: http
      initialDelaySeconds: 5
    name: web
    ports:
    - containerPort: 80
      name: http
    resources:
      limits:
        cpu: "1"
        memory: 1G
      requests:
        cpu: 100m
        memory: 500M
    terminationMessagePolicy: FallbackToLogsOnError
    volumeMounts:
    - mountPath: /tmp
      name: dotnet-profiling
  - command:
    - /bin/sh
    - -c
    - sleep 5s; pyroscope connect --pid 1 --tag namespace=$NAMESPACE
    env:
    - name: PYROSCOPE_SERVER_ADDRESS
      value: http://pyroscope.infra-pyroscope.svc.cluster.local:4040
    - name: PYROSCOPE_SPY_NAME
      value: dotnetspy
    - name: PYROSCOPE_LOG_LEVEL
      value: debug
    - name: PYROSCOPE_APPLICATION_NAME
      value: dodo.pyro-test.web
    - name: PYROSCOPE_SAMPLE_RATE
      value: "100"
    image: pyroscope/pyroscope:0.10.2
    name: pyroscope
    resources:
      limits:
        cpu: 200m
        memory: 256Mi
      requests:
        cpu: 100m
        memory: 128Mi
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
      runAsUser: 0
    terminationMessagePolicy: FallbackToLogsOnError
    volumeMounts:
    - mountPath: /tmp
      name: dotnet-profiling
  nodeSelector:
    kubernetes.io/os: linux
  serviceAccountName: default
  volumes:
  - name: dotnet-profiling

Pod uses aspnet core sample application, taken from here. I've added COMPlus_DbgEnableMiniDump environment variables, that instructs runtime to take core dump on failure (link). When application container crashed, memory dump written to /tmp folder.

kolesnikovae commented 2 years ago

Thank you for reporting the issue @ArtemTrofimushkin!

Unfortunately, we haven't implemented support for .NET 6 yet. I'll see if we can do this quickly.

ArtemTrofimushkin commented 2 years ago

Thanks! Also, some more details, that I discovered (maybe it will be helpful) Pyroscope (dotnet spy) uses this library for communicating with .net profiler, and I tried to make simple repro based on examples directory (because I think, that problem may be in .net 6.0 profiling changes). But it works fine, without crashes. So, I think, may be issue somewhere in pyroscope repo

kolesnikovae commented 2 years ago

Oh, good to know - thank you for sharing, that's really helpful!

I guess the example program may not trigger the crash because it collects the trace data just once and exits. Perhaps the dump can shed light on what is going on with the runtime and what causes app to crash.

On a side note, what do you think about dotnet monitor? Would it work for you better if we pulled the data from pods remotely with k8s auto discovery and stuff, as we do for Go?

ArtemTrofimushkin commented 2 years ago

I will try to collect dumps from our apps and share (approx at saturday or sunday, now we a little busy because of several geopolitical events)

We also profile several go apps with k8s auto discovery, it's just works and works perfect. Really helpful!

We tried a little dotnet-monitor with several apps, it's interesting project It will be cool to use it with pyroscope. We can test this feature, if it will be available

abeaumont commented 2 years ago

In case it helps, I have just added a dotnet benchmark to our agent benchmarks with .Net 6 and it doesn't reproduce the problem (maybe we should run it longer?): https://github.com/pyroscope-io/agent-benchmarks/pull/26

saber-wang commented 2 years ago

@abeaumont net6 is not supported at present, is it? Have a plan to support it?😊

yihango commented 2 years ago

I have the same question, when can support. NET6

Rperry2174 commented 2 years ago

hi @saber-wang , @staneee let us take a look into this and get back to you... It's not currently on the roadmap but we will look into what is needed to support it and see how we can prioritize it from there

yihango commented 2 years ago

hi @saber-wang , @staneee let us take a look into this and get back to you... It's not currently on the roadmap but we will look into what is needed to support it and see how we can prioritize it from there

Expect the end result to be good 😊

Prinsn commented 2 years ago

hi @saber-wang , @staneee let us take a look into this and get back to you... It's not currently on the roadmap but we will look into what is needed to support it and see how we can prioritize it from there

I'm not certain why it isn't on the roadmap.

Per .NET's strategy, all odd number versions are functionally live betas, and .NET 6 had its time stamped release in November, so this is coming up on a year with no roadmap to support of the current version of .NET?

Just doesn't make any sense to me. Supporting .NET 5 doesn't have a lot of value.

kolesnikovae commented 1 year ago

I didn't manage to reproduce the issue with the latest .NET 6 versions (<=6.0.401) during quite intensive testing (including the repro sample from the issue). I also should notice that we have not been reported any issues apart from this one.

Therefore, official support for .NET 6 will be announced shortly. In practice, you already can use the latest pyroscope version with .NET 6: the way .NET integration works in pyroscope is fairly version-agnostic, as we use the standard .NET diagnostic interface (diagnostic port) to communicate to the runtime and collect profiling data.

However, .NET 7 (Preview 7) is not fully supported yet as there is a known issue.

Prinsn commented 1 year ago

Noted, thank you

On Tue, Sep 13, 2022, 1:16 PM Anton Kolesnikov @.***> wrote:

I didn't manage to reproduce the issue with the latest .NET 6 versions (<=6.0.401) during quite intensive testing (including the repro sample from the issue). I also should notice that we have not been reported any issues apart from this one.

Therefore, official support for .NET 6 will be announced shortly. In practice, you already can use the latest pyroscope version with .NET 6: the way .NET integration works in pyroscope is fairly version-agnostic, as we use the standard .NET diagnostic interface (diagnostic port https://docs.microsoft.com/en-us/dotnet/core/diagnostics/diagnostic-port) to communicate to the runtime and collect profiling data.

However, .NET 7 (Preview 7) is not fully supported yet as there is a known issue https://github.com/pyroscope-io/pyroscope/issues/1493.

— Reply to this email directly, view it on GitHub https://github.com/pyroscope-io/pyroscope/issues/888#issuecomment-1245709726, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAYR7G4DLMMOTG24VGXTGPTV6CZIDANCNFSM5PFESKZA . You are receiving this because you commented.Message ID: @.***>

Rperry2174 commented 1 year ago

closing as this seems to be working correctly. Please let us know if you run into any other issues