DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0
2.8k stars 1.18k forks

[BUG] Alpine java apps do not work without specifying dotnet -musl image #22575

Closed miskr-instructure closed 7 months ago

miskr-instructure commented 7 months ago

Agent Environment

    version: 7.50.3
    os: linux (container os)
    cloud: aws eks
    orchestrator: kubernetes

Describe what happened:

After enabling mutating webhooks and following the docs (which do not say that anything special is needed for java apps on musl-based images), the pod gets stuck in the following state:

    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Message:      Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /datadog-lib/continuousprofiler/Datadog.Linux.ApiWrapper.x64.so)
      Exit Code:    127

The documentation does not say adding admission.datadoghq.com/dotnet-lib.version: latest-musl is mandatory for every Alpine image, but it seems to be, in practice.

Describe what you expected: It should be possible to instrument java (and really, any non-dotnet) apps based on Alpine images without needing to specify the dotnet *-musl instrumentation image. We do not have dotnet anywhere in our tech stack.

IMO, the best option would be the ability to globally opt out of dotnet instrumentation entirely (in the datadog-agent chart/operator, or via a command line arg or annotation). This should probably apply to other types of instrumentation as well, but dotnet is particularly annoying because its musl version is a separate image.

Steps to reproduce the issue:

Run a java app based on Alpine Linux with datadog admission + instrumentation enabled. It will crash with

Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /datadog-lib/continuousprofiler/Datadog.Linux.ApiWrapper.x64.so)

...even though there is no need for that shared library (used by dotnet instrumentation only) in a java app.

Additional environment details (Operating System, Cloud provider, etc):

Adding the annotation admission.datadoghq.com/dotnet-lib.version: latest-musl is a workaround, but not a good solution: we would need to add it to every deployment even though we do not use dotnet in the first place.
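For reference, a minimal sketch of where that workaround annotation would go, assuming a standard Deployment. The name alpine-java-app and the image are hypothetical placeholders; only the annotation key and value come from the report above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alpine-java-app              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alpine-java-app
  template:
    metadata:
      labels:
        app: alpine-java-app
      annotations:
        # Workaround from this report: request the musl build of the
        # dotnet library so its shared objects load on Alpine.
        admission.datadoghq.com/dotnet-lib.version: latest-musl
    spec:
      containers:
        - name: app
          image: registry.example.com/alpine-java-app:1.0   # hypothetical image
```

The annotation sits on the pod template (spec.template.metadata), not on the Deployment's own metadata, since the admission webhook mutates pods.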

miskr-instructure commented 7 months ago

I misunderstood the docs. Single-step-instrumentation (Beta) was enabled, which always attempts to instrument all programming languages (including the problematic one - dotnet).

After turning off "Single-step-instrumentation" and manually adding the annotation admission.datadoghq.com/java-lib.version: latest and the label admission.datadoghq.com/enabled: 'true' to the podSpecs, only the java instrumentation is injected and things work fine.
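A minimal sketch of that setup, assuming a standard Deployment with Single-step-instrumentation already turned off. The name alpine-java-app and the image are hypothetical placeholders; the label and annotation are the ones named above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alpine-java-app              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alpine-java-app
  template:
    metadata:
      labels:
        app: alpine-java-app
        # Opt this pod in to the Datadog admission controller.
        admission.datadoghq.com/enabled: "true"
      annotations:
        # Inject only the java library; no dotnet library is requested.
        admission.datadoghq.com/java-lib.version: latest
    spec:
      containers:
        - name: app
          image: registry.example.com/alpine-java-app:1.0   # hypothetical image
```

Both entries go on the pod template (spec.template.metadata). The label value is quoted because Kubernetes label values must be strings.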