open-telemetry / opentelemetry-collector-releases

OpenTelemetry Collector Official Releases
https://opentelemetry.io
Apache License 2.0

Publishing remotely debuggable collector docker images #481

Open swar8080 opened 8 months ago

swar8080 commented 8 months ago

Hello, I'm opening this issue to see if there's interest in publishing collector images that allow attaching a remote debugger. They can be helpful for troubleshooting environment-specific issues.

I've hacked together some images that allow this at https://github.com/swar8080/remote-debug-otel-collector. I'll mention this repo in the OpenTelemetry Slack channel and link this issue in the README. That could help gauge the community's interest in making "official" debuggable images in opentelemetry-collector-releases.

TylerHelmuth commented 8 months ago

For security reasons we have no plans to publish images built with anything other than scratch, with only the bare minimum needed to run a Collector.

If you need to remote into a running collector, I think the route you've taken of building your own image is the correct approach.

swar8080 commented 8 months ago

It's probably possible to get the image down to scratch plus just the delve binary.

jpkrohling commented 8 months ago

Jaeger has something like this, and I think we could incorporate building new images after the release is made. From what I remember, apart from the initial implementation, it didn't require maintenance.

https://github.com/jaegertracing/jaeger/blob/main/docker/debug/Dockerfile

mx-psi commented 8 months ago

Kubernetes supports using ephemeral containers for debugging: https://kubernetes.io/docs/tasks/debug/debug-application/debug-running-pod/#ephemeral-container. I think most of our users can debug the official images without us having to introduce new images.
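For reference, an ephemeral debug container is usually attached with `kubectl debug`. The sketch below only assembles such a command line. The pod name, container name, namespace, and debug image are hypothetical placeholders, not values from this thread.

```python
import shlex


def kubectl_debug_argv(pod, target_container, image="busybox:1.36", namespace="default"):
    """Build a `kubectl debug` command that adds an ephemeral container to a running pod.

    All argument values are illustrative; substitute the names used in your cluster.
    """
    return [
        "kubectl", "debug", "-it", pod,
        "--namespace", namespace,
        "--image", image,              # image providing the debugging tools
        "--target", target_container,  # container whose process namespace to join
    ]


# Hypothetical pod and container names, for illustration only.
argv = kubectl_debug_argv("otel-collector-0", "otc-container", namespace="observability")
print(shlex.join(argv))
```

Because the official image is built from scratch, the tools come from the ephemeral container's own image rather than from the collector image.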

jpkrohling commented 8 months ago

That's useful, but from what I remember, there are two advantages to debug images. The first is delve: people can plug in a debugger and attach the source code, so they can add breakpoints and see why things are breaking. The second is having access to the same file system, which might be useful to check whether the right files are being loaded.
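To make the delve point concrete, here is a minimal sketch of how a debug image's entrypoint might wrap the collector in a headless delve server that a remote debugger can attach to. The binary path, config path, and port are assumptions for illustration; they are not taken from the Jaeger Dockerfile or from any published collector image.

```python
import shlex


def delve_exec_argv(binary, collector_args, listen=":2345"):
    """Build a `dlv exec` command that runs `binary` under a headless delve server.

    `binary`, `collector_args`, and `listen` are illustrative values supplied by the caller.
    """
    return [
        "dlv", "exec", binary,
        "--headless",             # no interactive terminal, serve the debug API only
        f"--listen={listen}",     # address remote debuggers connect to
        "--api-version=2",
        "--accept-multiclient",   # keep serving after a client disconnects
        "--continue",             # start the collector without waiting for a client
        "--",                     # everything after this goes to the collector
        *collector_args,
    ]


# Assumed paths, for illustration only.
dlv_argv = delve_exec_argv("/otelcol", ["--config=/etc/otelcol/config.yaml"])
print(shlex.join(dlv_argv))
```

Breakpoints and variable inspection generally work best when the binary is built with optimizations and inlining disabled (`-gcflags "all=-N -l"`), which is one reason a debug image would likely need its own build rather than reusing the release binary.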