odigos-io / opentelemetry-go-instrumentation

OpenTelemetry auto-instrumentation for Go applications
Apache License 2.0
289 stars 44 forks source link

use daemonset way not sidecar way #9

Open withlin opened 2 years ago

withlin commented 2 years ago

Feature request

use daemonset way not sidecar way

Use case

May be consider daemonset that it will deploy agent on nodes,which it will collect data by pid, and the use pid association with pod metadata.

what do you think? @edeNFed

edeNFed commented 2 years ago

Sounds like a really interesting feature @withlin Regarding implementation, I think we should consider changing the current logic of specifying the target binary via OTEL_TARGET_EXE to logic like:

  1. Scan all processes in /proc
  2. Detect all processes that are written in Go. we already have this code at https://github.com/keyval-dev/opentelemetry-go-instrumentation/blob/af32862e111b42be049956705d0dd27e9639fc15/pkg/process/module.go#L40
  3. Apply instrumentation for every found Go process.

I think changing the behavior to this will allow us to both remove the OTEL_TARGET_EXE config and support daemon set deployment at the same time. Basically, when deployed as a daemon set with hostPID: true /proc will simply contain more processes and the instrumentation agent does not even aware of the fact that it is currently deployed as a daemon set. What do you think?

withlin commented 2 years ago

Sounds like a really interesting feature @withlin Regarding implementation, I think we should consider changing the current logic of specifying the target binary via OTEL_TARGET_EXE to logic like:

  1. Scan all processes in /proc
  2. Detect all processes that are written in Go. we already have this code at https://github.com/keyval-dev/opentelemetry-go-instrumentation/blob/af32862e111b42be049956705d0dd27e9639fc15/pkg/process/module.go#L40
  3. Apply instrumentation for every found Go process.

I think changing the behavior to this will allow us to both remove the OTEL_TARGET_EXE config and support daemon set deployment at the same time. Basically, when deployed as a daemon set with hostPID: true /proc will simply contain more processes and the instrumentation agent does not even aware of the fact that it is currently deployed as a daemon set. What do you think?

if you scan all processes, may be have problem. for instance: i want to collect specific process,such as only collect voting service data.

hi @edeNFed , i have a idea as follow:

  1. the agent should use k8s informer way that it will watch pod resource, the pod should have annotations: such as keyval.dev/go-instrumentation: true,then the current agent get current node info,which it can fliter node in pod resource mainifest. if match it will do 2 step.
  2. the agent use docker grpc client/ containerd grpc client get pid. the command line: docker inspect -f '{{.State.Pid}}' <container id>
  3. to do golang instrument
edeNFed commented 2 years ago

That sounds really cool. I wonder if we should integrate somehow with the OpenTelemetry Operator: https://github.com/open-telemetry/opentelemetry-operator#opentelemetry-auto-instrumentation-injection

It looks like they are doing similar things, for example, annotating deployments for instrumentation. what do you think?

withlin commented 2 years ago

@edeNFed Yes, this project that sound good to me.opentelemetry-auto-instrumentation-injection is using mutating webhook intercept by annotation(inject container in pod resource). the implement as follows:

https://github.com/open-telemetry/opentelemetry-operator/blob/main/internal/webhookhandler/webhookhandler.go#L92 https://github.com/open-telemetry/opentelemetry-operator/blob/main/pkg/sidecar/podmutator.go#L57 https://github.com/open-telemetry/opentelemetry-operator/blob/main/pkg/sidecar/pod.go#L37

before you send to me for this project,i have no idea for it. use daemonset way may reduce overhead vs sidecar,that's why I recommend implementing this way.

edeNFed commented 2 years ago

@withlin After reading your comments again, I think I understand what you mean much better now. I really like this solution, sounds elegant and I can see many advantages to it (for example we could even instrument Kubernetes internal components like kubelet).

Another thing we probably need to implement as part of this is how to stop instrumenting when the annotation is deleted, It should probably delete all the attached uprobes.

I like it. Would you like to try to implement it? I can also get to it in about two weeks.

withlin commented 2 years ago

@edeNFed i am glading to implement it. but i need some time to familiar with your source code.