open-telemetry / opentelemetry-network

eBPF Collector
https://opentelemetry.io
Apache License 2.0

Is this project complete and can I deploy it via opentelemetry-operator? #253

Closed (closed by novohool 8 months ago)

novohool commented 8 months ago

Describe the issue you're reporting

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: simplest
spec:
  mode: daemonset
  config: |
    receivers:
      otlp:
        protocols:
          ebpf:
          dns:

yonch commented 8 months ago

Deployment is done through a Helm chart, which as of today is still called opentelemetry-ebpf in OpenTelemetry's chart repo; see here.

Hope this answers your question, please reach out otherwise.
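For readers following along, a minimal sketch of what a Helm-based install could look like. It assumes the chart is served from OpenTelemetry's public Helm chart repository under the name opentelemetry-ebpf mentioned above; the function name, release name, and namespace are illustrative placeholders, and any chart values your environment needs should be taken from the chart's own README rather than from this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: install the opentelemetry-ebpf chart with Helm.
# install_ebpf_chart, the release name, and the namespace are hypothetical
# placeholders, not names taken from this thread.
install_ebpf_chart() {
  local release="$1" namespace="$2"

  # Register OpenTelemetry's chart repository and refresh the local index.
  helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
  helm repo update

  # Install the chart; --create-namespace creates the namespace if it is missing.
  helm install "$release" open-telemetry/opentelemetry-ebpf \
    --namespace "$namespace" --create-namespace
}
```

Calling `install_ebpf_chart my-ebpf otel-ebpf` would run the three Helm commands in order against the current kubectl context.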

novohool commented 8 months ago

Deployment is done through a Helm chart, which as of today is still called opentelemetry-ebpf in OpenTelemetry's chart repo; see here.

Hope this answers your question, please reach out otherwise.

Thanks, but I get some errors in my Azure k8s cluster.

k8s server: Microsoft Azure k8s
k8s version: 1.23.15
node info:

OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1104-azure   containerd://1.6.18+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1104-azure   containerd://1.6.18+azure-1

yonch commented 8 months ago

Thanks for following up. Let's get this working for you.

I see multiple reconnections from that kernel collector. "METADATA COMPLETE" is logged after the connection is established and before the eBPF code is loaded. The timestamps show only a handful of seconds between "METADATA COMPLETE" messages, which means the collector hasn't loaded eBPF yet: it has back-off mechanisms that prevent loading eBPF too frequently, so reconnections that followed a load would be spaced further apart.

Reconnections could be caused by the reducer rejecting connections or restarting (perhaps due to OOM or a lack of resources). The next step might be to get the pod status (e.g., kubectl describe pod) and logs, if you're able to share them from a security/privacy perspective.
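The diagnostics suggested above could be gathered roughly as follows. This is a sketch using standard kubectl subcommands; the function name and its namespace and pod arguments are hypothetical placeholders to be replaced with the names from your own install.

```shell
#!/usr/bin/env bash
# Sketch only: collect pod status and logs for one pod (e.g. the reducer).
# collect_diagnostics and its arguments are hypothetical placeholders.
collect_diagnostics() {
  local namespace="$1" pod="$2"

  # Overview of pods in the namespace, including restart counts and nodes.
  kubectl get pods -n "$namespace" -o wide

  # Detailed status; a container killed for memory shows
  # "OOMKilled" as the reason under "Last State".
  kubectl describe pod "$pod" -n "$namespace"

  # Recent logs from the currently running containers.
  kubectl logs "$pod" -n "$namespace" --all-containers --tail=200

  # Logs from the previous container instance, if the pod restarted.
  # This fails when there was no restart, so the error is ignored.
  kubectl logs "$pod" -n "$namespace" --all-containers --previous --tail=200 || true
}
```

For example, `collect_diagnostics otel-ebpf <reducer-pod-name> > diagnostics.txt` writes everything to one file that can be reviewed for sensitive data before sharing.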