Is this project complete and can I deploy it via opentelemetry-operator?

novohool commented 8 months ago

Describe the issue you're reporting

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: simplest
spec:
  mode: daemonset
  config: |
    receivers:
      otlp:
        protocols:
           ebpf:
           dns:

yonch commented 8 months ago

Deploying is done through a helm chart, as of today still called opentelemetry-ebpf in OpenTelemetry's chart repo, see here.

Hope this answers your question, please reach out otherwise.

novohool commented 8 months ago

Deploying is done through a helm chart, as of today still called opentelemetry-ebpf in OpenTelemetry's chart repo, see here.

Hope this answers your question, please reach out otherwise.

Thanks, but I get some errors in my Azure k8s cluster.

k8s server: Microsoft Azure k8s
k8s version: 1.23.15
node info:

OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1117-azure   containerd://1.7.1+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1104-azure   containerd://1.6.18+azure-1
Ubuntu 18.04.6 LTS   5.4.0-1104-azure   containerd://1.6.18+azure-1

reducer log:

+ [[ ! -e ./debug-info.conf ]]

+ install_dir=/srv

+ reducer=/srv/opentelemetry-ebpf-reducer

+ data_dir=/var/run/ebpf_net

+ dump_dir=/var/run/ebpf_net/dump

+ mkdir -p /var/run/ebpf_net /var/run/ebpf_net/dump

+ '[' -n '' ']'

+ '[' -n '' ']'

+ exec /srv/opentelemetry-ebpf-reducer --port=7000 --log-console --no-log-file --warning --enable-aws-enrichment --disable-prometheus-metrics --enable-otlp-grpc-metrics --otlp-grpc-metrics-host= --otlp-grpc-metrics-port=4317 --num-ingest-shards=1 --num-matching-shards=1 --num-aggregation-shards=1

2024-03-01 04:58:54.724009+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    unknown (unknown)

Kernel:                unknown

CPU Cores:             4

Hostname:              opentelemetry-ebpf-k8s-collector-56ff649897-qj2tn

Collector:             k8s

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    opentelemetry-ebpf-k8s-collector-56ff649897-qj2tn

Instance:              (unknown)

Agent:                 9320984286945639320

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1104-azure

CPU Cores:             4

Hostname:              aks-azure01pool-95027166-vmss00000L

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-azure01pool-95027166-vmss00000L

Instance:              (unknown)

Agent:                 2942076477995197290

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1117-azure

CPU Cores:             4

Hostname:              aks-agentpool-28398617-vmss000003

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-agentpool-28398617-vmss000003

Instance:              (unknown)

Agent:                 9606353904330390857

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1117-azure

CPU Cores:             4

Hostname:              aks-agentpool-28398617-vmss000009

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-agentpool-28398617-vmss000009

Instance:              (unknown)

Agent:                 12733221698824259056

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1117-azure

CPU Cores:             4

Hostname:              aks-agentpool-28398617-vmss0000GC

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-agentpool-28398617-vmss0000GC

Instance:              (unknown)

Agent:                 10063254994491465066

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

2024-03-01 04:59:04.723788+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1104-azure

CPU Cores:             4

Hostname:              aks-azure01pool-95027166-vmss00000M

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-azure01pool-95027166-vmss00000M

Instance:              (unknown)

Agent:                 535479050565239822

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1117-azure

CPU Cores:             4

Hostname:              aks-agentpool-28398617-vmss00000A

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-agentpool-28398617-vmss00000A

Instance:              (unknown)

Agent:                 4049481034577671629

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

--------METADATA COMPLETE---------

Agent info:

Version:               0.10.0

OS:                    Linux (debian)

Kernel:                5.4.0-1117-azure

CPU Cores:             4

Hostname:              aks-agentpool-28398617-vmss0000GA

Collector:             kernel

Kernel Headers Source: pre_installed

Entrypoint Error:      

Role:                  (unknown)

AZ:                    (unknown)

Id:                    aks-agentpool-28398617-vmss0000GA

Instance:              (unknown)

Agent:                 13081100569443093232

Overrides:

    namespace:   

    cluster:     

    service:     

    host:        

    zone:        

IPs:

Metadata Report Complete.

2024-03-01 04:59:14.723815+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 04:59:24.724524+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 04:59:34.725255+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 04:59:44.724493+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 04:59:54.725177+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 05:00:04.725604+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 05:00:14.725994+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 05:00:24.729087+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

2024-03-01 05:00:34.736574+00:00 error [p:1 t:14] Logging core failed to publish internal metrics writer stats

yonch commented 8 months ago

Thanks for following up. Let's get this working for you.

I see multiple reconnections from that kernel collector. "METADATA COMPLETE" happens after connection, and before eBPF code is loaded. The timestamps show only a handful of seconds between "METADATA COMPLETE" messages, which indicates that the collector hasn't yet loaded eBPF (the collector has back-off mechanisms to prevent loading eBPF too frequently).
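One rough way to check for reconnections in a saved reducer log is to count how many metadata reports each hostname produced. The sketch below is only an illustration, not part of the project: it assumes each report contains a "Hostname:" line as in the log above, and the function names, the sample log and the threshold of 2 are made up for the example.

```python
import re
from collections import Counter

# Matches lines such as "Hostname:              aks-agentpool-28398617-vmss000003"
HOSTNAME_RE = re.compile(r"^\s*Hostname:\s+(\S+)")


def count_metadata_reports(log_text):
    """Count how many 'Agent info' reports each hostname produced."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HOSTNAME_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def reconnecting_hosts(log_text, threshold=2):
    """Return hosts that reported metadata at least `threshold` times.

    The threshold is an arbitrary assumption: a host that reports more than
    once within one log capture has probably reconnected.
    """
    counts = count_metadata_reports(log_text)
    return {host: n for host, n in counts.items() if n >= threshold}


# Hypothetical, shortened log used only to demonstrate the functions.
SAMPLE_LOG = """\
--------METADATA COMPLETE---------
Agent info:
Hostname:              node-a
Collector:             kernel
--------METADATA COMPLETE---------
Agent info:
Hostname:              node-b
Collector:             kernel
--------METADATA COMPLETE---------
Agent info:
Hostname:              node-a
Collector:             kernel
"""

if __name__ == "__main__":
    print(reconnecting_hosts(SAMPLE_LOG))
```

Running this against the full reducer log (for example, text saved from kubectl logs) gives a per-host count that can be compared with the number of nodes in the cluster.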

Reconnections could be caused by the reducer rejecting connections or restarting (perhaps due to OOM / lack of resources). A next step might be to get the pod status (e.g., kubectl describe pod) and logs, if you're able to share them from a security/privacy perspective.
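To see whether the reducer pod has been restarting, the pod's container statuses can also be read programmatically. This is a minimal sketch, assuming kubectl is installed and configured for the cluster; the helper names, pod name and namespace are placeholders. It relies on the standard Kubernetes pod status fields restartCount and lastState.terminated.reason, where a reason of OOMKilled would point to memory limits.

```python
import json
import subprocess


def restart_summary(pod):
    """Summarize restarts and the last exit reason for each container in a pod.

    `pod` is the parsed JSON of a Kubernetes Pod object.
    """
    summary = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # lastState.terminated is absent if the container never exited.
        terminated = status.get("lastState", {}).get("terminated") or {}
        summary.append({
            "container": status.get("name"),
            "restarts": status.get("restartCount", 0),
            "last_exit_reason": terminated.get("reason"),  # e.g. "OOMKilled"
        })
    return summary


def fetch_pod(name, namespace):
    """Fetch a pod as JSON through kubectl (requires cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Placeholder pod name and namespace; substitute the real reducer pod.
    pod = fetch_pod("my-reducer-pod", "default")
    for row in restart_summary(pod):
        print(row)
```

A non-zero restart count together with an OOMKilled reason would support the lack-of-resources explanation; a count of zero suggests looking at rejected connections instead.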

open-telemetry / opentelemetry-network

Is this project complete and can I deploy it via opentelemetry-operator? #253
