cilium / hubble-otel

Hubble adaptor for OpenTelemetry

prefer node IP for communicating with cilium, rather than DNS #86

Closed weizhoublue closed 2 years ago

weizhoublue commented 2 years ago

Signed-off-by: weizhou.lan@daocloud.io <weizhou.lan@daocloud.io>

prefer node IP for communicating with cilium, rather than DNS

When I deploy hubble-otel, the hubble-otel collector reports that DNS failed to resolve the node hostname (see the `lookup ... server misbehaving` errors in the logs below). In most cases it makes more sense for the collector to use the node IP and communicate directly with cilium-agent.

# kubectl logs -n kube-system -l app.kubernetes.io/name=otelcol-hubble-collector
2022-01-27T13:22:38.295Z    info    v3@v3.2103.1/logger.go:46   Discard stats nextEmptySlot: 0
    {"kind": "receiver", "name": "hubble"}
2022-01-27T13:22:38.295Z    info    v3@v3.2103.1/logger.go:46   Set nextTxnTs to 0  {"kind": "receiver", "name": "hubble"}
2022-01-27T13:22:38.298Z    info    service/telemetry.go:116    Serving Prometheus metrics  {"address": ":8888", "level": "basic", "service.instance.id": "cae4e6c1-bbca-41f6-93b6-7d9e0c310551", "service.version": "latest"}
2022-01-27T13:22:38.298Z    info    service/collector.go:230    Starting otelcol-hubble...  {"Version": "0.1.0", "NumCPU": 40}
2022-01-27T13:22:38.298Z    info    service/collector.go:132    Everything is ready. Begin running and processing data.
2022-01-27T13:22:39.282Z    info    jaegerexporter@v0.38.0/exporter.go:186  State of the connection with the Jaeger Collector backend   {"kind": "exporter", "name": "jaeger", "state": "READY"}
2022-01-27T13:22:42.290Z    error   receiver@v0.0.0-00010101000000-000000000000/receiver.go:94  hubble reciever error   {"kind": "receiver", "name": "hubble", "error": "GetFlows failed: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: lookup 172-81-0-20 on 172.22.0.10:53: server misbehaving\""}
github.com/cilium/hubble-otel/receiver.(*hubbleReceiver).Start.func1
    github.com/cilium/hubble-otel/receiver@v0.0.0-00010101000000-000000000000/receiver.go:94
2022-01-27T13:22:38.600Z    info    v3@v3.2103.1/logger.go:46   Discard stats nextEmptySlot: 0
    {"kind": "receiver", "name": "hubble"}
2022-01-27T13:22:38.600Z    info    v3@v3.2103.1/logger.go:46   Set nextTxnTs to 0  {"kind": "receiver", "name": "hubble"}
2022-01-27T13:22:38.606Z    info    service/telemetry.go:116    Serving Prometheus metrics  {"address": ":8888", "level": "basic", "service.instance.id": "b597c5cb-2d86-4173-a8b1-0a5fecfd3635", "service.version": "latest"}
2022-01-27T13:22:38.606Z    info    service/collector.go:230    Starting otelcol-hubble...  {"Version": "0.1.0", "NumCPU": 40}
2022-01-27T13:22:38.606Z    info    service/collector.go:132    Everything is ready. Begin running and processing data.
2022-01-27T13:22:39.590Z    info    jaegerexporter@v0.38.0/exporter.go:186  State of the connection with the Jaeger Collector backend   {"kind": "exporter", "name": "jaeger", "state": "READY"}
2022-01-27T13:22:42.593Z    error   receiver@v0.0.0-00010101000000-000000000000/receiver.go:94  hubble reciever error   {"kind": "receiver", "name": "hubble", "error": "GetFlows failed: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp: lookup 172-81-0-10 on 172.22.0.10:53: server misbehaving\""}
github.com/cilium/hubble-otel/receiver.(*hubbleReceiver).Start.func1
    github.com/cilium/hubble-otel/receiver@v0.0.0-00010101000000-000000000000/receiver.go:94
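To illustrate the idea behind the change, here is a minimal sketch (not the code in this PR) of picking a dial target that prefers the node IP and only falls back to the node name, which has to go through DNS. The environment variable names `NODE_IP` and `NODE_NAME`, the helper `hubble_target`, and the default port `4244` are assumptions made for this example.

```python
import ipaddress
import os


def hubble_target(env=None, port=4244):
    """Return a host:port dial target, preferring the node IP over the node name.

    NODE_IP and NODE_NAME are hypothetical variable names for this sketch;
    the port default is likewise an assumption.
    """
    env = os.environ if env is None else env
    node_ip = env.get("NODE_IP", "").strip()
    node_name = env.get("NODE_NAME", "").strip()

    if node_ip:
        # Raises ValueError if the value is not a valid IPv4/IPv6 address.
        ip = ipaddress.ip_address(node_ip)
        # IPv6 literals need brackets when combined with a port.
        host = f"[{ip}]" if ip.version == 6 else str(ip)
        return f"{host}:{port}"

    if node_name:
        # Fallback: this target depends on DNS resolving the node name.
        return f"{node_name}:{port}"

    raise ValueError("neither NODE_IP nor NODE_NAME is set")
```

With an IP-based target the collector never performs a hostname lookup, so a node name that the cluster DNS cannot resolve no longer blocks the connection.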

Signed-off-by: weizhou Lan <weizhou.lan@daocloud.io>

weizhoublue commented 2 years ago

@lizrice

errordeveloper commented 2 years ago

Thanks for this PR @weizhouBlue! I am not working on this project any more, but surely @lizrice can help you with this. The change looks good to me.

weizhoublue commented 2 years ago

@errordeveloper thanks for the reply

lizrice commented 2 years ago

The permissions issue is nothing to do with this PR, so I'm going ahead and merging this.