cilium / hubble-ui

Observability & Troubleshooting for Kubernetes Services
https://www.cilium.io
Apache License 2.0

Data streams are reconnecting errors when running hubble-ui with more than one replica #833

Open rsavage-nozominetworks opened 4 months ago

rsavage-nozominetworks commented 4 months ago

Cilium Version: 1.15.3, K8S Version: 1.29

While investigating recurring "Data streams are reconnecting..." errors in hubble-ui, I found that they depend on the replica count. When I reduced the hubble-ui replica count from 2 to 1, the errors stopped. When I set it back to 2, they returned.

I then examined the hubble-ui logs while running 2 replicas. The logs showed session activity from both Pods, so both replicas were handling client requests at the same time. That suggests the replicas may conflict when managing service map data (just my guess).

Reducing the replica count to 1 stops the errors, but it sacrifices high availability (HA), so it is not a good long-term solution.
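
To make the guess above concrete, here is a toy model of it in Python. It is only a sketch of the hypothesis, not hubble-ui's actual implementation: the `Replica` class, the `open_stream`/`poll` methods, the round-robin balancing, and the assumption that each Pod keeps stream state only in its own memory are all hypothetical and not confirmed anywhere in this thread.

```python
from itertools import cycle


class Replica:
    """Hypothetical stand-in for one hubble-ui Pod that keeps stream state in memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # session ids this replica knows about

    def open_stream(self, session_id):
        # The replica that receives the first request remembers the session.
        self.sessions.add(session_id)

    def poll(self, session_id):
        # A replica that never saw the session cannot continue its stream.
        return "ok" if session_id in self.sessions else "reconnect"


def count_reconnects(replica_count, polls=10):
    """Open one session, then send `polls` follow-up requests round-robin."""
    replicas = [Replica(f"hubble-ui-{i}") for i in range(replica_count)]
    balancer = cycle(replicas)  # assumed round-robin, no session affinity
    session_id = "session-1"
    next(balancer).open_stream(session_id)
    results = [next(balancer).poll(session_id) for _ in range(polls)]
    return results.count("reconnect")


if __name__ == "__main__":
    for n in (1, 2):
        print(f"{n} replica(s): {count_reconnects(n)} reconnects")
    # prints:
    # 1 replica(s): 0 reconnects
    # 2 replica(s): 5 reconnects
```

Under these assumptions a single replica never loses the session, while with two replicas every request that lands on the other Pod fails. That would match the observed behavior, but it does not prove this is the real cause.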

lukastopiarz commented 3 months ago

We have the same issue in our environment. Cilium Version: 1.14.9, K8S Version: 1.25, Hubble images: 0.13.0

dhedberg commented 3 months ago

We've been having the same issue, and I'm pretty sure it started when we upgraded to cilium 1.14.7 and hubble 0.13.

I just tried scaling down hubble-ui from 2 to 1 replicas, and that seems to fix it for us too.

Currently running cilium 1.14.9 and hubble-ui v0.13.0 on kubernetes 1.29.3.

igor-nikiforov commented 3 months ago

We have the same issue. Scaling down from 2 replicas to 1 solves the issue.

Kubernetes 1.25, Cilium 1.15.4