kumahq / kuma

🐻 The multi-zone service mesh for containers, Kubernetes and VMs. Built with Envoy. CNCF Sandbox Project.
https://kuma.io/install
Apache License 2.0

Rethink the status of a data plane proxy #2346

Open bdecoste opened 3 years ago

bdecoste commented 3 years ago

Summary

Setting universal.dataplaneCleanupAge to a lower value doesn't cause an Offline Dataplane to be removed once that age has passed.

Steps To Reproduce

  1. Set up a Universal Multizone mesh
  2. Set universal.dataplaneCleanupAge=0h15m0s on both the Global and Zone CPs
  3. Start a DPP that points to a nonexistent service and has a probe configured. The DPP will be reported as Offline.
  4. Wait >15m
  5. You will not see the log line here: https://github.com/kumahq/kuma/blob/master/pkg/gc/collector.go#L70
  6. The Dataplane will not be removed

lobkovilya commented 3 years ago

There are two cases in which a DPP shows up as Offline:

  1. All inbounds are offline, but kuma-dp is connected to kuma-cp (your case)
  2. kuma-dp is disconnected from kuma-cp, and the inbounds are in an unknown state

The collector cleans up DPPs only in the second case. In the first case, we can just wait for the service to come back alive.

jakubdyszkiewicz commented 3 years ago

Which makes me think: maybe in the first case we should use a "partially degraded" status?

github-actions[bot] commented 2 years ago

This issue was inactive for 30 days. It will be reviewed in the next triage meeting and might be closed. If you think this issue is still relevant, please comment on it promptly or attend the next triage meeting.

jakubdyszkiewicz commented 2 years ago

Triage: we need to look again at how we present statuses. A couple of potential options we discussed:

  1. "Partially degraded" for a DP that is unhealthy but connected
  2. Create two statuses and do not mix the xDS connection with the health of the DP
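
As an illustration of the second option, the Go sketch below keeps the connection state and the health state in separate fields. It is only one possible shape, not a proposed API: every type, constant, and function name here is hypothetical, and reporting health as Unknown while disconnected is an assumption taken from the "inbounds in an unknown state" remark earlier in the thread.

```go
package main

import "fmt"

// ConnectionStatus describes only the xDS connection to the control plane.
type ConnectionStatus string

// HealthStatus describes only the health of the proxy's inbounds.
type HealthStatus string

const (
	Connected    ConnectionStatus = "Connected"
	Disconnected ConnectionStatus = "Disconnected"
)

const (
	Healthy          HealthStatus = "Healthy"
	PartiallyHealthy HealthStatus = "PartiallyHealthy"
	Unhealthy        HealthStatus = "Unhealthy"
	Unknown          HealthStatus = "Unknown"
)

// ProxyStatus reports the two dimensions side by side instead of
// folding them into a single Online/Offline value.
type ProxyStatus struct {
	Connection ConnectionStatus
	Health     HealthStatus
}

// statusOf derives both statuses from hypothetical inputs.
func statusOf(connected bool, healthyInbounds, totalInbounds int) ProxyStatus {
	if !connected {
		// Without a connection, the inbound health cannot be observed.
		return ProxyStatus{Connection: Disconnected, Health: Unknown}
	}
	switch {
	case healthyInbounds == totalInbounds:
		// Also covers a proxy with no inbounds, an arbitrary choice here.
		return ProxyStatus{Connection: Connected, Health: Healthy}
	case healthyInbounds > 0:
		return ProxyStatus{Connection: Connected, Health: PartiallyHealthy}
	default:
		return ProxyStatus{Connection: Connected, Health: Unhealthy}
	}
}

func main() {
	fmt.Printf("%+v\n", statusOf(true, 0, 1))  // the case reported in this issue
	fmt.Printf("%+v\n", statusOf(false, 0, 1)) // the case the collector handles
}
```

With this split, a cleanup job could key off the connection status alone, while the GUI could show the health status without implying that the proxy is gone.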

github-actions[bot] commented 1 year ago

This issue was inactive for 90 days. It will be reviewed in the next triage meeting and might be closed. If you think this issue is still relevant, please comment on it or attend the next triage meeting.

lobkovilya commented 4 months ago

Triage: we still want to rethink the status of a DPP