istio / ztunnel

The `ztunnel` component of ambient mesh
Apache License 2.0

Question: any consideration for HA (high availability) of ztunnel? #40

Open KfreeZ opened 2 years ago

KfreeZ commented 2 years ago

Since ztunnel runs in standalone mode on each node, is there any plan to support HA for it?

howardjohn commented 2 years ago

In theory we have the ability to do this (or plan to) since upgrades will probably involve surging. I hadn't expected demand for this long term though. Are there other examples of HA daemonsets?

KfreeZ commented 2 years ago

> In theory we have the ability to do this (or plan to) since upgrades will probably involve surging. I hadn't expected demand for this long term though. Are there other examples of HA daemonsets?

I have two options in mind:

  1. A hot standby mode like telco routers, where two instances share the same IP and MAC address.
  2. Two active instances running at the same time, with traffic dynamically load-balanced across them.

wangfakang commented 1 year ago

Hello @howardjohn, is there any update on ztunnel HA, such as hot restart? Thanks.

hzxuzhonghu commented 1 year ago

This is not being considered at the moment. Because ztunnel is now very simple and only processes L4, we are confident it will not crash.

In the long run, we may need to consider upgrades. During an upgrade, we should guarantee that at least one ztunnel is serving.

wangfakang commented 1 year ago

> This is not being considered at the moment. Because ztunnel is now very simple and only processes L4, we are confident it will not crash.
>
> In the long run, we may need to consider upgrades. During an upgrade, we should guarantee that at least one ztunnel is serving.

Thank you for your reply. In the upgrade scenario, how can the existing L4 connections on the old ztunnel exit gracefully?

hzxuzhonghu commented 1 year ago

It does not. I have read somewhere that MOSN implements upgrades that leave existing connections unaffected. I am not sure whether that mechanism can be applied here.
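
To make the "exit gracefully" question concrete, here is a minimal sketch of connection draining: the old instance stops taking new connections and waits, up to a deadline, for in-flight ones to finish. This is not ztunnel's or MOSN's code. `DrainTracker`, `ConnGuard`, `try_accept`, and `drain` are hypothetical names made up for illustration, and the sketch assumes a replacement instance is already serving new connections.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::time::{Duration, Instant};

// Hypothetical state, not a ztunnel type.
#[derive(Default)]
struct DrainState {
    active: usize,  // connections currently being proxied
    draining: bool, // set once shutdown has been requested
}

/// Hypothetical helper that tracks in-flight connections on the old instance.
#[derive(Clone, Default)]
pub struct DrainTracker {
    inner: Arc<(Mutex<DrainState>, Condvar)>,
}

/// Held for the lifetime of one connection; dropping it marks the connection done.
pub struct ConnGuard {
    inner: Arc<(Mutex<DrainState>, Condvar)>,
}

impl DrainTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a new connection. Returns None once draining has started,
    /// so new traffic is expected to go to the replacement instance.
    pub fn try_accept(&self) -> Option<ConnGuard> {
        let (lock, _) = &*self.inner;
        let mut st = lock.lock().unwrap();
        if st.draining {
            return None;
        }
        st.active += 1;
        Some(ConnGuard { inner: Arc::clone(&self.inner) })
    }

    /// Stop accepting, then wait until every connection has finished or the
    /// timeout elapses. Returns how many connections are still open
    /// (0 means a clean drain).
    pub fn drain(&self, timeout: Duration) -> usize {
        let (lock, cvar) = &*self.inner;
        let deadline = Instant::now() + timeout;
        let mut st = lock.lock().unwrap();
        st.draining = true;
        while st.active > 0 {
            let now = Instant::now();
            if now >= deadline {
                break;
            }
            let (guard, _) = cvar.wait_timeout(st, deadline - now).unwrap();
            st = guard;
        }
        st.active
    }
}

impl Drop for ConnGuard {
    fn drop(&mut self) {
        let (lock, cvar) = &*self.inner;
        let mut st = lock.lock().unwrap();
        st.active -= 1;
        cvar.notify_all(); // wake a waiting drain() so it can re-check the count
    }
}
```

Each proxied connection would hold a `ConnGuard` until it closes, and the shutdown path would call `drain` with some grace period before the process exits. Draining only helps connections that end within the deadline. Keeping long-lived connections alive across an upgrade would need something more, such as handing sockets to the new process.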