lyft / flinkk8soperator

Kubernetes operator that provides control plane for managing Apache Flink applications
Apache License 2.0

Flink Native HA on Kubernetes is not supported #243

Open borah-hemanga opened 2 years ago

borah-hemanga commented 2 years ago

I tried out the native HA on Kubernetes using the operator.

Here is the general synopsis:

The deployment (update) of an existing application goes through the following:

Has anyone been successful in using Native K8s HA with Flink with this FlinkK8sOperator?

nikolasten commented 2 years ago

You will need to change the kubernetes.cluster-id config every time you deploy a Flink app, that is, on any FlinkApplication config change (increment it or use the current timestamp). That way, when the operator starts upgrading and the new cluster starts up, it won't try to behave as a failover of the existing cluster you are running.
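A minimal sketch of the timestamp approach, assuming the Flink config is handled as a plain key/value map before it is written into the FlinkApplication spec. The helper name `with_fresh_cluster_id`, the app name, and the storage path are hypothetical; only the config keys `kubernetes.cluster-id`, `high-availability`, and `high-availability.storageDir` are Flink's own.

```python
import time

# Flink's factory class for Kubernetes-based HA services.
K8S_HA_FACTORY = (
    "org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory"
)


def with_fresh_cluster_id(flink_config, app_name, now=None):
    """Return a copy of flink_config with a new kubernetes.cluster-id.

    Hypothetical helper: the id is the app name plus a Unix timestamp, so
    every deploy gets an id the previous cluster never used.
    """
    stamp = int(time.time() if now is None else now)
    updated = dict(flink_config)  # leave the caller's map untouched
    updated["kubernetes.cluster-id"] = f"{app_name}-{stamp}"
    return updated


# Placeholder values for illustration only.
base = {
    "high-availability": K8S_HA_FACTORY,
    "high-availability.storageDir": "s3://example-bucket/flink-ha",
}

config = with_fresh_cluster_id(base, "my-app", now=1700000000)
print(config["kubernetes.cluster-id"])  # prints my-app-1700000000
```

Calling the helper once per deploy, and not once per pod restart, keeps ordinary JobManager failover within a deployment working, because the id stays stable until the next FlinkApplication change.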

I think that for the operator to support reusing the same kubernetes.cluster-id, it would need to first shut down the job that is already running and stop the cluster, and only then start the new cluster and deploy the app. Currently it tries to minimize downtime by keeping both clusters running during the upgrade. It would be nice to have that mode too.

anandswaminathan commented 2 years ago

@nikolasten Is it only kubernetes.cluster-id?

anandswaminathan commented 2 years ago

It's here https://github.com/lyft/flinkk8soperator/blob/6264b5a2badba62500a5a7e7f1366493a62fa618/pkg/controller/flink/container_utils.go#L213

nikolasten commented 2 years ago

This config option is for ZooKeeper only, not for Kubernetes HA. We did this in our fork to enable it and to make sure it's different every time we deploy or upgrade the app: https://github.com/bluelabs-eu/flinkk8soperator/commit/fa64278343aab41a6815343665a342944ccc9510#diff-0e21f32f488d8c4a8aeb58de476274825e4004216515b5bcbcbe0045efe08b00R215-R218

This PR, https://github.com/lyft/flinkk8soperator/pull/170, addresses changing the cluster id on every deploy, but it does not add a config option for Kubernetes-based HA mode.
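To illustrate the distinction, here is a rough sketch of choosing the cluster-id key by HA backend. It assumes the ZooKeeper-only option referred to above is Flink's `high-availability.cluster-id`; the thread does not name it, so treat that as a guess. The function `set_cluster_id` and the mode names are hypothetical and are not taken from the operator or the fork.

```python
# Assumed mapping from HA backend to the Flink option that names the cluster.
CLUSTER_ID_KEY = {
    "zookeeper": "high-availability.cluster-id",  # assumption, see lead-in
    "kubernetes": "kubernetes.cluster-id",
}


def set_cluster_id(flink_config, ha_mode, cluster_id):
    """Return a copy of flink_config with the id set under the key for ha_mode."""
    if ha_mode not in CLUSTER_ID_KEY:
        raise ValueError(f"unsupported HA mode: {ha_mode!r}")
    updated = dict(flink_config)
    updated[CLUSTER_ID_KEY[ha_mode]] = cluster_id
    return updated


config = set_cluster_id({}, "kubernetes", "my-app-1700000000")
print(config)  # prints {'kubernetes.cluster-id': 'my-app-1700000000'}
```

Setting only the ZooKeeper key leaves a Kubernetes HA deployment with an unchanged `kubernetes.cluster-id`, which is the failover confusion described earlier in the thread.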