distribworks / dkron-helm

Helm chart to install Dkron and other associated components.
Apache License 2.0

Chart refactor #3

Closed FedeBev closed 1 year ago

FedeBev commented 3 years ago

Since there's still an open issue in the main repo and I have the feeling this chart has been dropped, I've decided to rework it completely.

The main differences are:

Known issues:

Let me know your thoughts. I'm open to feedback and available to make any changes.

vcastellm commented 2 years ago

Sorry for the mega-delay here @FedeBev, thank you very much for the work! I had not prioritized the K8s work until now, going to test this and merge if it works.

FedeBev commented 2 years ago

What kind of problem are you facing @victorcoder? I just tried a rolling update from version v3.1.8 to v3.1.10 and everything went well except for a slight leader election delay.

Do you mind sharing your values and some details about your cluster?

I've just fixed the missing newlines and cleaned up the sample values file

vcastellm commented 2 years ago

I just do a helm install followed by a helm upgrade, and the rolling update is too fast for the leader election to happen correctly. When the rollout is done there's no leader in the cluster; all nodes end up as followers.

Are you tweaking some other options?

FedeBev commented 2 years ago

No additional tweaks in my configuration.

This also happens to me, but a few seconds after the rollout ends a new leader is elected and everything works fine.

We can try to tune the rollout strategy, but taking into account the issue with the Kubernetes discovery, I don't think we can achieve anything better with Helm. An operator would solve the issue, but it's a huge effort.
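To make the leaderless window after a rollout observable, here is a minimal sketch of a check that polls until a leader is reported or a timeout expires. It is not part of the chart. The function names are hypothetical, and both the /v1/leader path and the "Name" field in http_leader_fetcher are assumptions that should be checked against the Dkron API documentation.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def wait_for_leader(
    fetch_leader: Callable[[], Optional[str]],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll fetch_leader() until it returns a non-empty name or the timeout expires.

    Returns the leader name, or None if no leader showed up in time.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            leader = fetch_leader()
        except (OSError, ValueError):
            # Node unreachable or response not parseable mid-rollout:
            # treat it as "no leader yet" and keep polling.
            leader = None
        if leader:
            return leader
        if clock() >= deadline:
            return None
        sleep(interval_s)


def http_leader_fetcher(base_url: str) -> Callable[[], Optional[str]]:
    """Build a fetcher for wait_for_leader (hypothetical helper).

    ASSUMPTION: the leader is exposed at <base_url>/v1/leader as a JSON
    object with a "Name" field. Verify both before relying on this.
    """
    def fetch() -> Optional[str]:
        with urllib.request.urlopen(f"{base_url}/v1/leader", timeout=5) as resp:
            body = json.loads(resp.read().decode("utf-8"))
        return body.get("Name") if isinstance(body, dict) else None
    return fetch
```

Run after helm upgrade, something like wait_for_leader(http_leader_fetcher("http://localhost:8080")) would report whether the cluster recovered on its own; the URL here is only a placeholder for wherever the API is reachable.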

vcastellm commented 2 years ago

@FedeBev what about a liveness or readiness check?

FedeBev commented 2 years ago

@vcastellm in the issue about the Kubernetes discovery I mentioned before, you can find why I wasn't able to use those.

I don't see any other possible way right now. I guess we should change something about how dkron starts, or open a PR against the Kubernetes discovery library. In my opinion, the point is that dkron should be able to discover itself when the pod is running (/health returns 200), NOT when it is ready. This way the k8s discovery works and we can implement a useful /health and readiness probe.
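The running-versus-ready distinction can be illustrated with a toy model. This is only a sketch: the Pod type and the discoverable function are hypothetical and are not the discovery library's code. It assumes discovery currently returns only pods that are ready, and shows how selecting on "running" instead would let a freshly started pod be found before its readiness probe passes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pod:
    name: str
    phase: str   # Kubernetes pod phase, e.g. "Pending" or "Running"
    ready: bool  # True once the readiness probe passes


def discoverable(pods: List[Pod], require_ready: bool) -> List[str]:
    """Return the names of pods a discovery step would hand to the cluster join.

    require_ready=True models discovery that waits for readiness;
    require_ready=False models discovery that only needs the pod running.
    """
    return [
        p.name
        for p in pods
        if p.phase == "Running" and (p.ready or not require_ready)
    ]


pods = [
    Pod("dkron-0", "Running", True),
    Pod("dkron-1", "Running", False),  # started, readiness probe not yet passing
    Pod("dkron-2", "Pending", False),
]
ready_only = discoverable(pods, require_ready=True)
running_only = discoverable(pods, require_ready=False)
```

With the sample pods, the ready-only selection leaves out dkron-1 while the running-only selection includes it, which is the behaviour that would allow a readiness probe to depend on cluster membership without blocking discovery.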

vcastellm commented 2 years ago

Understood, thanks. I will take a look at how we can improve the /health endpoint.

FedeBev commented 2 years ago

Keep me posted, I'd be glad to help. Dkron could become a core component for a project I'm working on; however, our platform relies entirely on K8s and this is a big issue.

Espina2 commented 2 years ago

I'm also very interested in this fix. @vcastellm @FedeBev anything that I can do to make this happen? Also, is this not better than the currently available chart?

FedeBev commented 2 years ago

@Espina2 the chart is working on my cluster in a dev environment, but it's still very young and there's still a long way to go before being production ready. There's an issue with the k8s discovery that makes the dkron cluster unavailable due to master election during the rolling upgrade.

omarzouk commented 1 year ago

Hello! I'm also highly interested in this. Has anyone come up with a way to work around the discovery issue? Any thoughts on how to proceed?

vcastellm commented 1 year ago

Merged this PR and added some changes, I'll take over from here.