argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
17.91k stars 5.46k forks

Offer help w/ pods stuck in pending states `had taints` messages #5175

Open jsoref opened 3 years ago

jsoref commented 3 years ago

Summary

A user reported seeing:

node(s) had taints that the pod didn't tolerate

There's apparently enough information available in the system to help users: https://stackoverflow.com/questions/62991596/1-nodes-had-taints-that-the-pod-didnt-tolerate-in-kubernetes-cluster

Motivation

Proposal

I suspect that an extra tab next to Summary/Events/Logs, called something like Recommendations or Analysis, might be the right way to surface this.

Surfacing the taint information would definitely be good. Possibly also offering a way to get to the list of nodes (this is coming as a feature soonish). Possibly also showing, for each node, the taint values involved in the failure, so users can see why the nodes rejected the pod.
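To make the idea concrete, here is a minimal sketch of the kind of analysis such a tab could run: compare each node's taints with the pod's tolerations and list the taints nothing tolerates. It is not Argo CD code. The `Taint` and `Toleration` classes and the `tolerates`, `untolerated_taints`, and `explain` helpers are hypothetical names made up for illustration, and the matching rules follow my understanding of the Kubernetes taint/toleration semantics (only `NoSchedule` and `NoExecute` block scheduling; an empty toleration key with `Exists` matches every key; an empty toleration effect matches every effect).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Taint:
    """Hypothetical stand-in for a node taint (key, effect, optional value)."""
    key: str
    effect: str  # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    """Hypothetical stand-in for a pod toleration."""
    key: str = ""            # empty key is only meaningful with operator "Exists"
    operator: str = "Equal"  # "Equal" (the default) or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every taint effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    if toleration.operator in ("", "Equal"):
        return toleration.value == taint.value
    return False  # unknown operator: treat as not tolerating


def untolerated_taints(taints: List[Taint], tolerations: List[Toleration]) -> List[Taint]:
    """Taints that would keep the pod off the node.

    PreferNoSchedule is skipped because it is a soft preference and
    does not reject the pod outright.
    """
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return [t for t in blocking if not any(tolerates(tol, t) for tol in tolerations)]


def format_taint(taint: Taint) -> str:
    """Render a taint as key=value:effect, or key:effect when there is no value."""
    if taint.value:
        return f"{taint.key}={taint.value}:{taint.effect}"
    return f"{taint.key}:{taint.effect}"


def explain(nodes: Dict[str, List[Taint]], tolerations: List[Toleration]) -> Dict[str, List[str]]:
    """Map each rejecting node name to the taints the pod does not tolerate."""
    report = {}
    for name, taints in nodes.items():
        missing = untolerated_taints(taints, tolerations)
        if missing:
            report[name] = [format_taint(t) for t in missing]
    return report


if __name__ == "__main__":
    # Made-up node names and taints, purely for illustration.
    nodes = {
        "node-a": [Taint("node-role.kubernetes.io/master", "NoSchedule")],
        "node-b": [Taint("dedicated", "NoSchedule", "gpu")],
    }
    pod_tolerations = [Toleration(key="dedicated", value="gpu", effect="NoSchedule")]
    for node, reasons in explain(nodes, pod_tolerations).items():
        print(node, "rejected the pod because of:", ", ".join(reasons))
```

In the made-up example, only `node-a` is reported, since the pod tolerates the `dedicated=gpu:NoSchedule` taint on `node-b`. A real implementation would read the taints and tolerations from the cluster rather than from hand-built objects.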

jessesuen commented 3 years ago

This seems like it might be out of scope. Aren't the pod's events sufficient for debugging this? I think the Application timeline view of events would help here as an alternative to this suggestion.

Relevant: https://github.com/argoproj/argo-cd/issues/4902

jsoref commented 3 years ago

I'm not sure. This wasn't actually something we hit. It was reported in Argo Slack and I considered how I'd want help from Argo to address it.

I've personally found events to be incredibly hostile, for various reasons.