argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

UI section on top for a summary of resources with recent failures in the Events section #20590

Open andrii-korotkov-verkada opened 1 month ago

andrii-korotkov-verkada commented 1 month ago

Summary

There is unused space to the right of the Last Sync section. It could hold an easy-to-access summary of what is broken, i.e. resources whose Events tab shows recent failures with no successes after them.
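To make "recent failures and no successes after" concrete, here is a rough Python sketch of the check. It assumes that a `Warning` event counts as a failure and a `Normal` event counts as a success (matching the Kubernetes Event `type` values), and that "recent" means within a configurable window. The `Event` class, `has_unresolved_failure`, and the one-hour default are hypothetical names and values for illustration, not anything that exists in Argo CD.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified stand-in for a Kubernetes Event."""
    resource: str        # e.g. "Pod/my-app-abc"
    type: str            # "Normal" or "Warning"
    reason: str          # e.g. "BackOff"
    timestamp: datetime  # when the event was last seen


def has_unresolved_failure(events, now, window=timedelta(hours=1)):
    """Return True if the newest event inside the window is a Warning.

    Assumption: any Normal event after a Warning means the resource
    recovered, so only the most recent event decides the outcome.
    """
    recent = [e for e in events if now - e.timestamp <= window]
    if not recent:
        return False
    latest = max(recent, key=lambda e: e.timestamp)
    return latest.type == "Warning"
```

Treating every `Normal` event as a success is a simplification; a real implementation would probably need a per-reason mapping.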

Motivation

Users often want a summary of what's broken rather than an overall view with a lot of information they need to navigate through.

Proposal

Add a section listing the top resources with failure Events, each linking to that resource's Events page. This is somewhat similar to filtering by Degraded resources, but gives a more targeted and condensed view. It should group similar failures together: instead of listing each pod with failure Events, it can list the failure pattern itself and then say something like "50 pods of service A have this", with example links.
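The grouping could look roughly like the Python sketch below. It assumes a "failure pattern" is the combination of owning service, resource kind, and event reason; that key, the `FailureEvent` class, and the `summarize` function are hypothetical choices for illustration. Example resource names stand in for the links, since the link format is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureEvent:
    """Hypothetical, simplified failure event for one resource."""
    kind: str    # e.g. "Pod"
    name: str    # e.g. "service-a-00"
    owner: str   # owning service or workload (assumed to be known)
    reason: str  # event reason, e.g. "BackOff"


def summarize(events, max_examples=3):
    """Group failure events by (owner, kind, reason), largest group first."""
    groups = defaultdict(set)
    for e in events:
        # A set, so a resource that emits the same event twice counts once.
        groups[(e.owner, e.kind, e.reason)].add(e.name)

    rows = [
        {
            "owner": owner,
            "kind": kind,
            "reason": reason,
            "count": len(names),
            "examples": sorted(names)[:max_examples],
        }
        for (owner, kind, reason), names in groups.items()
    ]
    # Biggest groups first; ties are broken alphabetically for stable output.
    rows.sort(key=lambda r: (-r["count"], r["owner"], r["reason"]))
    return rows
```

With 50 pods of `service-a` failing for the same reason, this yields a single row with `count` 50 and three example names, which the UI could render as "50 pods of service A have this".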

crenshaw-dev commented 1 month ago

Some prior art: https://github.com/argoproj/argo-cd/pull/9022

crenshaw-dev commented 1 month ago

This feature would help surface problems with Ingress objects. The status field (and therefore the resource health) isn't very helpful, but events are. So anything that makes failure events more prominent will be helpful.