k8sgpt-ai / k8sgpt

Giving Kubernetes Superpowers to everyone
http://k8sgpt.ai
Apache License 2.0

[Feature]: Constructing a Resource Graph in Kubernetes and Transmitting Information Including Subgraphs Near Problematic Nodes to LLM #1055

Open kimchaeri opened 7 months ago

kimchaeri commented 7 months ago

Checklist

Is this feature request related to a problem?

No

Problem Description

Currently, K8sGPT reports only the cause of an error and potential solutions for the affected component itself, so its responses are sometimes inaccurate. If we model the Kubernetes cluster as a graph and give the LLM information about the resources surrounding the nodes where errors occur, it can provide more accurate answers. In simple tests, I confirmed that supplying the subgraph as context leads the LLM to give more accurate answers.

Solution Description

  1. Construct the Kubernetes cluster as a graph to integrate with the K8s analyzer
  2. Extract the subgraph near error-prone nodes to provide context to the LLM
  3. Visualize the subgraph near error-prone nodes for error analysis
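
As a rough illustration of steps 1 and 2, here is a minimal Python sketch (not K8sGPT code). It assumes the relationships between resources have already been collected as `(source, target)` pairs; the resource names, `EDGES`, and the helper functions are all hypothetical. It builds an adjacency map, then uses a breadth-first search to collect everything within a few hops of the failing resource, which is the part that would be serialized and sent to the LLM.

```python
from collections import deque

# Hypothetical relationships between resources, written as "Kind/name".
EDGES = [
    ("Ingress/web", "Service/web"),
    ("Service/web", "Deployment/web"),
    ("Deployment/web", "ReplicaSet/web-1"),
    ("ReplicaSet/web-1", "Pod/web-1-a"),
    ("Pod/web-1-a", "ConfigMap/web-config"),
    ("Pod/web-1-a", "Node/worker-1"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)
    return graph


def neighborhood(graph, start, max_hops):
    """Return {resource: hop distance} for everything within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen


def subgraph_edges(graph, nodes):
    """Return the sorted edges whose endpoints both lie inside nodes."""
    return sorted(
        {tuple(sorted((a, b))) for a in nodes for b in graph.get(a, ()) if b in nodes}
    )


if __name__ == "__main__":
    graph = build_graph(EDGES)
    nearby = neighborhood(graph, "Pod/web-1-a", max_hops=1)
    # Lines like these could be appended to the prompt as extra context.
    for a, b in subgraph_edges(graph, nearby):
        print(f"{a} <-> {b}")
```

The hop limit is the main knob here: a larger value gives the LLM more context but also a longer prompt, so the right setting would need experimentation.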

Benefits

Potential Drawbacks

No response

Additional Information

No response

AlexsJones commented 7 months ago

This is a really interesting concept. I think it could be a powerful feature.

qdrddr commented 7 months ago

You could build a graph based on the installed Helm charts, or by using GitOps tools such as ArgoCD or FluxCD. FYI @kimchaeri

kimchaeri commented 7 months ago

> You could build a graph based on the installed Helm charts, or by using GitOps tools such as ArgoCD or FluxCD. FYI @kimchaeri

Oh, thank you for sharing that with me!

vedant-8680 commented 7 months ago

https://github.com/benc-uk/kubeview Try using this tool for generating the graph. It builds something similar to what ArgoCD or FluxCD does for Kubernetes resources. @kimchaeri @qdrddr

arbreezy commented 7 months ago

> Extract the subgraph near error-prone nodes to provide context to the LLM

I think this looks like a clever approach.

I believe we also want to distinguish ownership from selectors and labels when we build a graph, or more generally whenever we create relationships between K8s resources.

Ownership via metadata.ownerReferences is a good start, but I think labels can build wider relationships at the workload level, so we can contextualize the errors that we pass to LLMs.

For example, a workload X may consist of an Ingress, a Service, a Deployment, and a CronJob, and K8sGPT would generate error messages specific to this workload.
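
To make the distinction concrete, here is a small Python sketch that keeps the two kinds of relationship as separately typed edges. It assumes resources are available as plain dicts shaped like their manifests (`kind`, `metadata.uid`, `metadata.ownerReferences`, `metadata.labels`, and a Service's `spec.selector`). The sample data, the helper names, and the edge labels `owns` and `selects` are made up for illustration, and only Service-to-Pod selectors are handled.

```python
# Hypothetical sample workload, shaped like trimmed-down manifests.
WORKLOAD = [
    {"kind": "Deployment",
     "metadata": {"name": "web", "namespace": "default", "uid": "d1"}},
    {"kind": "ReplicaSet",
     "metadata": {"name": "web-1", "namespace": "default", "uid": "r1",
                  "ownerReferences": [{"kind": "Deployment", "uid": "d1"}]}},
    {"kind": "Pod",
     "metadata": {"name": "web-1-a", "namespace": "default", "uid": "p1",
                  "labels": {"app": "web"},
                  "ownerReferences": [{"kind": "ReplicaSet", "uid": "r1"}]}},
    {"kind": "Service",
     "metadata": {"name": "web", "namespace": "default", "uid": "s1"},
     "spec": {"selector": {"app": "web"}}},
]


def key(resource):
    """Short display key such as 'Pod/web-1-a'."""
    return f'{resource["kind"]}/{resource["metadata"]["name"]}'


def owner_edges(resources):
    """Edges (owner, owned, 'owns') derived from metadata.ownerReferences."""
    by_uid = {r["metadata"]["uid"]: r for r in resources}
    edges = []
    for r in resources:
        for ref in r["metadata"].get("ownerReferences", []):
            owner = by_uid.get(ref["uid"])
            if owner is not None:
                edges.append((key(owner), key(r), "owns"))
    return edges


def selector_edges(resources):
    """Edges (service, pod, 'selects') where the Service selector matches Pod labels."""
    pods = [r for r in resources if r["kind"] == "Pod"]
    edges = []
    for svc in resources:
        if svc["kind"] != "Service":
            continue
        selector = svc.get("spec", {}).get("selector") or {}
        if not selector:
            continue  # a Service without a selector matches no Pods here
        for pod in pods:
            labels = pod["metadata"].get("labels", {})
            same_ns = pod["metadata"].get("namespace") == svc["metadata"].get("namespace")
            if same_ns and all(labels.get(k) == v for k, v in selector.items()):
                edges.append((key(svc), key(pod), "selects"))
    return edges


if __name__ == "__main__":
    for edge in owner_edges(WORKLOAD) + selector_edges(WORKLOAD):
        print(edge)
```

Keeping the edge type on each edge means a later step can choose to follow only ownership, only selectors, or both when it collects context for an error.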

> Construct the Kubernetes cluster as a graph to integrate with the K8s analyzer

I think an integration with another OSS tool that provides this capability would be a great start.

miguelvr commented 6 months ago

I was about to open a similar issue, and found this one.

It makes perfect sense to take into account resource ownership by leveraging a tool like https://github.com/ahmetb/kubectl-tree

Later on, we can include common labels or annotations, like the ones used by ArgoCD or Helm, to group applications.
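
A minimal Python sketch of that grouping step might look like the following. It assumes resources are plain dicts shaped like their manifests, and it groups on `app.kubernetes.io/instance`, one of the Kubernetes recommended labels. Whether a given chart or ArgoCD setup actually sets that key is an assumption, so the label is a parameter; the sample data and function name are hypothetical.

```python
from collections import defaultdict

# Assumption: group on a Kubernetes recommended label; real clusters may use another key.
GROUP_LABEL = "app.kubernetes.io/instance"

# Hypothetical sample resources.
LABELED = [
    {"kind": "Deployment",
     "metadata": {"name": "api", "labels": {GROUP_LABEL: "shop"}}},
    {"kind": "Service",
     "metadata": {"name": "api", "labels": {GROUP_LABEL: "shop"}}},
    {"kind": "CronJob",
     "metadata": {"name": "report", "labels": {GROUP_LABEL: "billing"}}},
    {"kind": "ConfigMap",
     "metadata": {"name": "misc"}},
]


def group_by_label(resources, label=GROUP_LABEL):
    """Map each label value to the resources carrying it; unlabeled ones go under None."""
    groups = defaultdict(list)
    for r in resources:
        meta = r["metadata"]
        value = meta.get("labels", {}).get(label)
        groups[value].append(f'{r["kind"]}/{meta["name"]}')
    return dict(groups)


if __name__ == "__main__":
    for app, members in group_by_label(LABELED).items():
        print(app, members)
```

Each group could then be treated as one application when deciding which resources to include as context for an error.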