Closed mattdurham closed 1 year ago
I'm not against this, but I'm not sure if it's possible, so it needs investigation.
Last I checked, Kubelets have an undocumented API, which implies (to me) that there are no guarantees that endpoints will work between releases, or even return the same data.
It is undocumented but looks stable-ish, so there are real concerns there. https://github.com/cyberark/kubeletctl has wrappers around the API and could be reviewed to see how unstable it actually is.
I propose we have a component that mirrors the Kubernetes API and allows grabbing that data.
What data do we want to grab from Kubelets? I think the only thing they'll have available on them is a list of running pods.
Pods are the most interesting data and the best first case, though https://github.com/cyberark/kubeletctl/blob/master/API_TABLE.md lists other things that we might find value in.
Thanks, that table is helpful.
I did some investigation and found the following:

The `/pods` endpoint returns the list of pods (also listed in the API table link above). This is probably where we want to start; not much else looks immediately useful. We could also update `loki.source.kubernetes` and `loki.source.podlogs` to talk directly to Kubelets for logs, rather than going through the API server proxy like they do today. There doesn't appear to be a way to watch for changes from the `/pods` API endpoint, so we would have to poll to get changes over time. There are still a few things I don't know.
Minus the open questions above, it looks like this component is theoretically possible.
The datadog-agent implements similar functionality for querying the Kubelet API in pkg/util/kubernetes/kubelet/kubelet_client.go. Perhaps something can be learned from that use case.
Not directly related to this issue, but another potentially interesting way to avoid incurring extra load to the K8S API is based on clustering and the component scheduling idea outlined in #3151.
So a subset of the cluster nodes would schedule and run a 'distributor' component that is responsible for a) running the Prometheus service discovery by reaching out to the K8S API and b) partitioning the information per node. The rest of the peers could look up which node runs that 'distributor' component and retrieve the information about resources running on the same node as them.
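The per-node partitioning step above could be as simple as deterministic hashing, so every peer independently agrees on which agent owns which node's targets. This is a hypothetical sketch, not the agent's actual clustering code; `ownerOf` and the peer names are made up for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ownerOf deterministically maps a Kubernetes node name to one of the
// cluster peers, so all peers compute the same assignment without
// coordinating. (Real clustering would use consistent hashing to limit
// reshuffling when peers join or leave.)
func ownerOf(node string, peers []string) string {
	h := fnv.New32a()
	h.Write([]byte(node))
	return peers[h.Sum32()%uint32(len(peers))]
}

func main() {
	peers := []string{"agent-0", "agent-1", "agent-2"}
	for _, node := range []string{"node-a", "node-b", "node-c"} {
		fmt.Printf("%s -> %s\n", node, ownerOf(node, peers))
	}
}
```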
Hey, this is something we'd find very useful! In our environments we would prefer a model where the grafana-agent daemonset scrapes pods running on the localhost; using the existing kubernetes discovery method with selectors on the daemonset would put significant load on the API server. To get the ball rolling on this and spark discussion, I opened a draft implementation for a new discovery.kubelet feature, largely inspired by the Pod discoverer in Prometheus, in https://github.com/grafana/agent/pull/4255
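For context, a configuration using the component from that PR might look roughly like the fragment below. This is a sketch, not authoritative documentation: the attribute and export names follow the `discovery.kubelet` component as merged, but should be checked against the docs for your agent version, and the `prometheus.remote_write.default` block it forwards to is assumed to be defined elsewhere.

```river
// Discover pods from the local node's kubelet instead of the API server.
discovery.kubelet "local" {
  // Kubelet endpoint on the node the agent runs on.
  url = "https://localhost:10250"
}

// Scrape the locally discovered pods.
prometheus.scrape "pods" {
  targets    = discovery.kubelet.local.targets
  forward_to = [prometheus.remote_write.default.receiver]
}
```

Run as a daemonset, each agent only talks to its own node's kubelet, avoiding the API-server load discussed above.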
Since #4255 is merged, I'm going to close this as completed :)
Background
Constantly hitting the API can put significant strain on the Kubernetes API server when the local kubelet instance already has much of the data needed.
Proposal
I propose we have a component that mirrors the Kubernetes API and allows grabbing that data. There are some concerns over the stability of the API, but we can likely limit or put guardrails on what it works with and doesn't.