grafana / agent

Vendor-neutral programmable observability pipelines.
https://grafana.com/docs/agent/
Apache License 2.0

Add a component that uses the local kubelet API instead of the Kubernetes API #4233

Closed mattdurham closed 1 year ago

mattdurham commented 1 year ago

Background

Hitting the API server constantly can put significant strain on the Kubernetes API, even though the local kubelet instance already has much of the data needed.

Proposal

I propose we add a component that mirrors the Kubernetes API and allows grabbing that data from the local kubelet. There are some concerns over the stability of the kubelet API, but we can likely limit or put guardrails around what it works with and what it doesn't.
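
As a rough illustration (not part of the proposal itself), reading pod data from the node-local kubelet rather than the API server could look something like the Go sketch below. The port, endpoint path, and token location are assumed kubelet defaults, not an agreed design:

```go
// Hypothetical sketch: list pods from the node-local kubelet instead of the
// Kubernetes API server. Assumes the default authenticated kubelet port
// (10250) and the in-cluster service account token; not part of any agent
// component yet.
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	token, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/token")
	if err != nil {
		panic(err)
	}

	client := &http.Client{
		// The kubelet's serving certificate is often not signed by a CA the
		// pod trusts, so this sketch skips verification. A real component
		// would verify against the cluster CA instead.
		Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
	}

	req, err := http.NewRequest("GET", "https://localhost:10250/pods", nil)
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer "+string(token))

	resp, err := client.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(string(body)) // a v1.PodList encoded as JSON
}
```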

rfratto commented 1 year ago

I'm not against this, but I'm not sure if it's possible, so it needs investigation.

Last I checked, kubelets have an undocumented API, which implies (to me) that there are no guarantees that endpoints will keep working between releases, or even keep returning the same data.

mattdurham commented 1 year ago

It is undocumented but looks stable-ish, though there are real concerns there. https://github.com/cyberark/kubeletctl seems to have wrappers around the API and can likely be reviewed to see how unstable it actually is.

rfratto commented 1 year ago

> I propose we add a component that mirrors the Kubernetes API and allows grabbing that data from the local kubelet.

What data do we want to grab from Kubelets? I think the only thing they'll have available on them is a list of running pods.

mattdurham commented 1 year ago

Pods are the most interesting data and the best first case, though https://github.com/cyberark/kubeletctl/blob/master/API_TABLE.md lists other things that we might find value in.

rfratto commented 1 year ago

Thanks, that table is helpful.

I did some investigation and found the following:

Things I still don't know:

Setting aside the open questions above, it looks like this component is theoretically possible.

mikemykhaylov commented 1 year ago

datadog-agent implements similar functionality for querying the kubelet API in pkg/util/kubernetes/kubelet/kubelet_client.go. Perhaps something can be learned from this use case.

tpaschalis commented 1 year ago

Not directly related to this issue, but another potentially interesting way to avoid incurring extra load on the K8s API is based on clustering and the component scheduling idea outlined in #3151.

A subset of the cluster nodes would schedule and run a 'distributor' component that is responsible for a) running Prometheus service discovery by reaching out to the K8s API, and b) partitioning the discovered information per node. The rest of the peers could look up which node runs that 'distributor' component and retrieve the information about resources running on the same node as them.
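
A minimal Go sketch of the partitioning half of that idea, assuming targets carry the __meta_kubernetes_pod_node_name label that Prometheus Kubernetes SD attaches (the names here are illustrative, not an agreed design):

```go
// Hypothetical sketch of per-node partitioning for the 'distributor' idea:
// group targets discovered from the API server by the node they run on, so
// each cluster peer only retrieves its own node's slice.
package main

import "fmt"

// Target mimics a discovered scrape target as a flat label set, e.g. with
// the __meta_kubernetes_pod_node_name label set by Prometheus Kubernetes SD.
type Target map[string]string

func partitionByNode(targets []Target) map[string][]Target {
	byNode := make(map[string][]Target)
	for _, t := range targets {
		node := t["__meta_kubernetes_pod_node_name"]
		byNode[node] = append(byNode[node], t)
	}
	return byNode
}

func main() {
	targets := []Target{
		{"__address__": "10.0.0.5:8080", "__meta_kubernetes_pod_node_name": "node-a"},
		{"__address__": "10.0.0.9:8080", "__meta_kubernetes_pod_node_name": "node-b"},
	}
	for node, ts := range partitionByNode(targets) {
		fmt.Println(node, "->", ts)
	}
}
```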

gcampbell12 commented 1 year ago

Hey, this is something we'd find very useful! In our environments we would prefer a model where the grafana-agent DaemonSet scrapes pods running on the local host; using the existing Kubernetes discovery method with selectors on the DaemonSet would put significant load on the API server. To get the ball rolling on this and spark discussion, I opened a draft implementation of a new discovery.kubelet feature, largely inspired by the Pod discoverer in Prometheus, in https://github.com/grafana/agent/pull/4255.
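
For context, a high-level Go sketch of what such a discoverer does: turning a kubelet pod list into per-container scrape targets. This is an illustration of the approach, not the code in #4255, and the label names are only the common Prometheus SD ones:

```go
// Illustrative sketch (not the #4255 implementation): build scrape targets
// from a kubelet pod list, roughly mirroring what the Prometheus Pod
// discoverer does against the API server.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// Target is a flat label set, like Prometheus service discovery produces.
type Target map[string]string

func podTargets(pods *corev1.PodList) []Target {
	var targets []Target
	for _, pod := range pods.Items {
		for _, c := range pod.Spec.Containers {
			for _, port := range c.Ports {
				targets = append(targets, Target{
					"__address__":                          fmt.Sprintf("%s:%d", pod.Status.PodIP, port.ContainerPort),
					"__meta_kubernetes_namespace":          pod.Namespace,
					"__meta_kubernetes_pod_name":           pod.Name,
					"__meta_kubernetes_pod_container_name": c.Name,
				})
			}
		}
	}
	return targets
}

func main() {
	// In the real component the PodList would come from the local kubelet's
	// /pods endpoint rather than from the API server.
	fmt.Println(podTargets(&corev1.PodList{}))
}
```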

rfratto commented 1 year ago

Since #4255 is merged, I'm going to close this as completed :)