jlind23 opened this issue 3 months ago
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
@jlind23 in your debugging, what was the workflow? As in, once you found which agent was the nominated leader, what were the next steps you were trying to perform? I'm trying to figure out where this information should reside. Perhaps it can just be part of the local metadata and not something that the UI needs to show.
We wanted to understand why some of the data exclusively collected by the leader wasn't being sent, hence why we were looking for the leader.
@blakerouse @ycombinator we would need to report it through the Elastic Agent metadata; would you happen to know how complex this would be? Once reported, we can do whatever we want with it in the UI.
With the Helm Chart, do we actually use leader election?
According to @pkoutsovasilis' demo I don't think we do.
@jlind23 The leader election is its own provider and not something that has any connection to Fleet or to updating the overall core state of the Elastic Agent. It would be difficult to connect the two, so it's not a quick change.
It should be possible to see that the `state_*` units are only running on the Elastic Agent on which leader election has been won.
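For reference, this is roughly how the standalone Kubernetes manifests gate those datasets on the leader; the input/stream layout below is illustrative, but the `${kubernetes_leaderelection.leader}` condition is the documented mechanism:

```yaml
# The kubernetes_leaderelection provider exposes a boolean that is
# true only on the agent currently holding the leader lease, so the
# cluster-scope state_* metricsets run on exactly one agent.
inputs:
  - id: kubernetes/metrics-kube-state-metrics
    type: kubernetes/metrics
    streams:
      - metricsets:
          - state_pod
        hosts:
          - 'kube-state-metrics:8080'
        condition: ${kubernetes_leaderelection.leader} == true
```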
Hi 👋 So the Helm chart disables leader election for the built-in kubernetes integration in standalone mode, because it deploys multiple agents: a DaemonSet for node-scope metrics and container logs, a Deployment for cluster-scope metrics, and a StatefulSet with a kube-state-metrics container alongside the agent to monitor kube-state-metrics. With that topology there is no need for leader election. On the contrary, the same "topology" isn't possible for agents managed through Fleet, since the config is then controlled by the latter, so in that scenario the Helm chart doesn't disable it.
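In standalone mode that boils down to switching the provider off; a minimal sketch, assuming the `enabled` setting of the provider config (the Helm chart's exact mechanism may differ):

```yaml
# elastic-agent.yml (standalone): disable the leader election provider
# so this agent never competes for the leader lease.
providers.kubernetes_leaderelection:
  enabled: false
```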
Thinking out loud: since an agent instance knows whether it is the leader or not, and when it won/lost the election, can't this be propagated to Kibana?!
> Hi 👋 So the Helm chart disables leader election for the built-in kubernetes integration in standalone mode, because it deploys multiple agents: a DaemonSet for node-scope metrics and container logs, a Deployment for cluster-scope metrics, and a StatefulSet with a kube-state-metrics container alongside the agent to monitor kube-state-metrics. With that topology there is no need for leader election. On the contrary, the same "topology" isn't possible for agents managed through Fleet, since the config is then controlled by the latter, so in that scenario the Helm chart doesn't disable it.
It is actually possible to have the Elastic Agent deployed as a Deployment with kube-state-metrics and enrolled into Fleet, if that Elastic Agent is enrolled into a custom policy that only enables `state_*` metrics. Another option would be to set an ENV variable on the container and then add a condition on the integration for that ENV variable, so that only the container with that ENV variable set would run the `state_*` metrics.
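A sketch of that ENV-based option; the variable name `STATE_AGENT` and the stream layout are invented for illustration, with the agent's `env` provider supplying the `${env.*}` value:

```yaml
# On the Deployment that runs next to kube-state-metrics, set
# STATE_AGENT=true on the agent container. The integration's streams
# are then gated on that variable through the env provider:
inputs:
  - id: kubernetes/metrics-state
    type: kubernetes/metrics
    streams:
      - metricsets:
          - state_pod
        hosts:
          - 'kube-state-metrics:8080'
        # runs only where the container sets STATE_AGENT=true
        condition: ${env.STATE_AGENT} == "true"
```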
Just want to make it clear that it is possible, but with the way the integration and the manifests are currently designed, it doesn't operate that way.
This is not a limitation of the Elastic Agent, it's just a limitation of how the manifests and integrations have been designed.
> Thinking out loud: since an agent instance knows whether it is the leader or not, and when it won/lost the election, can't this be propagated to Kibana?!
It is absolutely possible, but not something that is directly wired into the Elastic Agent currently. If we wanted to add this information to Kibana, it might be better to add extra information from other providers as well. Possibly each provider could publish a status (just like components). That would also allow, say, the kubernetes provider in a non-Kubernetes environment to report that it's not running because it's unable to connect.
I think that also brings up the ability to configure providers in Fleet. Possibly this just highlights that we should make providers a top-level thing in Fleet.
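To make the idea concrete, a purely hypothetical sketch (nothing like this exists today, and all field names are invented) of providers reporting a status the way components do:

```yaml
# Hypothetical provider section in agent status output.
providers:
  - name: kubernetes_leaderelection
    state: HEALTHY
    message: "holding lease elastic-agent-cluster-leader"
  - name: kubernetes
    state: DEGRADED
    message: "unable to connect to the Kubernetes API"
```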
> It is actually possible to have the Elastic Agent deployed as a Deployment with kube-state-metrics and enrolled into Fleet, if that Elastic Agent is enrolled into a custom policy that only enables `state_*` metrics. Another option would be to set an ENV variable on the container and then add a condition on the integration for that ENV variable, so that only the container with that ENV variable set would run the `state_*` metrics.
yep, I have done such an enrollment, so it is possible; however, somebody can enable other metrics in the integration, which might result in undesired effects, and there is no way to limit that, at least as far as I can tell.
> Just want to make it clear that it is possible, but with the way the integration and the manifests are currently designed, it doesn't operate that way.
yep, 100% agree. The reason we took that decision with the Helm chart (not disabling leader election for managed mode) wasn't a limitation of the Agent but rather how an integration, at least as of now, gets applied holistically.
> It is absolutely possible, but not something that is directly wired into the Elastic Agent currently. If we wanted to add this information to Kibana, it might be better to add extra information from other providers as well. Possibly each provider could publish a status (just like components). That would also allow, say, the kubernetes provider in a non-Kubernetes environment to report that it's not running because it's unable to connect.
>
> I think that also brings up the ability to configure providers in Fleet. Possibly this just highlights that we should make providers a top-level thing in Fleet.
yep, being able to configure providers in Fleet and to expose them, like components, with a status does sound like a helpful addition to explore.
> It is absolutely possible, but not something that is directly wired into the Elastic Agent currently. If we wanted to add this information to Kibana, it might be better to add extra information from other providers as well. Possibly each provider could publish a status (just like components). That would also allow, say, the kubernetes provider in a non-Kubernetes environment to report that it's not running because it's unable to connect.
>
> I think that also brings up the ability to configure providers in Fleet. Possibly this just highlights that we should make providers a top-level thing in Fleet.
I am leaning towards updating this issue to focus on each provider and making sure they are returning the right set of information.
> I am leaning towards updating this issue to focus on each provider and making sure they are returning the right set of information.
This would actually be more in line with OTel as well, as each extension can also report a status. That alignment will help the transition over time.
While working on some kubernetes issues, we were stuck trying to figure out who the leaders were. As of today, the only option is to run the following command:
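Something along these lines, assuming the default lease name used by the `kubernetes_leaderelection` provider and agents deployed in `kube-system`:

```sh
# Inspect the leader election lease; the Holder Identity field
# identifies the agent pod currently elected as leader.
# (assumes the default lease name and the namespace the agents run in)
kubectl describe lease elastic-agent-cluster-leader -n kube-system
```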
In order to ease debugging, it would be great to bubble up this information somewhere in the Kibana UI, in order to know:

- which Elastic Agent is the current leader
- when it won or lost the election
This brought up a broader discussion of what information each provider should return and make available.
@nimarezainia @strawgate happy to get your thoughts on this.
cc @ycombinator @blakerouse as you recently worked on similar cases.