Closed cmacknz closed 9 months ago
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
Would it be possible for us to do this via an environment var instead of mounting a config?
The most specific change we could make would be to introduce a variable listing the providers to disable, since they are all enabled by default. Something like ELASTIC_AGENT_DISABLE_PROVIDERS that takes a comma-separated list of providers to disable.
That would only fix this for providers and wouldn't cover any other cases where we'd want to modify the configuration (the initial logging level for one example). We could do both what is described in this issue and add an environment variable as a convenience.
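As a rough sketch of the environment variable idea above, the value could be parsed as follows. ELASTIC_AGENT_DISABLE_PROVIDERS is only the name proposed in this thread, not an existing agent setting, and parseDisabledProviders is a hypothetical helper, not a function in the agent code base.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseDisabledProviders splits a comma-separated value into provider names,
// trimming surrounding whitespace and dropping empty entries.
// Hypothetical helper for illustration only.
func parseDisabledProviders(raw string) []string {
	var names []string
	for _, part := range strings.Split(raw, ",") {
		if name := strings.TrimSpace(part); name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	// The variable name is the one proposed in this thread (an assumption).
	raw := os.Getenv("ELASTIC_AGENT_DISABLE_PROVIDERS")
	fmt.Println(parseDisabledProviders(raw))
}
```

Trimming and skipping empty entries keeps values such as "host, env," from producing blank provider names.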
Something like https://github.com/elastic/elastic-agent/issues/3609 to only run providers that exist in the policy sounds nice, but integrations that use leader election would then have to support disabling it in the inputs they use.
The ability to disable providers is important for our team. Another use case we have is enabling traces (agent.monitoring.traces) to ship the agent metrics to custom APM instances.
Thank you @cmacknz
cc @eyalkraft
What we could do is change this to only attempt to replace the file if it doesn't already contain fleet.enabled: true (or any other key that isn't commented out in our default fleet configuration).
This would allow overriding the initial contents of the elastic-agent.yml contents in a container in general, regardless of if those settings are available in Fleet.
@cmacknz While this will work for now, I'm not sure it's a sufficient solution in our case.
The enabled by default nature of providers could be problematic for agentless. Currently speaking, we don't want/need any provider enabled, and we don't want any future provider to be implicitly enabled.
Trying to maintain a comprehensive, up-to-date list of the providers in order to disable them like we do here is less than ideal (funny enough, @olegsu and I just noticed we were missing the env provider that you folks recommended we disable (issue)).
What are your thoughts about a configuration option to change this default behavior of the providers?
providers_default_disable: which is false when not specified, ending up with the current enabled by default behavior, and when true will only activate providers which are explicitly configured.
What are your thoughts about a configuration option to change this default behavior of the providers? providers_default_disable: which is false when not specified, ending up with the current enabled by default behavior, And when true will only activate providers which are explicitly configured.
Something like this makes sense to address the maintenance concern you are raising. The solution in https://github.com/elastic/elastic-agent/issues/3609 is overall nicer in that it gets rid of enabling and disabling entirely, but it would be significantly more work.
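To make the proposed semantics concrete, here is a minimal sketch of the decision it implies. providers_default_disable is the option name suggested above and does not exist in the agent today; providerEnabled and the explicit map are hypothetical and assume each explicitly configured provider carries its own enabled value.

```go
package main

import "fmt"

// providerEnabled reports whether a provider should run.
// explicit holds the enabled setting of every provider that appears in the
// configuration; defaultDisable models the proposed providers_default_disable.
// Hypothetical helper for illustration only.
func providerEnabled(name string, explicit map[string]bool, defaultDisable bool) bool {
	if enabled, ok := explicit[name]; ok {
		// An explicit setting always wins over the default.
		return enabled
	}
	// Unconfigured providers follow the default: enabled unless the flag is set.
	return !defaultDisable
}

func main() {
	explicit := map[string]bool{"kubernetes": true}
	for _, name := range []string{"kubernetes", "host", "env"} {
		fmt.Println(name, providerEnabled(name, explicit, true))
	}
}
```

With the flag unset, the function reproduces the current enabled by default behavior, so a new provider added later would only be picked up implicitly when the flag is false.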
Currently speaking, we don't want/need any provider enabled, and we don't want any future provider to be implicitly enabled.
Taking this into account, https://github.com/elastic/elastic-agent/issues/3609 wouldn't work because it wouldn't stop an integration from accidentally enabling a provider by referencing data it populates in the policy.
So adding a flag to unconditionally disable providers makes sense to me as the best path forward.
Something like #3609 to only run providers that exist in the policy sounds nice, but integrations that use leader election would then have to support disabling it in the inputs they use.
@cmacknz Can you explain what you meant here more? If we solved #3609 would it break leader election?
I made a bit of an assumption that in an agentless deployment we want things like the host provider disabled permanently with no way to turn it on, because it will leak information about the machine hosting the agent. This possibly has security implications, and even if it didn't, it leaks implementation details back to the user.
Recently there have been several situations where users have needed to disable the leader election provider. Providers in agent cannot be configured in Fleet today, and even if they could be, changes to them require a restart of the agent to take effect, which also isn't supported through Fleet. The providers are initialized when the composable controller is created at startup:
https://github.com/elastic/elastic-agent/blob/9d25f79b159744801daefca36ae227674e28d920/internal/pkg/composable/controller.go#L52-L71
It is possible to disable providers by editing the elastic-agent.yml file read by the agent container when it starts, which on Kubernetes is most easily accomplished by mounting the file as a ConfigMap. Essentially the process is:
A simplified example follows:
The problem we hit is that we try to unconditionally replace the local agent configuration (in this case a bind mounted ConfigMap) with our default Fleet configuration. This is just an empty configuration that sets fleet.enabled: true.
We unconditionally try to rotate the file during enrollment, which happens every time the agent container starts when the state path isn't persisted outside of the container file system. This happens in: https://github.com/elastic/elastic-agent/blob/9d25f79b159744801daefca36ae227674e28d920/internal/pkg/agent/cmd/enroll_cmd.go#L173-L178
The code path that does this continues with the SafeFileRotate call, which is what fails for the bind mounted ConfigMap: https://github.com/elastic/elastic-agent/blob/9d25f79b159744801daefca36ae227674e28d920/internal/pkg/agent/storage/replace_store.go#L59-L75
What we could do is change this to only attempt to replace the file if it doesn't already contain fleet.enabled: true (or any other key that isn't commented out in our default fleet configuration).
This would allow overriding the initial contents of elastic-agent.yml in a container in general, regardless of whether those settings are available in Fleet.
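The proposed check could look roughly like the sketch below. It assumes the existing elastic-agent.yml has already been decoded into a nested map by a YAML library; fleetEnabled and shouldReplaceConfig are hypothetical names and are not part of the enroll or storage code linked above.

```go
package main

import "fmt"

// fleetEnabled reports whether the decoded configuration sets
// fleet.enabled to true. Hypothetical helper for illustration only.
func fleetEnabled(cfg map[string]any) bool {
	fleet, ok := cfg["fleet"].(map[string]any)
	if !ok {
		return false
	}
	enabled, ok := fleet["enabled"].(bool)
	return ok && enabled
}

// shouldReplaceConfig returns true only when the file on disk does not
// already enable Fleet, so a mounted configuration that does is left alone.
func shouldReplaceConfig(cfg map[string]any) bool {
	return !fleetEnabled(cfg)
}

func main() {
	mounted := map[string]any{
		"fleet": map[string]any{"enabled": true},
	}
	fmt.Println(shouldReplaceConfig(mounted))
}
```

Skipping the rotation in this case would avoid the failing rename on the bind mounted file while leaving the behavior for a fresh, standalone configuration unchanged.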
In the case of providers, even if we did allow disabling leaderelection in the UI it would still be enabled until the agent receives the first policy change from Fleet, so disabling it in the initial configuration like this is likely the preferred route.