[Fleet] Add ability to enable and configure HTTP Monitoring

pierrehilbert commented 1 year ago

Describe the feature: When Elastic Agent is enrolled into Fleet, we can no longer configure the agent.monitoring setting because it is part of the elastic-agent.yml file, which is only taken into account when the Agent is enrolled. In the past, we could still configure it in the fleet.yml file, but that file is now encrypted, so this is no longer possible.

This issue follows SDH https://github.com/elastic/sdh-beats/issues/3168

Requirements

In the agent policy settings page, under the Agent Monitoring section

[ ] Provide an expandable "Advanced Settings" section
[ ] This section should allow the Agent Monitoring parameters to be configured ONLY if collection of either agent logs or metrics is enabled (which is the default).

Something to this effect:

For reference the full configuration options are:

elasticmachine commented 1 year ago

Pinging @elastic/fleet (Team:Fleet)

jen-huang commented 1 year ago

cc @nimarezainia - would like to have your input on the priority of exposing advanced agent.monitoring settings through the UI. Today we just have a simple toggle:

This is converted to the following in agent yaml:

agent:
  monitoring:
    enabled: true
    use_output: default
    namespace: default
    logs: true
    metrics: true

pierrehilbert commented 1 year ago

To give more context, this is a problem for APM, but we have a "workaround": re-enroll the Agent into Fleet.

nimarezainia commented 1 year ago

Re-enrolling the agent is never an acceptable solution. @jen-huang we should address this, but I'm not sure it's that urgent. I'll let you place it in the appropriate sprint.

jen-huang commented 1 year ago

@nimarezainia This will need design consideration to support all agent.monitoring settings. I found https://www.elastic.co/guide/en/fleet/current/elastic-agent-monitoring-configuration.html but that doesn't seem comprehensive as the original SDH reported needing to set fields like:

agent.monitoring:
  http:
    enabled: true 
    host: localhost 
    port: 6791

@pierrehilbert Are all the agent.monitoring fields documented somewhere?

pierrehilbert commented 1 year ago

I'm not aware of any other documentation for that. But as you mentioned, there are more fields, which we can see in elastic-agent.reference.yml. @nimarezainia do you know if we have something else?

nimarezainia commented 1 year ago

> I'm not aware of any other documentation for that. But as you mentioned, there are more fields, which we can see in elastic-agent.reference.yml. @nimarezainia do you know if we have something else?

Sorry I am not aware of any other docs. Where could I find the full list of configurable options in code? (obviously I see host and port). We probably need to redesign that section of the settings as Jen mentioned.

fmiqbal commented 1 year ago

This has become a blocker for installation.

Environment: Kubernetes VM using microk8s

I have Elastic Agent installed inside k8s using this guide: https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-managed-by-fleet.html#running-on-kubernetes-managed-by-fleet. It then binds to port 6791 on the host, as seen in netstat.

Now I want to add a Fleet-managed Elastic Agent on the node itself using the default guide shown when adding an agent (which refers to https://www.elastic.co/guide/en/fleet/8.7/install-fleet-managed-elastic-agent.html), but it fails because it can't bind to 6791, and I don't think editing elastic-agent.yml does anything.

May 13 18:51:19 unpad-k8s-node-0 systemd[1]: Stopped Elastic Agent is a unified agent to observe, monitor and protect your system..
May 13 18:51:19 unpad-k8s-node-0 systemd[1]: Started Elastic Agent is a unified agent to observe, monitor and protect your system..
May 13 18:51:20 unpad-k8s-node-0 elastic-agent[3289118]: Error: could not start the HTTP server for the API: listen tcp 127.0.0.1:6791: bind: address already in use
May 13 18:51:20 unpad-k8s-node-0 elastic-agent[3289118]: For help, please see our troubleshooting guide at https://www.elastic.co/guide/en/fleet/8.7/fleet-troubleshooting.html
May 13 18:51:20 unpad-k8s-node-0 systemd[1]: elastic-agent.service: Main process exited, code=exited, status=1/FAILURE
May 13 18:51:20 unpad-k8s-node-0 systemd[1]: elastic-agent.service: Failed with result 'exit-code'.
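
The `bind: address already in use` error in the log above means another process already holds 127.0.0.1:6791. As a minimal sketch (this is not part of Elastic Agent, and the helper name `port_is_free` is made up for illustration), the same condition can be checked ahead of an install by attempting the bind from Python:

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process already owns the port.
            return False
        return True


if __name__ == "__main__":
    # 6791 is the monitoring port from the log lines above.
    print("6791 free:", port_is_free(6791))
```

The check is only a snapshot: the port can be taken by another process between the check and the agent starting.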

jerrac commented 12 months ago

> When Elastic Agent is enrolled into Fleet, we can no longer configure the agent.monitoring setting because it is part of the elastic-agent.yml file, which is only taken into account when the Agent is enrolled. In the past, we could still configure it in the fleet.yml file, but that file is now encrypted, so this is no longer possible.

Wait, so the reason I've been getting no result from modifying elastic-agent.yml is that it is no longer allowed? Even though the file itself still has that comment on top "You can update this file to configure the settings that are not supported by Fleet."?

Specifically we're trying to set up backups via Veeam and it requires the 6791 port. So I've been trying to get Agent to stop listening on that port. Is my only choice to just stop using Agent?

pierrehilbert commented 12 months ago

elastic-agent.yml is used only by a standalone Agent. When you enroll an Agent into Fleet, the local configuration file is merged with what you get from the Fleet policy to create fleet.enc, which becomes the new configuration file.

If you want your local elastic-agent.yml file to be taken into account again, you have to run the enroll command again to regenerate fleet.enc.

This is currently the only way to change the monitoring port. Warning: don't forget to use the content of elastic-agent.yml.<DATE>.bak if you want your previous configuration in fleet.enc too.

zez3 commented 12 months ago

@pierrehilbert

> If you want your local elastic-agent.yml file to be taken into account again, you have to run the enroll command again to regenerate fleet.enc.

What did you mean by that?

Can I combine some local options with some coming from Fleet, or am I reading this wrong?

pierrehilbert commented 12 months ago

You can, but only during the enrollment phase. Once your Agent is enrolled, we won't parse the elastic-agent.yml file again.

zez3 commented 12 months ago

So if I try to configure the HTTP endpoint for metrics (https://www.elastic.co/guide/en/beats/filebeat/current/http-endpoint.html), this should work as well?

zez3 commented 12 months ago

The strange thing is that if I run:

/opt/Elastic/Agent/elastic-agent inspect

I can already see http.enabled: true at the root of the output:

agent:
  download:
    sourceURI: https://artifacts.elastic.co/downloads/
  features: null
  headers: null
  id: 99b69253-1d27-47b8-a5a6-c3024081e677
  logging:
    level: info
  monitoring:
    enabled: true
    http:
      buffer: null
      enabled: false
      host: ""
      port: 6791
    logs: true
    metrics: true
    namespace: ece
    use_output: default
  protection:
    enabled: false
    signing_key: mykey
    uninstall_token_hash: ""
fleet:
  access_api_key: mykey
  agent:
    id: ""
  enabled: true
  host: mydom.mytld:9243
  hosts:
  - https://mydom.mytld:9243
  protocol: http
  reporting:
    check_frequency_sec: 30
    threshold: 10000
  ssl:
    renegotiation: never
    verification_mode: ""
  timeout: 10m0s
host:
  id: d7d522adb50b4050aee658b2bbe4ebfd
http:
  enabled: true
id: ce165160-3b59-11ec-9e09-e3e9ddff6cd0
inputs:
- data_stream:
...

But that does not seem to work; I cannot see port 5066 open.

Do I need to re-enroll?

zez3 commented 12 months ago

Or would this filebeat HTTP endpoint for metrics need to be configured under the - data_stream: config?

lucabelluccini commented 12 months ago

The setting http.enabled: true at the root of the Elastic Agent configuration is likely ignored. It is not present in https://www.elastic.co/guide/en/fleet/current/elastic-agent-reference-yaml.html

I ran elastic-agent inspect on a Fleet-managed Elastic Agent and got no http... setting at the root.

In general, we recommend not editing the configuration of Elastic Agent. The workarounds provided in this comment have not been tested against every possible policy configuration.

How to change or disable the `6791/http` (monitoring) port

In case you want to disable the 6791/http (monitoring) port, you have 2 options:

[BEFORE INSTALL] set agent.monitoring.enabled: false in the elastic-agent.yml after extracting the tar.gz, then install
[AFTER INSTALL] set agent.monitoring.enabled: false in the elastic-agent.yml at /opt/Elastic/Agent and trigger an elastic-agent restart.

In both cases, also ensure you disable Collect agent metrics in the Elastic Agent Policy assigned to the Elastic Agent in Fleet UI. The local config overrides the policy setting anyway.

In case you want to change the 6791/http (monitoring) port, you have only 2 options:

[BEFORE INSTALL] set agent.monitoring.http.port: <preferred port number> in the elastic-agent.yml after extracting the tar.gz
[AFTER INSTALL] set agent.monitoring.http.port: <preferred port number> in the elastic-agent.yml at /opt/Elastic/Agent and re-enroll the Elastic Agent. WARNING: you will likely lose the state/registry of Elastic Agent, leading to possible data loss or duplicates. The Elastic Agent will temporarily appear twice in the Fleet UI in Kibana.

Once the port setting is effective for the Elastic Agent, subsequent upgrades should preserve it.

In this case, ensure you enable Collect agent metrics in the Elastic Agent Policy assigned to the Elastic Agent in Fleet UI.

Monitoring port listens on all interfaces

It is also a known issue that, by default, the monitoring port listens on all interfaces, not just localhost. This is being tracked via https://github.com/elastic/elastic-agent/issues/2509. As a workaround, you can set the following configuration, applying it with the same strategies described above for changing the port.

agent.monitoring:
  http:
    enabled: true 
    host: localhost 
    port: 6791
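
The dotted names used earlier in this comment (for example `agent.monitoring.http.port`) and the nested block above appear to describe the same settings, with each dot standing for one level of nesting. That reading is an assumption based on how the block above is written, not a statement about how Elastic Agent parses its config. A minimal sketch of the mapping, using a hypothetical helper `expand_dotted`:

```python
from typing import Any


def expand_dotted(flat: dict[str, Any]) -> dict[str, Any]:
    """Turn {"a.b.c": 1} into {"a": {"b": {"c": 1}}}."""
    nested: dict[str, Any] = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Walk down, creating intermediate levels as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


if __name__ == "__main__":
    settings = {
        "agent.monitoring.http.enabled": True,
        "agent.monitoring.http.host": "localhost",
        "agent.monitoring.http.port": 6791,
    }
    print(expand_dotted(settings))
```
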

How to change the `6789/grpc` (management) port

In case you want to change the 6789/grpc (management) port:

[BEFORE INSTALL] set agent.grpc.port: <preferred port number> in the elastic-agent.yml. Then install/enroll.
[AFTER INSTALL] set agent.grpc.port: <preferred port number> in the elastic-agent.yml at /opt/Elastic/Agent. Then trigger an elastic-agent restart.

You can confirm which ports are listening using netstat or lsof -i.
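
Where netstat or lsof is not at hand, a rough equivalent is to try a TCP connection to each port. The sketch below is only an illustration: the helper name `is_listening` is hypothetical, and the port numbers are the defaults named in this comment, so substitute whatever you configured.

```python
import socket


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Default ports named in this comment; replace with your own values.
    for name, port in (("management (gRPC)", 6789), ("monitoring (HTTP)", 6791)):
        print(f"{name} port {port} listening: {is_listening(port)}")
```

A successful connection only shows that something is listening on the port; unlike lsof -i, it does not tell you which process it is.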

jerrac commented 12 months ago

> [BEFORE INSTALL] set agent.monitoring.http.port: <preferred port number> in the elastic-agent.yml after extracting the tar.gz

Just to clarify, you mean the elastic-agent.yml in the extracted directory, not one created manually in /opt/Elastic/Agent before the installation occurs. Right?

Long term, edits to elastic-agent.yml for settings not covered by Kibana/Fleet really should apply. Even if set after installation. Is there an issue (I'll go look myself in a bit) covering progress on making that happen?

Is there any movement on making the topic of this issue happen? As in making Elastic Agent fully configurable through Kibana/Fleet UI?

The workaround given is not exactly a stellar user interface. I'd only need to do it on maybe 10 or so VMs. I can't imagine working somewhere larger and needing to make it happen on hundreds. If it was just adding some config to elastic-agent.yml and restarting, a small ad-hoc Ansible playbook would do. But since I have to un-enroll, edit, and re-enroll, I'd need to figure out how to securely pass the enrollment token around, and how to make sure the right token for the right policy goes to the right VM. Doable, I think, but enough extra work that I haven't done it yet.
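
The token-to-policy-to-VM bookkeeping described above could be sketched roughly as follows. Everything here is hypothetical (host names, policy names, environment variable names, the Fleet URL); only the `elastic-agent enroll` flags `--url` and `--enrollment-token` are real. Reading tokens from the environment is one simple way to keep them out of the script, not a complete answer to passing them around securely.

```python
import os
import shlex

# Hypothetical inventory: which policy each VM belongs to.
HOST_POLICY = {
    "vm-01.example.internal": "linux-servers",
    "vm-02.example.internal": "backup-hosts",
}


def enroll_command(host: str, fleet_url: str, tokens: dict[str, str]) -> list[str]:
    """Build the enroll argv for one host, using the token of its policy."""
    policy = HOST_POLICY[host]
    token = tokens[policy]  # A KeyError here means a policy has no token.
    return [
        "elastic-agent", "enroll",
        f"--url={fleet_url}",
        f"--enrollment-token={token}",
    ]


if __name__ == "__main__":
    # Tokens come from the environment so they are not written into the script.
    tokens = {
        "linux-servers": os.environ.get("TOKEN_LINUX_SERVERS", "<token>"),
        "backup-hosts": os.environ.get("TOKEN_BACKUP_HOSTS", "<token>"),
    }
    for host in HOST_POLICY:
        cmd = enroll_command(host, "https://fleet.example.internal:8220", tokens)
        # Printed only for illustration; in practice pass the argv to your
        # remote-execution tool instead of logging the token.
        print(host, "->", shlex.join(cmd))
```
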

lucabelluccini commented 11 months ago

I think this boils down to having a richer, structured Elastic Agent config deployable by Fleet Server through a policy. What I've shared at https://github.com/elastic/kibana/issues/153950#issuecomment-1713681941 is absolutely a workaround (hence the disclaimer). I think your suggestion is great and I think the Fleet/Elastic Agent team will value your feedback too.

pierrehilbert commented 11 months ago

Ping @nimarezainia to ensure that this is on your radar.

nimarezainia commented 11 months ago

> Long term, edits to elastic-agent.yml for settings not covered by Kibana/Fleet really should apply. Even if set after installation. Is there an issue (I'll go look myself in a bit) covering progress on making that happen?

@jerrac this would break the configuration model for Fleet-managed agents. Fleet (and its policies) is the configuration source of truth. If we allow changes to the configuration on the agent, it will quickly drift from the source of truth, and we will end up with agents in a policy that do not behave the same.

We will address this issue properly by adding the configurations to the policy.

nimarezainia commented 11 months ago

@amitkanfer @kpollich another use case for that advanced config conversation.

jerrac commented 11 months ago

> this would break the configuration model for Fleet-managed agents. Fleet (and its policies) is the configuration source of truth. If we allow changes to the configuration on the agent, it will quickly drift from the source of truth, and we will end up with agents in a policy that do not behave the same.

I get that, even mostly agree with it, but the fact this issue exists makes me think I'd rather risk the configuration drift than not be able to actually use the tool at all.

That said, it sounds like there is work going on to make sure Fleet can manage the settings properly. Hopefully that will get everything under one hood and we won't end up with some settings only managed via yaml, and others via the UI. So I'll just look forward to seeing the results of that. :)

jerrac commented 5 months ago

@nimarezainia Is there any progress on making the monitoring port configurable? As well as the rest of the settings that used to be configured in the yaml file?

nimarezainia commented 5 months ago

@jerrac No I am sorry I don't have an update at this point. There are other higher priorities on the roadmap but we will get to this in due course.

jerrac commented 1 week ago

Um, so, this issue is about 15 months old. At what point will this get addressed? Is the number of people affected by this really so low that it can linger for that long? That'd explain the delay. It would still leave me frustrated, but it'd explain it...

nimarezainia commented 1 week ago

@jerrac yes, there are higher priority issues that we are spending time on. Sounds like you were just looking to change the monitoring ports, correct? If that capability is available via a config in the agent policy:

kpollich commented 1 week ago

This is scheduled for delivery in 8.16.0.
