elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Inconsistent Fleet Behavior when Removing Configurations from kibana.yml for Managed Agent Policies #193407

Open eyalkraft opened 14 hours ago

eyalkraft commented 14 hours ago

Kibana version: latest (8.15)

Elasticsearch version: latest (8.15)

Describe the bug: Fleet behaves inconsistently when configurations are removed from kibana.yml, specifically for managed Agent policies. We first configured a Fleet Host, a Fleet output, and a managed Agent policy (using that host and output) through kibana.yml. After starting Kibana, we removed these configurations from kibana.yml. Following the removal, the Fleet Host and Fleet output are deleted, but the Agent policy remains. The Fleet UI then fails, because the Agent policy references a Fleet Host and output that no longer exist.

Steps to Reproduce:

  1. Configure Kibana using kibana.yml with a managed Agent policy, Fleet Host, and Fleet output.
  2. Remove the Fleet Host, Output and agent policy from kibana.yml.
  3. Stop and start Kibana so that it re-reads kibana.yml.
  4. Observe that the Fleet UI fails due to the missing references in the managed Agent policy.
kibana.yml example

```diff
# Kibana configuration file

# Enabling agentless mode
xpack.cloud.serverless.project_id: 'some_fake_project_id'
xpack.securitySolutionServerless.productTypes:
  - product_line: security
    product_tier: complete
xpack.ml.nlp.enabled: true # Enable NLP when security and complete are selected

# Fleet and agentless settings
xpack.fleet.enableExperimental:
  - agentless
xpack.fleet.packages:
  - name: "cloud_security_posture"
    version: "latest"
server.versioned.versionResolution: oldest

-# Remove everything below
xpack.fleet.agentPolicies:
  - name: "Agentless"
    id: "agentless-policy"
    is_managed: true
    namespace: "default"
    fleet_server_host_id: "agentless-fleet-internal-host"
    data_output_id: "agentless-es-internal-output"
    monitoring_output_id: "agentless-es-internal-output"
    monitoring_enabled: ["logs", "metrics"]
    supports_agentless: true
    package_policies: []
xpack.fleet.fleetServerHosts:
  - id: "agentless-fleet-internal-host"
    name: "Agentless internal fleet server"
    is_default: false
    is_internal: true
    host_urls: ["https://internal-fleet-server-url/"]
xpack.fleet.outputs:
  - id: "agentless-es-internal-output"
    name: "Internal agentless output"
    type: "elasticsearch"
    is_default: false
    is_default_monitoring: false
    is_internal: true
    hosts: ["https://internal-es-url/"]
```

Expected Behavior:

  1. Deleting a non-default Fleet Host and output from kibana.yml, which deletes them in Kibana, should also remove the references to that host and output from managed agent policies, just as it does for non-managed agent policies. The references should be replaced with the default Fleet Host and output.
  2. The Fleet UI should not crash if an agent policy references a non-existing Fleet Host or output. Instead, this specific Agent policy should display an error, while the remaining policies should be accessible and unaffected.
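
The first expectation could be sketched roughly as below. This is only an illustration of the requested behavior, not Fleet's actual implementation: the `AgentPolicyRefs` and `DefaultIds` types, the `replaceDanglingRefs` function, and the `default-host` / `default-output` IDs are all hypothetical. Only the three reference field names are taken from the policy shown in this issue.

```typescript
// Hypothetical shapes for illustration only; these are not Fleet's real types.
interface AgentPolicyRefs {
  id: string;
  fleet_server_host_id: string | null;
  data_output_id: string | null;
  monitoring_output_id: string | null;
}

interface DefaultIds {
  fleetServerHostId: string;
  outputId: string;
}

// Returns a copy of the policy in which every reference to a host or output
// that no longer exists is replaced with the corresponding default ID.
// References that are null or still valid are left unchanged.
function replaceDanglingRefs(
  policy: AgentPolicyRefs,
  existingHostIds: readonly string[],
  existingOutputIds: readonly string[],
  defaults: DefaultIds
): AgentPolicyRefs {
  const fix = (
    id: string | null,
    existing: readonly string[],
    fallback: string
  ): string | null =>
    id !== null && existing.indexOf(id) === -1 ? fallback : id;

  return {
    id: policy.id,
    fleet_server_host_id: fix(
      policy.fleet_server_host_id,
      existingHostIds,
      defaults.fleetServerHostId
    ),
    data_output_id: fix(policy.data_output_id, existingOutputIds, defaults.outputId),
    monitoring_output_id: fix(
      policy.monitoring_output_id,
      existingOutputIds,
      defaults.outputId
    ),
  };
}

// Usage: the managed policy from this issue, after its host and output were deleted.
const repairedPolicy = replaceDanglingRefs(
  {
    id: "agentless-policy",
    fleet_server_host_id: "agentless-fleet-internal-host",
    data_output_id: "agentless-es-internal-output",
    monitoring_output_id: "agentless-es-internal-output",
  },
  ["default-host"], // hypothetical ID of the remaining (default) host
  ["default-output"], // hypothetical ID of the remaining (default) output
  { fleetServerHostId: "default-host", outputId: "default-output" }
);
```

The point of the sketch is that the same cleanup would apply regardless of `is_managed`, which is the consistency this issue asks for.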

Screenshots (if relevant): [three screenshots, not reproduced here]

Provide logs and/or server output (if relevant):


```
[2024-09-19T13:34:13.956+03:00][INFO ][status.plugins.fleet] fleet plugin is now available: Fleet setup failed
```

```
GET kbn:/api/fleet/agent_policies

{
  "item": {
    "id": "agentless-policy",
    "version": "WzM4NSwxXQ==",
    "space_ids": [],
    "monitoring_enabled": [
      "logs",
      "metrics"
    ],
    "inactivity_timeout": 1209600,
    "is_preconfigured": true,
    "data_output_id": "agentless-es-internal-output",
    "monitoring_output_id": "agentless-es-internal-output",
    "fleet_server_host_id": "agentless-fleet-internal-host",
    "schema_version": "1.1.1",
    "package_policies": [],
    "agents": 0,
    "namespace": "default",
    "name": "Agentless",
    "supports_agentless": true,
    "status": "active",
    "is_managed": true,
    "revision": 2,
    "updated_at": "2024-09-19T10:04:32.766Z",
    "updated_by": "system",
    "is_protected": false,
    "unprivileged_agents": 0
  }
}
```
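
The response above still carries `data_output_id`, `monitoring_output_id`, and `fleet_server_host_id` values for entities that were deleted. A per-policy check along the following lines could let the UI flag only the affected policy, as the second expectation suggests. This is a rough sketch under assumptions: `PolicyLike`, `RefField`, and `findDanglingRefs` are hypothetical names and do not correspond to any existing Fleet API.

```typescript
// Hypothetical helper for illustration; not part of Fleet.
type RefField = "fleet_server_host_id" | "data_output_id" | "monitoring_output_id";

interface PolicyLike {
  id: string;
  fleet_server_host_id: string | null;
  data_output_id: string | null;
  monitoring_output_id: string | null;
}

// Lists the reference fields of a policy that point at IDs which are not
// among the known Fleet hosts or outputs. An empty result means the policy
// has no dangling references.
function findDanglingRefs(
  policy: PolicyLike,
  knownHostIds: readonly string[],
  knownOutputIds: readonly string[]
): RefField[] {
  const dangling: RefField[] = [];
  const check = (field: RefField, known: readonly string[]): void => {
    const id = policy[field];
    if (id !== null && known.indexOf(id) === -1) {
      dangling.push(field);
    }
  };
  check("fleet_server_host_id", knownHostIds);
  check("data_output_id", knownOutputIds);
  check("monitoring_output_id", knownOutputIds);
  return dangling;
}

// Usage with the policy returned above, assuming no matching host or output remains.
const danglingFields = findDanglingRefs(
  {
    id: "agentless-policy",
    fleet_server_host_id: "agentless-fleet-internal-host",
    data_output_id: "agentless-es-internal-output",
    monitoring_output_id: "agentless-es-internal-output",
  },
  [], // known host IDs after the deletion (assumed empty here)
  [] // known output IDs after the deletion (assumed empty here)
);
```

A caller could render an error on the policy whenever the returned list is non-empty, while leaving the other policies accessible.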

Any additional context:

This bug was discovered as part of an expected migration of agentless on serverless from using preconfigured (kibana.yml) agent policy to using the agentless API.

cc @kpollich @nchaulet

elasticmachine commented 14 hours ago

Pinging @elastic/fleet (Team:Fleet)