Closed cmacknz closed 1 year ago
This was introduced in v8.6.2 https://github.com/elastic/elastic-agent/commit/ac658893f9f3e0185a89367e809a96d884eb8317
We will need a new manual QA regression test case for this that configures the agent to use a Fleet proxy server, installs the Elastic Defend integration, and ensures that the agent remains healthy.
This can be added to the proxy test suite that was created in https://github.com/elastic/kibana/issues/140533#issuecomment-1447983586
Hi @cmacknz, thank you for the update. We have added the required test case under the Fleet test suite.
Details are shared under https://github.com/elastic/kibana/issues/140533#issuecomment-1450007059
Please let us know if anything else is required from our end.
Thanks!
@amolnater-qasource can you test the following scenario:
Set up a proxy for an 8.7.0 Fleet cluster. Install the 8.6.2 agent and enroll it with the --proxy-url command line parameter.
Expected Result: The enrolment succeeds and the agent appears healthy.
[elastic_agent.endpoint_security][error] Http.cpp:327 CURL error 28: Error [Failed to connect to <domain>.fleet.....elastic-cloud.com port 443 after 21272 ms: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected]
Hi @cmacknz
We have revalidated this on the 8.7.0 BC6 Kibana cloud environment and have the observations below:
Expected Result: The enrolment succeeds and the agent appears healthy.
Expected Result: The Elastic Defend integration will not be able to reach Fleet server and will be unhealthy. The Endpoint security logs in Fleet should contain a message like:
Expected Result: The upgrade should succeed and Elastic Defend should remain unhealthy afterwards.
Configure the proxy URL used in step 1 in the Fleet UI. Expected Result: The Elastic Defend integration should now be healthy.
Build details:
8.7.0-BC6
BUILD: 61051
COMMIT: 04ef24287f26854ad99a46ae983854c6184717cb
Artifact Link: https://staging.elastic.co/8.7.0-a7fb3750/downloads/beats/elastic-agent/elastic-agent-8.7.0-linux-x86_64.tar.gz
Host OS: Linux .tar
Please let us know if we are missing anything here. Thanks
Related https://github.com/elastic/elastic-agent/issues/2390. When the agent is configured to use a proxy you can't tell from the diagnostics in an obvious way.
I tried reproducing this again, and I can confirm everything looks healthy with the proxy in use. However, I believe this is because, with my basic proxy setup, the internet is still reachable without going through the proxy. This means that even if the proxy isn't used, things will still work.
https://github.com/elastic/elastic-agent/issues/2390 means it is hard to confirm whether a proxy is being used at all. I can confirm from some of the agent log messages that the agent is using the proxy, but as far as I can tell Endpoint isn't.
Configuring a proxy in the UI through the new proxy settings doesn't seem to change this.
In my test setup I killed my local proxy instance and the agent reported itself as unable to connect to Fleet, but Endpoint kept working. This is even after configuring a proxy in Fleet in addition to using the --proxy-url option at installation time. Unfortunately, I don't think that the new proxy settings in 8.7.0 will be a workaround for this problem.
Folks, I've been trying to reproduce it but I cannot.
It seems to me the problem occurs when --proxy-url is passed to the agent install/enrol but the proxy is not defined in the policy. In this case the agent installs and enrols, but then keeps trying to connect to Fleet forever.
These are my tests and findings:
Setup environment:
vagrant up elastic-agent
vagrant ssh elastic-agent
Set up a proxy for an 8.7.0 Fleet cluster. Install the 8.6.2 agent and enroll it with the --proxy-url command line parameter.
iptables -A INPUT -s 35.224.224.0/24 -j DROP
check your cloud IP
curl https://[YOUR_CLOUD_FLEET_URL]/api/status
should hang
State: STARTING
Message: Waiting for initial configuration and composable variables
Fleet State: HEALTHY
Fleet Message: (no message)
Components: (none)
cat /opt/Elastic/Agent/data/elastic-agent-913c02/logs/elastic-agent-[TIMESTAMP].ndjson | grep error
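To narrow the grep above to the timeout failures discussed in this thread, a small helper along these lines might be used. It is only a sketch: the function name is made up here, and the match string is taken from the "CURL error 28" log line quoted earlier in the thread, so it may differ between versions.

```shell
# Hypothetical helper: print only the lines that mention "CURL error 28"
# (curl's timeout error). Reads the given log files, or stdin when no
# file is given. The match string is an assumption based on the log
# message quoted in this thread.
endpoint_timeout_errors() {
    # -h drops file name prefixes; -F treats the pattern as a fixed string.
    grep -hF 'CURL error 28' "$@"
}
```

It can be pointed at the same log files as the cat command above, for example endpoint_timeout_errors /opt/Elastic/Agent/data/elastic-agent-913c02/logs/elastic-agent-[TIMESTAMP].ndjson.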
2.1 Set up proxy and endpoint security after installation
http://192.168.56.1:3128 as a proxy
/opt/Elastic/Agent/elastic-agent inspect | grep -C 3 proxy
fleet:
hosts:
api_key: [REDACTED] hosts:
/opt/Elastic/Agent/elastic-agent status
iptables -A INPUT -s 35.224.224.0/24 -j DROP
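Because a test proxy is only meaningful when the direct route is really blocked, it may help to check both paths explicitly. The following is a minimal sketch, assuming curl is installed. The function name and the 10 second timeout are illustrative choices, not part of the original test.

```shell
# Hypothetical helper: succeed if the Fleet status endpoint answers within
# 10 seconds. Any HTTP response counts as reachable; a timeout does not.
#   $1 = Fleet URL, for example https://[YOUR_CLOUD_FLEET_URL]
#   $2 = proxy URL (optional), for example http://192.168.56.1:3128
fleet_reachable() {
    if [ -n "${2:-}" ]; then
        # Go through the given proxy.
        curl --silent --show-error --output /dev/null --max-time 10 \
            --proxy "$2" "$1/api/status"
    else
        # Force a direct connection, ignoring any proxy environment variables.
        curl --silent --show-error --output /dev/null --max-time 10 \
            --noproxy '*' "$1/api/status"
    fi
}
```

With the iptables rule in place, the direct call (one argument) would be expected to fail with curl's timeout exit code 28, while the call with the proxy URL as second argument should succeed if the proxy is working.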
It seems to me the problem is when the --proxy-url is passed to the agent install/enrol but the proxy is not defined in the policy
We spoke about this today. The problem is that the proxy_url received in the agent policy from Fleet always takes precedence over the one that was configured when the agent was installed with the --proxy-url option, even when the proxy_url received from Fleet is empty.
The ability to configure the proxy from Fleet is a new feature in 8.7.0, so nobody has configured a proxy in Fleet from the start. This means upgrading to 8.7.0 will usually result in the proxy setup at installation time being unconditionally ignored.
The path to fix this for 8.7.1 is to use the following precedence rules for the proxy URL:
This will ensure that agents configured with a proxy at installation time continue to work while still allowing for the proxy to be changed from Fleet. This has the caveat that the proxy setup at installation time cannot be removed from Fleet at this time.
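Read together with the description above, the intended behaviour appears to be: a non-empty proxy URL from Fleet wins, otherwise the installation-time URL is kept. The sketch below only illustrates that reading. The function name is hypothetical and this is not the agent's actual implementation.

```shell
# Hypothetical sketch of the proxy precedence as inferred from this thread:
# a non-empty proxy_url from the Fleet policy wins, otherwise the URL given
# via --proxy-url at installation time is kept.
#   $1 = proxy_url received from Fleet (may be empty)
#   $2 = proxy URL configured at installation time (may be empty)
effective_proxy_url() {
    # ${1:-...} falls back to the second value when $1 is unset or empty.
    printf '%s\n' "${1:-${2:-}}"
}
```

Calling effective_proxy_url "" "http://192.168.56.1:3128" prints the installation-time URL, which also shows the caveat: an empty value from Fleet cannot remove a proxy that was set at installation time.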
We will create a follow up issue to allow the agent to distinguish between the proxy URL received from Fleet being empty because it was never configured, and the proxy URL received from Fleet being empty because the user intended to remove all proxy configurations.
Filed https://github.com/elastic/kibana/issues/154482 to document that proxies set up when the agent was installed cannot be managed by Fleet.
This appears to be caused by https://github.com/elastic/elastic-agent/pull/2172; reverting it fixes the problem.
To reproduce this, install the agent with the --proxy-url option and add the Elastic Defend integration to the agent policy. Observe that endpoint security fails to connect to Fleet. Running elastic-agent diagnostics, we should expect to see the proxy_url key in the fleet section of the endpoint unit configuration, but it is missing.
The expected output can be obtained by reverting https://github.com/elastic/elastic-agent/pull/2172 and installing the agent. With https://github.com/elastic/elastic-agent/pull/2172 the proxy_url key is missing; it seems to be getting overwritten with the empty proxy_url received from Fleet.
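The check described above can be automated roughly as follows. This is a sketch under assumptions: it expects the unit configuration as space-indented YAML on stdin, with proxy_url nested somewhere under a fleet: key. The function name is made up, and the exact file inside the diagnostics archive is not specified here.

```shell
# Hypothetical helper: exit 0 if a proxy_url key appears inside a "fleet:"
# block of the YAML read from stdin, exit 1 otherwise.
# Assumes space-indented YAML; the layout of the real diagnostics files
# is an assumption.
has_fleet_proxy_url() {
    awk '
        function indent(s) { match(s, /^ */); return RLENGTH }
        /^ *$/ { next }                                   # skip blank lines
        # Leave the fleet block once indentation drops back to its level.
        in_fleet && indent($0) <= fleet_indent { in_fleet = 0 }
        in_fleet && /^ *proxy_url:/ { found = 1 }
        /^ *fleet: *$/ { in_fleet = 1; fleet_indent = indent($0) }
        END { exit !found }
    '
}
```

Running has_fleet_proxy_url on the relevant YAML file from the diagnostics output would then fail on an affected agent and succeed once the key is present again.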