Open joshdover opened 2 years ago
Related to this is a request to surface details in the UI about unenrolled agents that are attempting to check in: https://github.com/elastic/kibana/issues/132702
Return a more informative error message back to Elastic Agent that will stop the agent from executing / continuing to check in with Fleet Server
I would prefer not to change the behavior of the Elastic Agent loops; I assume this condition could be temporary, depending on the setup.
Cache invalidated API keys in memory to avoid the need to check with Elasticsearch
This would not work in a deployment with multiple Fleet Server instances.
Throttle logging for repeated invalid API key check-ins.
I will make a PR to reduce the number of logs generated for this. I still think we need to see something in the log for now, but we don't need to see it for every Agent that tries to connect.
Oh, I see: this warning is actually coming from the ES layer. In that case we could indeed have a cache that keeps track of failed API keys for a period of time, maybe 1 hour?
When agents are "force" unenrolled, they can continue to attempt to check in with Fleet Server using invalid API keys. This can produce a lot of noise in the Fleet Server and Elasticsearch logs on each check-in attempt by these agents. This is a common scenario when running Agent inside VMs and containers, where instances may be reverted to a snapshot or spun back up after being force unenrolled. These instances create constant error logs and load in Elasticsearch when attempting to validate these invalidated keys:
Steps to reproduce this scenario:
Potential solutions to this problem (not mutually exclusive or exhaustive):