[Testing] Add high-fidelity integration tests around mock agents + live Kibana API

Elastic Agent includes a powerful suite of integration tests that spin up Fleet Server and Kibana from snapshot builds to test functionality in a true-to-life environment. We should take cues from this setup and add a similar testing capability to Fleet Server.

References

One instance where tests like this could've helped us is https://github.com/elastic/fleet-server/issues/3263.

These tests would ideally let us intentionally place "mock" agents (similar to https://github.com/elastic/horde) into broken or erroneous states, then run those agents through various APIs and lifecycle stages to verify that they recover and end up in a manageable state. Tests like these would help us improve the fault tolerance and self-healing capabilities of Fleet and Agent.

Alternatives

Could we simply expand the existing agent test suite with these capabilities rather than creating a new test infrastructure in Fleet Server?
Could we use Horde directly, or add additional test coverage in our existing scale/performance test suites to achieve this kind of coverage?

elastic / fleet-server

[Testing] Add high-fidelity integration tests around mock agents + live Kibana API #3279

References

Alternatives