Open axw opened 1 month ago
I've been thinking about how to implement functional tests. I explored 3 possible approaches:
TL;DR: my conclusion is that we may want to proceed with option 3.
Option 1 heavily leverages ESS, creating infrastructure using Terraform on top of a custom-built Bash framework. We do not need to leverage ESS for functional testing, and the framework looks built for that purpose. Moreover, Bash's flexibility comes at a great cost in readability. Orchestration is also not clear to me, and documentation is lacking.
Option 2 uses Go tests, but is strictly focused on testing specific APM Server behaviors. As we now rely on the apm-data plugin, we may need to test interactions with different Elasticsearch versions and more complex interactions, which do not fit the current model (which starts a single stack against which all tests run).
A third option would be to build a framework that looks similar to Option 2, with the guarantees provided by Go, but with a scope more similar to Option 1. I created a simple stub in https://github.com/endorama/apm-server/blob/af3ca3744e4746d4c6d7f65a162927d6c9e19331/functionaltests/main_test.go#L31 using the testcontainers library and starting a specific Elasticsearch version in a container. The idea here would be to build in the freedom to leverage single stack components, allowing us to express complex upgrade and assertion scenarios.
I have some concerns on this approach:
systemtests, but those look like they lack the flexibility we would need to express complex cases
The third option looks the best to me when I think about future use cases (e.g. adding further cases to this logic https://github.com/elastic/apm-server/pull/13678 is more difficult in Bash than in Go), and I think a test framework must be ergonomic enough to encourage use.
I see potential for convergence in the long run, but it is out of scope and I am not sure how much weight it should have in the decision.
The upgrade part is going to be especially tricky IMO, because IIANM Elasticsearch will never allow an upgrade between a released version (e.g. 8.15.3) and an unreleased one (SNAPSHOT). If this is correct, I have no idea how we could run the upgrade test before a release is created.
Can we use BCs in a cloud first region? I'm not sure it is possible to upgrade to those, though.
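To make the constraint concrete, here is a minimal sketch of a guard a test harness could run before attempting an upgrade. The names `Version`, `ParseVersion`, `CheckUpgradePath`, and `ErrSnapshotTarget` are hypothetical, and the rule it encodes (released to SNAPSHOT is rejected) is only the unverified assumption stated above, not confirmed Elasticsearch behavior.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Version is a parsed stack version such as "8.15.3" or "8.16.0-SNAPSHOT".
type Version struct {
	Major, Minor, Patch int
	Snapshot            bool
}

// ParseVersion parses "MAJOR.MINOR.PATCH" with an optional "-SNAPSHOT" suffix.
func ParseVersion(s string) (Version, error) {
	core := strings.TrimSuffix(s, "-SNAPSHOT")
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("invalid version %q", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("invalid version %q", s)
		}
		nums[i] = n
	}
	return Version{Major: nums[0], Minor: nums[1], Patch: nums[2], Snapshot: core != s}, nil
}

// Less reports whether v is numerically older than o, ignoring the SNAPSHOT flag.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	if v.Minor != o.Minor {
		return v.Minor < o.Minor
	}
	return v.Patch < o.Patch
}

// ErrSnapshotTarget encodes the ASSUMPTION from the discussion: a released
// version cannot be upgraded to an unreleased SNAPSHOT build.
var ErrSnapshotTarget = errors.New("released -> SNAPSHOT upgrade assumed unsupported")

// CheckUpgradePath rejects downgrades and, per the assumption above,
// upgrades from a released version to a SNAPSHOT.
func CheckUpgradePath(from, to Version) error {
	if !from.Less(to) {
		return fmt.Errorf("target %+v is not newer than source %+v", to, from)
	}
	if !from.Snapshot && to.Snapshot {
		return ErrSnapshotTarget
	}
	return nil
}

func main() {
	pairs := [][2]string{{"8.14.0", "8.15.1"}, {"8.15.3", "8.16.0-SNAPSHOT"}}
	for _, p := range pairs {
		from, err := ParseVersion(p[0])
		if err != nil {
			fmt.Println(err)
			continue
		}
		to, err := ParseVersion(p[1])
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s -> %s: %v\n", p[0], p[1], CheckUpgradePath(from, to))
	}
}
```

If the assumption turns out to be wrong, only `CheckUpgradePath` would need to change; the test cases themselves would stay the same.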
I'll recap the discussions from today about how to move forward. I discussed this with @axw, @1pkg and @inge4pres.
My current stub uses testcontainers, but I was not aware that we had flakiness issues with it in systemtests, so it does not look like a great path forward.
Additionally, Andrew noticed that our customers mostly use ESS, not some Docker/Compose stack, and there are benefits in testing there.
Leveraging the current smoke tests does not look like the preferred path forward, as they are mostly Bash + CI, which greatly limits both expressiveness in tests and reproducibility.
The current proposal would be to implement a new testing framework built on these principles:
This approach would go towards the convergence mentioned in my previous comment, and not using testcontainers would help with converging earlier. The layers mentioned above should be "swappable", so that we can mix infrastructure/fixtures/assertions as needed based on testing scenarios. This could potentially also extend to running tests with a Docker stack or ECK, but that is out of scope at the moment.
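As a rough illustration of what "swappable" layers could look like in Go, the sketch below separates infrastructure, fixtures, and assertions behind small types. Every name here (`Infrastructure`, `Fixture`, `Assertion`, `Scenario`, `fakeStack`, `ExpectVersion`) is hypothetical and not taken from the stub; the in-memory `fakeStack` stands in for ESS, Docker, or ECK.

```go
package main

import "fmt"

// Infrastructure is the layer that owns a stack; an ESS-, Docker-, or
// ECK-backed implementation could be swapped in behind it.
type Infrastructure interface {
	Create(version string) error
	Upgrade(version string) error
	Version() string
	Destroy() error
}

// Fixture prepares state on the stack (for example, ingesting sample data).
type Fixture func(Infrastructure) error

// Assertion verifies an expectation against the stack.
type Assertion func(Infrastructure) error

// Scenario mixes one infrastructure with any fixtures and assertions.
type Scenario struct {
	Infra      Infrastructure
	From, To   string
	Fixtures   []Fixture
	Assertions []Assertion
}

// Run creates the stack, applies fixtures, upgrades, then checks assertions.
func (s Scenario) Run() error {
	if err := s.Infra.Create(s.From); err != nil {
		return fmt.Errorf("create: %w", err)
	}
	defer s.Infra.Destroy() // teardown error ignored in this sketch
	for _, f := range s.Fixtures {
		if err := f(s.Infra); err != nil {
			return fmt.Errorf("fixture: %w", err)
		}
	}
	if err := s.Infra.Upgrade(s.To); err != nil {
		return fmt.Errorf("upgrade: %w", err)
	}
	for _, a := range s.Assertions {
		if err := a(s.Infra); err != nil {
			return fmt.Errorf("assertion: %w", err)
		}
	}
	return nil
}

// fakeStack is an in-memory Infrastructure used only for illustration.
type fakeStack struct {
	version   string
	destroyed bool
}

func (f *fakeStack) Create(v string) error  { f.version = v; return nil }
func (f *fakeStack) Upgrade(v string) error { f.version = v; return nil }
func (f *fakeStack) Version() string        { return f.version }
func (f *fakeStack) Destroy() error         { f.destroyed = true; return nil }

// ExpectVersion asserts that the stack reports the wanted version.
func ExpectVersion(want string) Assertion {
	return func(i Infrastructure) error {
		if got := i.Version(); got != want {
			return fmt.Errorf("version is %q, want %q", got, want)
		}
		return nil
	}
}

func main() {
	s := Scenario{
		Infra:      &fakeStack{},
		From:       "8.14.0",
		To:         "8.15.1",
		Assertions: []Assertion{ExpectVersion("8.15.1")},
	}
	fmt.Println("scenario error:", s.Run())
}
```

Keeping fixtures and assertions as plain functions means a scenario can reuse them against any `Infrastructure` implementation without changes.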
We will also have to consider how to run tests in parallel: some tests may taint the Elasticsearch stack in a way that makes it unsafe to reuse for other tests, and some may not. It is not clear how to address this in our design at the moment, but for efficiency it would be interesting to be able to mark clusters as tainted when deciding whether later tests can reuse them.
Regarding which test cases to run, we have a set already mentioned in https://github.com/elastic/apm-server/issues/13898#issuecomment-2326277518 that we should include. Additionally, we should include testing the upgrade path from versions before 8.13.0 to 8.15 and 8.16.
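One possible shape for the tainting idea, sketched under the assumption that each test reports on release whether it left the cluster reusable. `Pool`, `Cluster`, `Acquire`, and `Release` are hypothetical names; cluster creation is faked with a counter, and a real implementation would provision and destroy actual deployments.

```go
package main

import (
	"fmt"
	"sync"
)

// Cluster identifies one stack that tests can run against.
type Cluster struct {
	ID string
}

// Pool hands out clusters to parallel tests and only recycles clean ones.
type Pool struct {
	mu      sync.Mutex
	idle    []*Cluster
	created int
}

// Acquire returns an idle, untainted cluster if one exists,
// otherwise it "creates" a new one (faked here with a counter).
func (p *Pool) Acquire() *Cluster {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.idle); n > 0 {
		c := p.idle[n-1]
		p.idle = p.idle[:n-1]
		return c
	}
	p.created++
	return &Cluster{ID: fmt.Sprintf("cluster-%d", p.created)}
}

// Release returns a cluster to the pool. A tainted cluster is dropped
// instead of being reused; a real implementation would destroy it here.
func (p *Pool) Release(c *Cluster, tainted bool) {
	if tainted {
		return
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	p.idle = append(p.idle, c)
}

func main() {
	pool := &Pool{}

	a := pool.Acquire()
	pool.Release(a, true) // this test changed cluster state: do not reuse

	b := pool.Acquire()
	pool.Release(b, false) // read-only test: safe to reuse

	c := pool.Acquire()
	fmt.Println("reused clean cluster:", c == b)
	fmt.Println("reused tainted cluster:", c == a)
}
```

The mutex makes `Acquire` and `Release` safe to call from tests running with `t.Parallel()`.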
As per our latest discussion, I created a stub of a first test on the new framework we discussed. You can see it here: https://github.com/endorama/apm-server/blob/3b4ec398e8715b9b61ede38cb84aa5928d241492/testing/functional/test1/main_test.go
The first test I'm aiming for is testing the upgrade path from 8.14.0 to 8.15.1 as defined by:
Upgrade 8.14.x to 8.15.1+ with defaults: ILM should continue to be used for old indices, DLM should be used for new indices
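The assertion for this case could be sketched as below. It assumes the lifecycle of each backing index can be read from a `managed_by` field in a get-data-stream style response, with the values `Index Lifecycle Management` and `Data stream lifecycle`; those field names and values, the `CheckLifecycle` helper, and the hand-written `samplePayload` (including its index names) are assumptions for illustration, not captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assumed values of the "managed_by" field (not confirmed by the source).
const (
	managedByILM = "Index Lifecycle Management"
	managedByDSL = "Data stream lifecycle"
)

// dataStreamsResponse models only the fields this check needs.
type dataStreamsResponse struct {
	DataStreams []struct {
		Name    string `json:"name"`
		Indices []struct {
			IndexName string `json:"index_name"`
			ManagedBy string `json:"managed_by"`
		} `json:"indices"`
	} `json:"data_streams"`
}

// CheckLifecycle verifies that indices created before the upgrade are still
// managed by ILM and that every other backing index is managed by DLM.
func CheckLifecycle(payload []byte, preUpgrade map[string]bool) error {
	var resp dataStreamsResponse
	if err := json.Unmarshal(payload, &resp); err != nil {
		return fmt.Errorf("decode response: %w", err)
	}
	for _, ds := range resp.DataStreams {
		for _, idx := range ds.Indices {
			want := managedByDSL
			if preUpgrade[idx.IndexName] {
				want = managedByILM
			}
			if idx.ManagedBy != want {
				return fmt.Errorf("%s: managed by %q, want %q",
					idx.IndexName, idx.ManagedBy, want)
			}
		}
	}
	return nil
}

// samplePayload is a hand-written example, not real cluster output.
const samplePayload = `{
  "data_streams": [{
    "name": "traces-apm-default",
    "indices": [
      {"index_name": "old-index-000001", "managed_by": "Index Lifecycle Management"},
      {"index_name": "new-index-000002", "managed_by": "Data stream lifecycle"}
    ]
  }]
}`

func main() {
	preUpgrade := map[string]bool{"old-index-000001": true}
	fmt.Println("lifecycle check:", CheckLifecycle([]byte(samplePayload), preUpgrade))
}
```

In a real test the set of pre-upgrade indices would be recorded before triggering the upgrade and the payload fetched from the cluster afterwards.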
I also added a README to clarify the overall idea.
We should implement functional tests that verify ILM or Data Stream Lifecycle Management is used as expected according to user configuration, and depending on whether a cluster is created new or upgraded. See https://github.com/elastic/apm-server/issues/13898#issuecomment-2326277518