Due to time constraints and the novelty of the problem, the initial scale testing effort was done using whichever tools were quickest to use at the time. In the long term, if we want to repeat the tests systematically from time to time, we should use tools and techniques that the entire team feels comfortable with. Some suggestions are:
Consider whether to continue using Terraform
Use Elastic products where appropriate to create a feedback loop
@elastic/cloud-k8s How valid is this old issue? Should we close this in favor of a more valid issue, or simply close as we feel good with our current approach?
Consider whether to continue using Terraform - We do not use Terraform, IIUC. There are no .tf files in this repo.
Use Elastic products where appropriate to create a feedback loop - This has some validity when it comes to load testing, as we could use Rally (esrally). I suspect we chose Vegeta because it is written in Go and can be used as a library, while Rally is written in Python.
Run monitoring tools in a separate cluster - There could be some validity here.
Long term storage of metrics and logs - We currently store metrics and logs in a separate E2E cluster.