@olegsu Thanks for this detailed proposal!! I really like this proposal and the overall direction.
I think we already have a few tests using a mocked k8s client, so I'm not completely sure how the tests you're proposing differ from what we already have, in terms of what they test.
Regarding all the testability improvements you proposed, they sound like a great step towards a higher-quality codebase.
Thank you @eyalkraft. You are right, we are using Kubernetes mocks and also have AWS mocks, but those are all small unit tests. The suggestion is to be able to test a complete flow, just with mocked data. For example, the test structure I suggested would start Cloudbeat and run a complete flow, including all the logic we have with OPA and other internal components that might change the data along the way. At the end, we run assertions against what is expected to be reported to Elasticsearch.
Component Test
Motivation
Tests are important: they ensure that the different components of a system function correctly when working together. Without component tests, it is difficult to identify issues that arise from the interaction of different components, which can lead to problems in the overall functionality of the system.
There are two main types of tests that we use: unit tests and end-to-end tests. Unit tests cover small, specific sub-flows of Cloudbeat. They are quick and easy to run locally, and they are written as part of the development process. However, they have some limitations. They often rely on mocked data (such as simulated disk I/O, network, or SDK calls), which reduces the level of confidence we have in the test results. They also only exercise small sub-flows of code, so they may not catch issues that arise when different components are combined.
On the other hand, our “end-to-end” tests (excluding Kibana) test Cloudbeat by interacting with live APIs and reporting to live Elasticsearch instances. These tests are written in Python and collect data from ES to verify the results. End-to-end tests are more reliable because they exercise the system in a realistic environment. However, they also have drawbacks: being written in Python makes it harder for people to contribute to them; they are slower to run, executed as a black box, and more prone to flakiness, which makes debugging failed tests challenging; and running them locally is difficult.
Proposal
I would suggest adding another level of testing (Cloudbeat component tests) that focuses on specific integrations (AWS, Kubernetes) of our code. These tests would use mocked clients fed with real data, which would give us much better confidence in how our code will behave against real-world inputs.
One of the main benefits of this approach is the quick feedback loop. By writing these tests as part of our codebase and running them with the `go test` command, we can easily run them locally and get immediate results, which makes it easier to identify any issues or bugs that arise. Another benefit is easier debugging: since these tests are part of the same stack, it will be easier to track down problems and fix them. And while flakiness is still possible, it should be easier to identify and address locally.
These tests should be designed to mimic real-world scenarios as closely as possible, including factors such as memory consumption and CPU usage. This will give us a more accurate understanding of how our code will perform in a production environment.
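To make “mocked clients with real data” concrete, one lightweight approach is to record real API responses once and replay them from fixtures in tests. Below is a minimal sketch of a fixture-loading helper; the helper name and `testdata` layout are illustrative, not existing Cloudbeat code:

```go
package testsupport

import (
	"encoding/json"
	"os"
	"testing"
)

// LoadFixture unmarshals an API response that was recorded once from a
// live environment, so mocked clients can serve real-shaped data
// without making any network calls. Hypothetical helper for illustration.
func LoadFixture[T any](t *testing.T, path string) T {
	t.Helper()
	raw, err := os.ReadFile(path)
	if err != nil {
		t.Fatalf("reading fixture %s: %v", path, err)
	}
	var v T
	if err := json.Unmarshal(raw, &v); err != nil {
		t.Fatalf("unmarshaling fixture %s: %v", path, err)
	}
	return v
}
```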
Gaps
- `libbeat` dependency
- Globals, `init` functions, and external dependencies

Libbeat
From what I know, `libbeat` does not provide any utilities related to testing. What we are looking for is a way to easily run the beat with a specific configuration, mock the connection to the fleet server, and control the lifecycle of the beat (meaning, start it, stop it, and reconfigure it).
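To illustrate, this is the rough shape of the harness we would want. None of these types exist in `libbeat` today; the interface and names below are hypothetical:

```go
package testsupport

import (
	"context"

	"github.com/elastic/beats/v7/libbeat/beat"
)

// BeatHarness is a hypothetical testing surface we would want libbeat
// to expose: run a beat with a given configuration, with the
// fleet-server connection mocked, and control its lifecycle.
type BeatHarness interface {
	// Start launches the beat with the given configuration.
	Start(ctx context.Context, cfg map[string]any) error
	// Reconfigure pushes a new configuration to the running beat,
	// the way the fleet server would.
	Reconfigure(cfg map[string]any) error
	// Stop shuts the beat down gracefully.
	Stop() error
	// Events exposes what the beat would have published to Elasticsearch.
	Events() <-chan beat.Event
}
```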
I would suggest having this addressed in the future as it requires more work from multiple teams.
Meanwhile, we need to find workarounds.
Globals, `init` functions, and dependencies

"Clear is better than clever." (source: Practical Go)
We don't have a lot of globals; one that makes it hard to run tests is `Factories` (resources/fetchersManager/factory.go). (source: A theory of modern Go)
Our factories use the `init` function to register themselves with `Factories`, which is a list of all the available fetcher factories. This makes it hard to write tests, because we need to figure out a way to mock `Factories` so that we can test each fetcher individually. It would be easier to write tests if we could isolate the fetchers from `Factories` during testing. (source: "Learning Go: An Idiomatic Approach to Real-World Go Programming")
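For context, the pattern looks roughly like this. This is a simplified sketch, not the actual Cloudbeat code; the types are stand-ins:

```go
package fetchersManager

// Fetcher and FetcherFactory are stand-ins for the real cloudbeat types.
type Fetcher interface {
	Fetch() error
}

type FetcherFactory interface {
	Create() (Fetcher, error)
}

// Factories is a package-level global: every fetcher package mutates it
// at import time, so any test that imports a fetcher also gets the full
// registry with real clients attached.
var Factories = map[string]FetcherFactory{}

// Each fetcher package registers itself as an import side effect,
// roughly like this:
func init() {
	Factories["aws-elb"] = elbFactory{}
}

type elbFactory struct{}

func (elbFactory) Create() (Fetcher, error) {
	// In the real code this constructs the fetcher with a live AWS
	// client, which is what makes mocking hard.
	return nil, nil
}
```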
Besides that, the `init` functions construct the factories with real API clients, such as AWS and Kubernetes. Since those clients are initialized deep in the code, it is hard to mock them for complete-flow testing.

To create a `fetcher` object, or any other object in Go, for testing purposes we have a few options: declare a new constructor for each configuration option, use a custom Config struct, or use the Functional Options Pattern. Out of these, I recommend the Functional Options Pattern because it gives us more control over the code, makes the code more explicit and easier to review, and makes it easier to inject dependencies for testing. Additionally, it includes built-in default behavior. You can find more information about these options here.
For example, this is how an `ElbFetcher` built with the Functional Options Pattern might look.
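A minimal sketch, assuming an illustrative `ElbProvider` interface and option names; the actual Cloudbeat fields and types may differ:

```go
package fetchers

import (
	"context"
	"log"
)

// LoadBalancer is a stand-in for the real AWS ELB type.
type LoadBalancer struct {
	Name string
}

// ElbProvider abstracts the AWS calls the fetcher needs, so a test can
// inject a mock that serves recorded data.
type ElbProvider interface {
	DescribeLoadBalancers(ctx context.Context) ([]LoadBalancer, error)
}

type ElbFetcher struct {
	log      *log.Logger
	provider ElbProvider
}

// Option mutates the fetcher during construction.
type Option func(*ElbFetcher)

func WithLogger(l *log.Logger) Option {
	return func(f *ElbFetcher) { f.log = l }
}

func WithProvider(p ElbProvider) Option {
	return func(f *ElbFetcher) { f.provider = p }
}

// NewElbFetcher applies built-in defaults first, then the caller's
// options, which is what lets tests swap in mocked dependencies.
func NewElbFetcher(opts ...Option) *ElbFetcher {
	f := &ElbFetcher{
		log: log.Default(), // built-in default behavior
	}
	for _, o := range opts {
		o(f)
	}
	return f
}
```

A production caller would pass the real AWS client, while a test would pass `WithProvider(&mockProvider{...})`, with no global state involved.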
Final results
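For illustration only: everything below (`StartHarness`, `WithMockedAWS`, the event shape) is hypothetical. It only shows the intended structure of such a test: start Cloudbeat with mocked clients and recorded real data, let the complete flow (fetchers, OPA evaluation, internal enrichment) run, and assert on what would have been reported to Elasticsearch.

```go
func TestCloudbeatElbFlow(t *testing.T) {
	// Start the beat with a test configuration and a mocked AWS client
	// seeded with recorded data (reusing the hypothetical fixture
	// helper sketched in the Proposal section).
	h := StartHarness(t,
		WithConfig("testdata/config.yml"),
		WithMockedAWS(LoadFixture[[]LoadBalancer](t, "testdata/elb.json")),
	)
	defer h.Stop()

	// Read the first event the beat would have sent to Elasticsearch.
	event := <-h.Events()

	// Assert that the complete flow, including OPA, produced the
	// expected finding.
	if got := event.Fields["resource.name"]; got != "test-elb" {
		t.Fatalf("expected a finding for test-elb, got %v", got)
	}
}
```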
This is just one example of how a test might be structured. Other tests might need different implementations, requirements, or testing tools.
References
- Practical Go: Real world advice for writing maintainable Go programs, by Dave Cheney
- A theory of modern Go, by Peter Bourgon
- Learning Go: An Idiomatic Approach to Real-World Go Programming, by Jon Bodner