Currently, the system tests are based on pytest, using a framework that incorporates Testinfra and Ansible playbooks. The testing environment relies on Docker containers, which must be deployed and provisioned with Ansible before the tests are executed. Note that the tests do not verify the correctness of the environment; they simply connect to the instances using a hard-coded inventory stored in the repository.
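That hard-coded inventory is a plain Ansible INI file; the following minimal sketch shows how a test could resolve hosts and connection variables from it (the file layout, group names, and addresses are hypothetical, not the repository's actual inventory):

```python
# Minimal parser for an Ansible INI-style inventory (hypothetical layout).
# In real runs the inventory is handed to Testinfra/Ansible instead.

def parse_ini_inventory(text):
    """Return {group: {host: {var: value}}} from INI inventory text."""
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups[current] = {}
            continue
        host, *pairs = line.split()
        groups.setdefault(current, {})[host] = dict(
            pair.split("=", 1) for pair in pairs
        )
    return groups

example = """
[managers]
wazuh-manager ansible_host=172.18.0.2 ansible_user=root

[agents]
wazuh-agent1 ansible_host=172.18.0.3
"""

inventory = parse_ini_inventory(example)
```

Because the inventory is static, any drift between it and the actually deployed containers goes undetected, which is one of the weaknesses discussed below.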
The full analysis of the current design has been documented in #2440 (comment).
System testing employs Docker containers, but deploying and provisioning them is not part of the pytest run: the playbooks must be launched manually to set up the environment, and once testing is complete the environment must also be removed manually.
```mermaid
flowchart TB
    subgraph Localhost
        Ansible --> Host1
        Ansible --> Host2
        Ansible --> Host3
        Ansible --> ...
        subgraph DockerContainer1
            Host1
        end
        subgraph DockerContainer2
            Host2
        end
        subgraph DockerContainer3
            Host3
        end
        subgraph DockerContainerN
            ...
        end
    end
```
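The manual steps above amount to a fixed sequence of `ansible-playbook` invocations surrounding each test run. A hedged sketch of that sequence (the playbook and inventory file names are illustrative, not the repository's actual files):

```python
# Build the ansible-playbook argv for each manual phase.
# Playbook/inventory names are illustrative only.

def playbook_cmd(playbook, inventory="inventory.ini", extra_vars=None):
    """Return the argv list for one ansible-playbook run."""
    cmd = ["ansible-playbook", playbook, "-i", inventory]
    if extra_vars:
        cmd += ["--extra-vars",
                " ".join(f"{k}={v}" for k, v in extra_vars.items())]
    return cmd

# The three phases surrounding a test run:
deploy = playbook_cmd("deploy_docker_env.yml")
provision = playbook_cmd("provision.yml", extra_vars={"wazuh_version": "4.x"})
teardown = playbook_cmd("remove_docker_env.yml")
```

Nothing ties these invocations to the pytest session, which is why the environment can be stale or absent when the tests run.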
Once the environment is successfully deployed and provisioned, the system tests can be executed. These tests will establish connections to all the deployed instances using the Wazuh QA framework. The framework provides methods to perform basic operations, such as removing files, editing file content, and more, utilizing the testinfra testing framework.
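Conceptually, the framework wraps each remote host behind a small helper that exposes those file operations. The sketch below uses an injected command runner so the idea is visible without a live Testinfra connection (the class and method names are hypothetical, not the actual Wazuh QA framework API):

```python
# Hypothetical host helper: high-level file operations delegated to a
# lower-level command runner (Testinfra/Ansible in the real framework).

class HostHelper:
    def __init__(self, run_command):
        self.run_command = run_command  # callable(argv) -> output

    def remove_file(self, path):
        return self.run_command(["rm", "-f", path])

    def append_to_file(self, path, content):
        return self.run_command(["sh", "-c", f"echo '{content}' >> {path}"])

    def read_file(self, path):
        return self.run_command(["cat", path])

# A recording stub stands in for the real remote executor:
calls = []
host = HostHelper(lambda argv: calls.append(argv))
host.remove_file("/var/ossec/logs/alerts/alerts.log")
```

Injecting the runner keeps the high-level operations testable independently of any deployed environment, which matches the decoupling goal described later in this issue.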
The test structure resembles that of the integration tests, although it is not consistent across all tests.
```mermaid
flowchart TB
    subgraph Localhost
        pytest --> TestInfra/Ansible
        TestInfra/Ansible --> Host1
        TestInfra/Ansible --> Host2
        TestInfra/Ansible --> Host3
        TestInfra/Ansible --> ...
        subgraph DockerContainer1
            Host1
        end
        subgraph DockerContainer2
            Host2
        end
        subgraph DockerContainer3
            Host3
        end
        subgraph DockerContainerN
            ...
        end
    end
```
At present, there is no pipeline or process in place to run these tests from Jenkins. The only viable option is to deploy an EC2 node with the required resources, use it to deploy and provision the environment, and then launch the system tests inside the resulting Docker containers.
```mermaid
flowchart TB
    subgraph Jenkins
        JenkinsNode --> EC2
    end
    subgraph EC2
        subgraph DeployProvision[Deployment and Provision]
            Ansible --> Host1
            Ansible --> Host2
            Ansible --> Host3
            Ansible --> ...
            subgraph DockerEnv
                subgraph DockerContainer1
                    Host1
                end
                subgraph DockerContainer2
                    Host2
                end
                subgraph DockerContainer3
                    Host3
                end
                subgraph DockerContainerN
                    ...
                end
            end
        end
        subgraph TestLaunching
            pytest --> TestInfra/Ansible
            TestInfra/Ansible --> DockerEnv
        end
    end
```
Deployment and Provisioning
System Test Framework
The basic idea is to start from a structure similar to the system tests proposed in the QA repository while addressing their weaknesses. To this end, it is proposed to decouple the tests from the Docker ecosystem so that they can be launched in any environment described by an Ansible inventory. This will let us take advantage of the automation possibilities in Jenkins and run the tests on all the operating systems supported by Wazuh.
Furthermore, there is a plan to improve the current framework by enriching it to include higher-level functions that allow for a more organic interaction with the Wazuh environment. At the same time, additional functionalities will be incorporated into the framework to facilitate development and error reporting, such as a debugging system and an evidence collection fixture.
To achieve these objectives, a set of libraries called WazuhQAEnvironment was created in Jenkins, enabling dynamic deployment and provisioning of instances according to the specific needs of each test. These instances can run any supported operating system and be deployed as EC2 or ECS instances, or as Vagrant instances on local servers.
This will make it possible to have a single process in Jenkins to deploy and provision instances, reducing the complexity of the pipelines and facilitating development.
Once the instances are deployed and provisioned, the system tests specified by pipeline parameters will be launched from the Jenkins node, using a custom inventory generated by the library itself to access the remote instances.
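The generated inventory can be a plain INI file rendered from the instance metadata returned by the deployment step. A minimal sketch of that rendering (the metadata field names are assumptions for illustration, not the library's actual schema):

```python
# Render an Ansible INI inventory from deployed-instance metadata.
# The metadata shape (name/group/ip/user keys) is assumed for illustration.

def render_inventory(instances):
    """instances: list of dicts with name, group, ip and user keys."""
    groups = {}
    for inst in instances:
        groups.setdefault(inst["group"], []).append(
            f'{inst["name"]} ansible_host={inst["ip"]} ansible_user={inst["user"]}'
        )
    lines = []
    for group, hosts in sorted(groups.items()):
        lines.append(f"[{group}]")
        lines.extend(hosts)
        lines.append("")
    return "\n".join(lines)

inventory = render_inventory([
    {"name": "manager1", "group": "managers", "ip": "10.0.0.5", "user": "ec2-user"},
    {"name": "agent1", "group": "agents", "ip": "10.0.0.6", "user": "ec2-user"},
])
```

Generating the inventory from the deployment output removes the hard-coded inventory problem of the current design.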
System tests will utilize an Ansible inventory, which enables the framework to execute remote tasks on the hosts. Additionally, this approach will give developers the ability to easily set up a local environment for developing new tests.
We will conduct the testing using an enhanced version of our existing framework. For system tests, we will utilize an Ansible inventory that provides the necessary environment and credentials to access the instances.
By leveraging this inventory, our framework will employ custom classes built on top of Testinfra and Ansible playbooks. These classes will enable us to execute high-level operations on a Wazuh environment.
The structure of the tests will resemble that of the Integration tests, but with the addition of several key features. Firstly, we will integrate GitHub actions for framework validation, ensuring its reliability. Additionally, we will implement a debugging system and an evidence collector, which will facilitate error reporting and troubleshooting.
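The evidence collector can be as simple as copying a fixed set of artifacts into a per-test directory whenever a test fails. A local, stdlib-only sketch of the idea (in the real framework this would be a pytest fixture fetching files from the remote hosts; all names here are hypothetical):

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical evidence collector: copy listed files into a per-test
# evidence directory for later inspection.

class EvidenceCollector:
    def __init__(self, evidence_root, files):
        self.root = Path(evidence_root)
        self.files = [Path(f) for f in files]

    def collect(self, test_name):
        """Copy every existing evidence file under root/test_name."""
        dest = self.root / test_name
        dest.mkdir(parents=True, exist_ok=True)
        for src in self.files:
            if src.exists():
                shutil.copy(src, dest / src.name)
        return dest

# Demonstration with a temporary stand-in for a remote log file:
workdir = Path(tempfile.mkdtemp())
log = workdir / "alerts.log"
log.write_text("sample alert\n")
collector = EvidenceCollector(workdir / "evidence", [log])
evidence_dir = collector.collect("test_agent_enrollment")
```

Hooking such a collector into a failure-only pytest fixture would give the error-reporting improvement described above without burdening passing runs.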
With these enhancements, we aim to streamline the testing process, improve error detection, and enhance the overall reliability of our framework.
To ensure the efficient deployment and provisioning of instances, we employ the WazuhQAEnvironment class within the JenkinsCI environment. This class serves as a unified tool for deploying and provisioning all the instances required for testing purposes.
For instance deployment, our CI utilizes Terraform, a reliable infrastructure provisioning tool. It leverages Terraform to deploy EC2 and ECS instances, catering to a wide range of supported systems. You can refer to our comprehensive documentation for the complete list of supported systems. In the case of Solaris and macOS instances, our CI utilizes Vagrant to deploy these operating systems on local servers.
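The backend choice described above reduces to a simple dispatch on the target operating system. A hedged sketch (the OS-to-backend mapping reflects this description only, not an exhaustive support matrix):

```python
# Pick a deployment backend per target OS, as described above:
# Terraform drives EC2/ECS for most systems, while Vagrant covers
# Solaris and macOS on local servers. The mapping is illustrative.

VAGRANT_ONLY = {"solaris", "macos"}

def deployment_backend(os_name):
    """Return which tool should deploy an instance of the given OS."""
    return "vagrant" if os_name.lower() in VAGRANT_ONLY else "terraform"
```

Centralizing this decision in one place is what lets a single Jenkins process cover all supported systems.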
The aforementioned deployment logic is implemented separately in the DeployerPipeline repository. This pipeline orchestrates the deployment process, ensuring efficient execution and management of resources.
To provision the environment deployed by the aforementioned WazuhQAEnvironment class, we rely on the wazuh-ansible repository, which provides the components needed to install the various Wazuh elements. Specifically, we use custom branches following the naming convention "production branch name + `-qa`" to incorporate specialized roles or modifications to the Wazuh components, catering to the unique requirements of our testing environment.
```mermaid
graph LR;
    subgraph ci[CI Jenkins]
        subgraph Deploying
            WazuhQAEnvironment-Deployment;
        end
        subgraph Provisioning
            WazuhQAEnvironment-Provision;
        end
        subgraph Testing
            AnsiblePytestTask;
        end
    end
    subgraph Wazuh-Ansible
        WazuhRoles;
    end
    subgraph Wazuh-QA
        QARoles;
        SystemTests
    end
    subgraph instances[AWS Instance]
        n1(Instance1)
        n2(Instance2)
        n3(Instance3)
        nN(InstanceN)
    end
    JenkinsNode --> Deploying --> instances
    Deploying --> DeployInventory
    JenkinsNode --> Provisioning;
    WazuhQAEnvironment-Provision -- DeployInventory --> Wazuh-Ansible -.-> instances
    WazuhQAEnvironment-Provision -.-> QARoles -.-> instances
    WazuhQAEnvironment-Provision --> ProvisionInventory
    AnsiblePytestTask -- ProvisionInventory --> SystemTests --> instances
```
Currently we have some pipelines that could help us deploy instances (the Deployer pipeline) and provision them using wazuh-ansible. Besides, we should create our own Ansible repository so we can add or fix whatever we need for new systems/components without relying on another repository.
```mermaid
graph LR;
    style v3 stroke:#f66,stroke-width:2px,stroke-dasharray: 5 5
    style v4 stroke:#f66,stroke-width:2px,stroke-dasharray: 5 5
    JenkinsNode -->qaenv;
    subgraph env[JenkinsCI]
        subgraph qaenv[WazuhQAEnv]
            dep((Deployer))
            subgraph prov[Provisioning]
                p1([wazuh-ansible])
                p2([wazuh-qa-ansible])
            end
        end
        subgraph test[Testing]
            v1([builtin-tasks])
            v2([custom-tasks])
            v3([custom scripts])
            v4([pytest])
        end
    end
    subgraph instances[AWS Instance]
        n1(Instance1)
        n2(Instance2)
        n3(Instance3)
        nN(InstanceN)
    end
    dep -.-> instances
    prov --> instances
    JenkinsNode --> test;
    test --> instances
```
Terratest is a testing framework that is primarily optimized for Terraform. However, it can also be integrated (without Terraform) with Docker, AWS, Azure, Kubernetes, and other platforms. Ideally, Terratest should be integrated with Terraform for performing tests effectively.
Terraform is a powerful tool that enables infrastructure provisioning and management as code, with strong community support. After reviewing the documentation and conducting some small tests, the following conclusions can be drawn:
To determine if this approach would work, it is recommended to conduct a proof of concept (POC) for a specific use case. So far, the following tests have been conducted and achievements have been made:
Note from @Rebits:
We have decided to discard this option due to the following reasons:
KitchenCI is a software development and infrastructure test automation tool. It is designed to simplify the writing and execution of integration tests and verification of infrastructure configurations.
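For reference, a minimal `.kitchen.yml` along these lines might look like the following. This is an illustrative sketch, not a tested setup: the driver, platform names, and playbook path are assumptions, and the `ansible_playbook` provisioner comes from the third-party `kitchen-ansible` plugin.

```yaml
# Illustrative Test Kitchen configuration (untested sketch).
driver:
  name: vagrant            # kitchen-ec2 would target AWS instead

provisioner:
  name: ansible_playbook   # provided by the kitchen-ansible plugin
  playbook: provision.yml  # hypothetical playbook name

platforms:
  - name: ubuntu-22.04
  - name: centos-8

suites:
  - name: default
```

The declarative suites/platforms matrix is what provides the reproducibility and multi-environment benefits listed below.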
Consistent and reproducible deployments: KitchenCI allows you to define infrastructure configurations and provisioning steps in a declarative manner. This ensures that your deployments are consistent across different environments, reducing the risk of configuration drift and ensuring reproducibility.
Automated provisioning: With KitchenCI, you can automate the provisioning of instances on various platforms such as AWS, Azure, or Vagrant. This saves time and effort by eliminating manual setup and configuration steps.
Integration with configuration management tools: KitchenCI integrates seamlessly with popular configuration management tools like Ansible. This enables you to provision instances and apply desired configurations using familiar tools and workflows.
Multi-environment support: KitchenCI supports testing and provisioning across multiple environments, such as development, staging, and production. You can define different suites or scenarios for each environment, allowing you to validate configurations and deployments before promoting them to production.
Fast feedback loop: KitchenCI enables you to quickly iterate on your infrastructure configurations and deployments. By automating the provisioning process and running tests in isolated environments, you can receive feedback on the success or failure of your deployments rapidly, enabling faster troubleshooting and refinement.
Scalability and flexibility: KitchenCI allows you to define and manage multiple instances in parallel, making it suitable for testing complex distributed systems or multi-node setups. You can easily scale your testing infrastructure as needed and run tests concurrently across different environments.
Infrastructure requirements: KitchenCI relies on the availability of infrastructure resources such as virtual machines, remote hosts or containers to provision instances and run tests. This means you need to ensure the availability and management of these resources, which can add complexity and overhead to your testing environment.
Maintenance overhead: As your infrastructure evolves and configurations change, you need to maintain and update your KitchenCI configuration files accordingly. This can introduce additional overhead in terms of managing and updating the provisioning scripts, test suites, and configuration files.
Platform-specific considerations: Different platforms and providers may have their own quirks and limitations that you need to be aware of when using KitchenCI for deployment and provisioning. You may need to customize your configurations or scripts to accommodate platform-specific requirements.
Limited community, support, and resources: when you need something non-standard you may have to use a custom provisioner, for example, which may not have official support. This leads to version limitations and errors that cannot be resolved because there is no information about them.
Dependency on external tools: KitchenCI relies on external tools such as configuration management tools, provisioning tools, and test frameworks. This introduces dependencies on these tools and their compatibility with different platforms, which can sometimes lead to versioning or compatibility issues.
Note from @Rebits:
We have decided to discard this option due to the following reasons:
Meeting with @fcaffieri about the current approach to the system testing ecosystem. This issue is pending an update with the conclusions and the improvements suggested in the meeting.
During the meeting with @fcaffieri, we identified several issues that need to be addressed with our current approach:
Addressing wazuh-ansible: We need to decide how to handle wazuh-ansible. We can consider options such as integrating it into our QA process, creating a separate repository, or incorporating it into Jenkins. We should have a discussion with the team to determine the best course of action.
Challenges with repository sharing: We are facing difficulties when sharing the repository with other teams. To resolve this, we can explore sharing the necessary tools or copying them into the QA module to ensure smoother collaboration.
Managing Jenkins saturation: We need to tackle the issue of Jenkins becoming overloaded. With the addition of more testing processes, it is crucial to find a solution to prevent Jenkins from being overwhelmed.
In addition to addressing these issues, we discussed some valuable enhancements for our current framework:
Creation of a user-friendly web server: It would be beneficial to develop a web server or a similar solution that provides a more user-friendly interface for executing pipelines. This platform could also incorporate documentation and enable dynamic customization of test parameters, enhancing our testing capabilities.
Implementation of a Kibana dashboard: To better track and analyze our CI processes (nightly, weekly, release), we propose building a Kibana dashboard. This dashboard would record the reports generated by our tests and allow for easy comparison of results, including footprint, performance, and more, across different versions of our software.
By addressing these issues and implementing these enhancements, we can improve the efficiency and effectiveness of our development and testing processes.
Description
System tests are a type of software testing performed on a complete and integrated system to evaluate its compliance with specified requirements. These tests focus on verifying that the system functions as expected and that all its components work together correctly. The purpose of system testing is to validate the system's behavior and performance in a real-world environment.
Regarding Wazuh testing, these system tests will differ from the E2E tests in that they will not test the indexer or the dashboard. Instead, they will solely focus on testing the manager and agent components, as well as their integration with cloud environments.
Examples of system tests include verifying the contents of files such as `alerts.log`.
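As an illustration, an `alerts.log` assertion reduces to scanning the collected log text for the expected event. A minimal stdlib sketch (the alert format shown is simplified for illustration, not the exact Wazuh alert schema):

```python
# Check that an expected rule ID shows up in collected alerts.log content.
# The log format below is simplified for illustration.

def alert_present(log_text, rule_id):
    """Return True if any alert line mentions the given rule ID."""
    needle = f"Rule: {rule_id}"
    return any(needle in line for line in log_text.splitlines())

sample_log = """\
** Alert 1700000000.123: - syslog,
Rule: 5710 fired (level 5) -> "sshd: Attempt to login using a non-existent user"
"""
```

In a real system test, the log text would be fetched from the manager host through the framework's file-reading helpers before the assertion runs.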
Given all these characteristics and requirements, it is necessary to investigate the current state of system testing, the approach adopted by the ongoing development, and other possible solutions to implement these system tests.
To Do