wazuh / wazuh-qa

Wazuh - Quality Assurance
GNU General Public License v2.0

Performance for Vulnerability Detection module in clustered environments #5313

Closed Rebits closed 1 week ago

Rebits commented 2 weeks ago

Description

This issue is dedicated to conducting a thorough performance analysis of two proposed development approaches:

The objective is to perform performance tests and compare the results of both approaches. This comparative analysis will provide a comprehensive understanding of the potential impact on the product.

Test environment

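| Component | Quantity | Operating System | CPU (cores) | RAM (GB) | Disk (GB) |
|---|---|---|---|---|---|
| Master | 1 | Ubuntu 22 | 4 | 8 | 50 |
| Workers | 2 | Ubuntu 22 | 4 | 8 | 50 |
| Agent 1 | 1 | Ubuntu 22 | 2 | 4 | 30 |
| Agent 2 | 1 | Windows 11 | 2 | 4 | 30 |
| Load Balancer | 1 | Ubuntu 22 | 4 | 8 | 50 |
| Indexers | 2 | Ubuntu 22 | 2 | 4 | 30 |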

[!NOTE] The load balancer is located on the master node.

23058 Development Packages

| Architecture | Framework development package URL |
|---|---|
| DEB | 4.8.0-python.vd.spike.deb.1 |
| RPM | 4.8.0-python.vd.spike.rpm.1 |

22867 Development Packages

| Architecture | Core development package URL |
|---|---|
| DEB | 4.8.0-0.commitd31b277 |
| RPM | 4.8.0-0.commitd31b277 |

Test Cases

Testing

Automatic

Methodology

The CLUSTER-Workload_benchmarks_metrics pipeline will be used to execute the specified test cases automatically. Results will be analyzed manually and shared with the development team for validation and adjustments.

Test Cases

| Case | Description | Number of Agents | EPS | Frequency | Number of Vulnerable Packages | Time |
|---|---|---|---|---|---|---|
| Minimum Activity | Simulate a small, stable environment with low activity | 10 | 10 | 600 | 100 | 3h |
| Medium Activity | Simulate a medium-sized environment with moderate activity | 50 | 10 | 300 | 100 | 3h |
| High Activity | Simulate a large-scale environment with significant activity | 200 | 50 | 60 | 100 | 3h |
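As a rough sanity check on the workloads in the table above, the sketch below derives how many vulnerability batches each case would produce. It rests on assumptions the issue does not state: that Frequency is the number of seconds between batches and that every agent reports the full package set in each batch. The `WorkloadCase` class and its method names are hypothetical and are not part of the pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadCase:
    """One row of the test case table (hypothetical helper, not pipeline code)."""
    name: str
    agents: int
    eps: int
    frequency_s: int          # ASSUMPTION: Frequency is expressed in seconds
    vulnerable_packages: int
    duration_h: int

    def cycles(self) -> int:
        """Number of complete batches that fit in the test duration."""
        return (self.duration_h * 3600) // self.frequency_s

    def package_reports(self) -> int:
        """ASSUMPTION: every agent reports every package once per batch."""
        return self.agents * self.vulnerable_packages * self.cycles()


CASES = [
    WorkloadCase("Minimum Activity", 10, 10, 600, 100, 3),
    WorkloadCase("Medium Activity", 50, 10, 300, 100, 3),
    WorkloadCase("High Activity", 200, 50, 60, 100, 3),
]

if __name__ == "__main__":
    for case in CASES:
        print(f"{case.name}: {case.cycles()} cycles, "
              f"{case.package_reports()} package reports")
```

Under these assumptions the High Activity case produces ten times as many batches as the Minimum Activity case with twenty times as many agents, which gives a sense of how quickly the load grows between cases.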

Manual

Methodology

Customizing the set of vulnerable packages is not feasible in automatic testing. Therefore, manual testing will utilize a larger set of 10,000 vulnerabilities to identify any potential instability in environments with a high vulnerability count. The following Wazuh-QA tools will be employed for manual performance analysis:

Test Cases

| Case | Description | Number of Agents | EPS | Frequency | Number of Vulnerable Packages | Time |
|---|---|---|---|---|---|---|
| High Vulnerability Environment | Simulate an intermediate-sized environment with high vulnerability | 10 | 10 | 60 | 10,000 | 3h |

Conclusion :red_circle:

New Issues

Known issues

[!NOTE] Manual performance testing and the Minimum Activity and High Activity test cases have not been performed. More information in https://github.com/wazuh/wazuh-qa/issues/5313#issuecomment-2100349272

Rebits commented 1 week ago

Automatic

Rebits commented 1 week ago

The Minimum Activity and High Activity performance tests failed with a "No space left on device" error. Reported in https://github.com/wazuh/wazuh-jenkins/issues/6475

22:03:52  
22:03:52  TASK [Copy ossec.log file to data files] ***************************************
22:03:52  fatal: [CLUSTER-Workload_benchmarks_metrics_B510_manager_2]: UNREACHABLE! => {
22:03:52      "changed": false,
22:03:52      "unreachable": true
22:03:52  }
22:03:52  
22:03:52  MSG:
22:03:52  
22:03:52  Warning: Permanently added '172.31.3.110' (ECDSA) to the list of known hosts.

22:03:52  mkdir: cannot create directory ‘/tmp/ansible-tmp-1715115832.7137516-30912-167679972105845’: No space left on device
22:03:52  
22:03:53  fatal: [CLUSTER-Workload_benchmarks_metrics_B510_manager_1]: UNREACHABLE! => {
22:03:53      "changed": false,
22:03:53      "unreachable": true
22:03:53  }
22:03:53  
22:03:53  MSG:
22:03:53  
22:03:53  Warning: Permanently added '172.31.4.31' (ECDSA) to the list of known hosts.

22:03:53  mkdir: cannot create directory ‘/tmp/ansible-tmp-1715115832.724964-30911-242038256013694’: No space left on device

Only the Medium Activity performance tests finished successfully. Build: https://ci.wazuh.info/job/CLUSTER-Workload_benchmarks_metrics/511/
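The "No space left on device" failures above surfaced only when Ansible tried to create a temporary directory at the end of the run. A pre-flight check along the following lines could flag the condition earlier. This is a minimal sketch: the function names and the threshold are hypothetical, and nothing here reflects how the pipeline is actually implemented. It relies only on `shutil.disk_usage` from the Python standard library.

```python
import shutil


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True when the filesystem holding `path` has enough free space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes


def require_free_space(path: str, min_free_bytes: int) -> None:
    """Raise before the test starts instead of failing hours into the run."""
    if not has_free_space(path, min_free_bytes):
        free = shutil.disk_usage(path).free
        raise RuntimeError(
            f"only {free} bytes free on {path}, need {min_free_bytes}"
        )


if __name__ == "__main__":
    # ASSUMPTION: 1 GiB is an illustrative threshold, not a measured need.
    require_free_space("/tmp", 1024 ** 3)
```

Running the same check periodically during the three-hour test would also show how fast the disk fills, which may help size the hosts.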

Rebits commented 1 week ago

Medium Activity :red_circle:

Build: https://ci.wazuh.info/job/CLUSTER-Workload_benchmarks_metrics/511/

Report: Artifact.zip

Logs :red_circle:

Summary

Master :yellow_circle:

Worker 1 :red_circle:

Worker 2 :yellow_circle:

Indexer 1 :green_circle:

No warnings or errors

Indexer 2 :green_circle:

No warnings or errors


Metrics :red_circle:

Summary

Master :green_circle:

Metrics ![CPU](https://github.com/wazuh/wazuh-qa/assets/11089305/4d248bc5-0742-41ac-8fc1-cd692b205138) ![Disk_Read](https://github.com/wazuh/wazuh-qa/assets/11089305/08b1383e-64bb-499e-beea-cfbccbb06b45) ![Disk_Read_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/3d95ad26-d424-4c6a-848d-4988c81f1f71) ![Disk_Write_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/95c4e10b-84c8-4052-b3c4-d51f0079a529) ![Disk_Written](https://github.com/wazuh/wazuh-qa/assets/11089305/f6148307-c6ee-418b-8510-26626c1020ff) ![FD](https://github.com/wazuh/wazuh-qa/assets/11089305/12f7e8ce-05fc-42e0-9357-41b51d5a24b8) ![PSS](https://github.com/wazuh/wazuh-qa/assets/11089305/cc62b19a-3b92-4f33-a46e-22fc2910b7a0) ![Read_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/ed7619ce-de58-4fc3-8587-1aab5f820a0d) ![RSS](https://github.com/wazuh/wazuh-qa/assets/11089305/48afbbde-8fd3-4d77-a8f2-94fd5349d95e) ![SWAP](https://github.com/wazuh/wazuh-qa/assets/11089305/ec161e62-5435-46d0-8bf0-68dfcd53aea0) ![USS](https://github.com/wazuh/wazuh-qa/assets/11089305/3b8b7f6e-e3f3-4808-87d0-22acb7699770) ![VMS](https://github.com/wazuh/wazuh-qa/assets/11089305/ad0ff505-fda3-40e6-a4e1-76bc81a6ac87) ![Write_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/dc2c2a17-1bd1-43e9-ac97-4a02e2ffea58)

Worker 1 :red_circle:

Metrics ![CPU](https://github.com/wazuh/wazuh-qa/assets/11089305/6a857f50-dc4f-418f-9758-8a912c70087e) ![Disk_Read](https://github.com/wazuh/wazuh-qa/assets/11089305/5e758eac-f295-4355-ad5e-a16695ff9dfa) ![Disk_Read_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/4d5221fd-5e2c-4334-8b88-a1d5afd69ef3) ![Disk_Write_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/983ba4bd-a20f-4688-b31e-3d8d0e9cdc37) ![Disk_Written](https://github.com/wazuh/wazuh-qa/assets/11089305/42d13d72-bd9a-4540-9930-ea485cacc0b1) ![FD](https://github.com/wazuh/wazuh-qa/assets/11089305/7ecf45fd-9eeb-44a3-9a25-6868475dee6f) ![PSS](https://github.com/wazuh/wazuh-qa/assets/11089305/38039e8a-047f-4622-9f96-9f2688433bec) ![Read_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/881f7bbd-2d25-4482-bf93-c264950a6db3) ![RSS](https://github.com/wazuh/wazuh-qa/assets/11089305/837bdce4-97a7-466f-bed0-b5f5170506c3) ![SWAP](https://github.com/wazuh/wazuh-qa/assets/11089305/7d2c8545-e518-40be-8326-e77c8869cccf) ![USS](https://github.com/wazuh/wazuh-qa/assets/11089305/78417a04-55d3-48d1-8cbd-e6030a72cd02) ![VMS](https://github.com/wazuh/wazuh-qa/assets/11089305/b687c128-86c8-4b04-99b2-ef047b9c9d23) ![Write_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/5a65edb6-98d6-4e3c-9604-1726d8113ebe)

Worker 2 :red_circle:

Metrics ![CPU](https://github.com/wazuh/wazuh-qa/assets/11089305/92b8da0c-28e7-41ad-985b-ebdaa3413f55) ![Read_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/4bf80760-05a2-4924-a33e-fcb024593aa1) ![RSS](https://github.com/wazuh/wazuh-qa/assets/11089305/420c9f5b-3c27-40a0-b28a-47dee92cfc6c) ![SWAP](https://github.com/wazuh/wazuh-qa/assets/11089305/e2fa2028-285a-47e4-bd87-6fd582da58ce) ![USS](https://github.com/wazuh/wazuh-qa/assets/11089305/1fcad186-9d0e-4aae-a52c-4bc9226c1d92) ![VMS](https://github.com/wazuh/wazuh-qa/assets/11089305/3601ce31-e8a1-41a2-a9c1-6baaed279042) ![Write_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/2a3275d3-1ee4-48af-be2c-36f2fc1a6cfe) ![Disk_Read](https://github.com/wazuh/wazuh-qa/assets/11089305/88328f02-0ae2-4dc5-bcaa-1d66858283b0) ![Disk_Read_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/9be3173c-fc7f-4ffb-a206-ed3d678d6507) ![Disk_Write_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/2fdbae2f-6358-40ea-9a97-83a865eaa594) ![Disk_Written](https://github.com/wazuh/wazuh-qa/assets/11089305/ff4bcb65-d810-4208-9ae2-4bdaafb452cb) ![FD](https://github.com/wazuh/wazuh-qa/assets/11089305/5d5fdedd-85e2-4284-b1e1-1e19afe4b16a) ![PSS](https://github.com/wazuh/wazuh-qa/assets/11089305/7dfaf325-d508-49b7-bd5f-de849b182d08)

Indexer 1 :green_circle:

No abnormal behavior detected

Metrics ![CPU](https://github.com/wazuh/wazuh-qa/assets/11089305/950421f2-6f77-49cb-b9fd-71916a4d8e1d) ![Disk_Read](https://github.com/wazuh/wazuh-qa/assets/11089305/da1c0e26-3997-4f35-bb95-c0b8afafe2e0) ![Disk_Read_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/41836d0f-2738-4d84-a86d-8549651d57b6) ![Disk_Write_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/136829e7-32ee-4478-aff5-7daa9235eeba) ![Disk_Written](https://github.com/wazuh/wazuh-qa/assets/11089305/4793e479-a09c-485d-84c7-e2b269cb243f) ![FD](https://github.com/wazuh/wazuh-qa/assets/11089305/4218b259-cd72-4d87-9738-670f034f45ce) ![PSS](https://github.com/wazuh/wazuh-qa/assets/11089305/851256d9-ea3d-416a-9b3d-565745b714e2) ![Read_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/cd77d750-e1b2-42d4-9023-bdd1b6c9e0e2) ![RSS](https://github.com/wazuh/wazuh-qa/assets/11089305/0cfe10bf-6755-4b29-8ca4-ee13d565f199) ![SWAP](https://github.com/wazuh/wazuh-qa/assets/11089305/f46712c5-34fa-4b23-a84d-1f1c6e3d2dd1) ![USS](https://github.com/wazuh/wazuh-qa/assets/11089305/20dee659-6699-4ca4-b566-36f8f1f82405) ![VMS](https://github.com/wazuh/wazuh-qa/assets/11089305/355268ac-f7f5-4ddb-82f2-c5645855acdb) ![Write_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/20748e73-0566-4c4f-9301-526f2f060f35)

Indexer 2 :green_circle:

No abnormal behavior detected

Metrics ![CPU](https://github.com/wazuh/wazuh-qa/assets/11089305/cd103c54-f6ea-4b42-9a9b-e656e0d74213) ![Disk_Read](https://github.com/wazuh/wazuh-qa/assets/11089305/c0700f96-2028-47f0-a0cb-8f5d65800e52) ![Disk_Read_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/5ab2ec6a-6cd8-45e8-9470-802ca875b545) ![Disk_Write_Speed](https://github.com/wazuh/wazuh-qa/assets/11089305/d3a0bd85-b4ae-495f-b7cf-b3b1ffbc68e7) ![Disk_Written](https://github.com/wazuh/wazuh-qa/assets/11089305/0f15752c-cfb6-4bfc-abb3-28451bfdade2) ![FD](https://github.com/wazuh/wazuh-qa/assets/11089305/8bb8018a-7502-4f1d-a4a6-d83dbda6b2fb) ![PSS](https://github.com/wazuh/wazuh-qa/assets/11089305/67aed832-c40d-4c6f-bd5c-e78a25242b98) ![Read_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/c2e80b80-c678-4fa2-b0be-3f9fd416cc75) ![RSS](https://github.com/wazuh/wazuh-qa/assets/11089305/088b12b9-38ec-4d5d-9619-25a3a4e04a76) ![SWAP](https://github.com/wazuh/wazuh-qa/assets/11089305/4a7d789d-8591-40d2-b831-9b61e5978a06) ![USS](https://github.com/wazuh/wazuh-qa/assets/11089305/7b8d74d6-de33-416b-b8ed-463e98f85090) ![VMS](https://github.com/wazuh/wazuh-qa/assets/11089305/950e4930-2a63-418a-bd23-bfe39a915d76) ![Write_Ops](https://github.com/wazuh/wazuh-qa/assets/11089305/9aa19181-53e0-47dd-babd-0c8e6764511a)

Statistics :green_circle:

Vulnerabilities State :green_circle:

The vulnerability generator module, used by the simulate agents script, is designed to send 100 vulnerable packages to the manager and then confirm their removal. This behavior appears as a wave-shaped plot that reaches a peak in each repetition once all vulnerabilities have been processed.

The plot shows that the indexer connector does not match the expected ideal curve, although the simulator itself is performing as intended.

total_vulnerabilities

It would be highly beneficial to add tests that check whether the number of vulnerabilities matches expectations at specific points during the test.
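One possible shape for such a check is sketched below. It takes a time series of `total_vulnerabilities` samples and compares the value at chosen checkpoints against an expected count. The sample format, the function names, and the expected value of 5000 (50 agents times 100 packages, assuming one vulnerability per package) are all assumptions made for illustration; the sample data is invented and is not taken from the build.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]      # (seconds since test start, total_vulnerabilities)
Checkpoint = Tuple[float, int]  # (seconds since test start, expected count)


def value_at(samples: Sequence[Sample], t: float) -> int:
    """Return the last sampled value at or before time t (samples sorted by time)."""
    value = None
    for ts, count in samples:
        if ts > t:
            break
        value = count
    if value is None:
        raise ValueError(f"no sample at or before t={t}")
    return value


def failed_checkpoints(
    samples: Sequence[Sample],
    checkpoints: Sequence[Checkpoint],
    tolerance: int = 0,
) -> List[Tuple[float, int, int]]:
    """Return (t, expected, observed) for every checkpoint outside the tolerance."""
    failures = []
    for t, expected in checkpoints:
        observed = value_at(samples, t)
        if abs(observed - expected) > tolerance:
            failures.append((t, expected, observed))
    return failures


if __name__ == "__main__":
    # Invented data: the second peak falls short of the expected 5000.
    samples = [(0, 0), (150, 5000), (300, 0), (450, 4200), (600, 0)]
    checkpoints = [(150, 5000), (300, 0), (450, 5000)]
    print(failed_checkpoints(samples, checkpoints))
```

A non-zero `tolerance` leaves room for indexing delay, so that a peak reached slightly late is not reported as a failure.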


Alerts :green_circle:

We anticipate that the alerts generated by both the workers and the manager should correspond with the indexed alert values. Nonetheless, there appears to be a discrepancy:

combined_and_new_total_alerts

Because of the high activity levels, some variance between the written alerts and the indexed alerts is expected. However, it would be useful to add tests that track this gap, so that it can be reduced over time and the environment stabilized.
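A simple way to express the expected variance is a relative tolerance between the number of alerts written by the managers and the number indexed. The sketch below assumes both totals are already available as integers; the function names and the 1% default threshold are hypothetical and would need tuning against real runs.

```python
def relative_gap(written: int, indexed: int) -> float:
    """Fraction of written alerts that is not reflected in the indexed total."""
    if written == 0:
        return 0.0 if indexed == 0 else float("inf")
    return abs(written - indexed) / written


def alerts_converged(written: int, indexed: int, max_gap: float = 0.01) -> bool:
    """True when the gap is within the accepted tolerance (ASSUMPTION: 1%)."""
    return relative_gap(written, indexed) <= max_gap


if __name__ == "__main__":
    # Invented totals, for illustration only.
    print(alerts_converged(written=1000, indexed=995))
    print(alerts_converged(written=1000, indexed=900))
```

Evaluating the gap some time after the agents stop sending events, rather than during the load, separates normal indexing lag from alerts that are never indexed.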


Evidence collection :red_circle:

The following errors were detected in the evidence-collection capabilities of the pipeline:

Rebits commented 1 week ago

Following a discussion with @juliamagan, we have decided not to repeat the failed High Activity and Minimum Activity performance tests. Instead, these tests will be relaunched in RC2.

MARCOSD4 commented 1 week ago

GJ, but the graphs of the indexer 1 metrics cannot be displayed, perhaps because of an error in writing the comment.

MARCOSD4 commented 1 week ago

LGTM

juliamagan commented 1 week ago

LGTM