wazuh / wazuh-qa

Wazuh - Quality Assurance
GNU General Public License v2.0

Inconsistent results in Vulnerability Detection performance tests #5565

Closed Rebits closed 4 weeks ago

Rebits commented 1 month ago

Description

The Release 4.8.1 - RC 2 - Vulnerability Detection performance test has identified unexpected discrepancies in binary statistics between iterations due to several factors:

Tasks

Validation

rafabailon commented 1 month ago

Pipeline Investigation

Jenkins

The Jenkins pipeline has several monitoring functions, all of which call Ansible playbooks.

Pipeline Code: test_cluster.groovy

The monitoring functions are launchManagerMonitor, launchIndexerMonitor and launchDashboardMonitor. The Ansible Playbooks they use are manager_monitor.yaml, indexer_monitor.yaml and dashboard_monitoring.yaml respectively.

The playbooks simply activate the available monitoring tools. These are the two Ansible tasks used to activate monitoring:

- name: Start wazuh-metrics monitor
  command: "wazuh-metrics {{ metrics_parameters }}"
  environment:
    PATH: "/usr/local/bin:${PATH}"
  async: "{{ monitor_time }}"
  poll: 0
  when: launch_metrics_monitor

- name: Start wazuh-statistics monitor
  command: "wazuh-statistics {{ stats_parameters }}"
  environment:
    PATH: "/usr/local/bin:${PATH}"
  async: "{{ monitor_time }}"
  poll: 0
  when: launch_stats_monitor
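As a rough illustration of what `async` combined with `poll: 0` means in the tasks above (start the command, do not wait for it, and bound its runtime), here is a minimal Python sketch. It is only an analogue: `launch_in_background` and its arguments are hypothetical names, not part of the pipeline or of Ansible.

```python
import subprocess
import sys
import threading


def launch_in_background(command, max_runtime):
    """Start `command` without waiting for it and stop it after `max_runtime` seconds.

    Hypothetical helper: a rough analogue of an Ansible task that sets
    `async: <max_runtime>` and `poll: 0`.
    """
    proc = subprocess.Popen(command)

    def stop_if_running():
        # Only kill the process if it is still alive when the time budget expires.
        if proc.poll() is None:
            proc.kill()

    timer = threading.Timer(max_runtime, stop_if_running)
    timer.daemon = True  # do not keep the interpreter alive just for the timer
    timer.start()
    return proc, timer
```

The caller gets control back immediately, which is why the pipeline can start the monitors and carry on with the test while they collect data.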

QA

The monitoring that is activated in Ansible Playbooks makes use of wazuh-metrics and wazuh-statistics, which are in the wazuh-qa repository.

For the monitoring itself, there are two Monitor classes depending on what is intended to be monitored.

Several options can be configured, but these are the only ones used in the pipeline.

String metrics_parameters = "-p ${metrics_process_name} -v ${_build_parameters.PKG_VERSION} -s ${monitor_sleep} " +
                            "-u ${data_unit} --store ${_global_parameters['binaries_store_path']}"
String stats_parameters = "-t ${stats_names} -s ${monitor_sleep} --store ${_global_parameters['stats_store_path']}"

This is the meaning of each of the options:

The values that can be modified in the pipeline do not seem to be related to the problem described in the issue. The parameters that could change the monitor's behavior have preset values.
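To make the role of the sleep interval concrete, the following is a minimal sketch of a periodic sampler. It is not the wazuh-qa Monitor implementation; `sample_periodically`, `read_sample` and the injectable `clock`/`sleep` arguments are assumptions made for illustration.

```python
import time


def sample_periodically(read_sample, sleep_seconds, duration_seconds,
                        clock=time.monotonic, sleep=time.sleep):
    """Call `read_sample()` every `sleep_seconds` until `duration_seconds` have elapsed.

    Returns a list of (elapsed_seconds, sample) tuples. `clock` and `sleep`
    can be replaced with fakes so the loop can be checked without waiting.
    """
    start = clock()
    samples = []
    while True:
        elapsed = clock() - start
        if elapsed >= duration_seconds:
            break
        samples.append((elapsed, read_sample()))
        sleep(sleep_seconds)
    return samples
```

With a fixed interval and duration, the number of samples and their spacing are the same on every run, which is consistent with the observation that preset parameters are unlikely to explain differences between iterations.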

rafabailon commented 1 month ago

On Hold due to Release 4.8.1 - RC 3

rafabailon commented 1 month ago

Pipeline Tests

The next step is to run the pipeline several times and compare the results to see if there are variations.
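One way to make such a comparison less dependent on reading graphs by eye is to summarize each build's series and measure how far the summaries drift apart. The sketch below assumes the CPU or RSS samples of each build are available as plain lists of numbers; the function names and the 10% tolerance are hypothetical choices, not values used by the pipeline.

```python
from statistics import mean


def summarize(samples):
    """Reduce one build's series to a few comparable numbers."""
    return {"mean": mean(samples), "max": max(samples)}


def relative_spread(values):
    """(max - min) / mean of `values`; 0.0 means all builds agree."""
    avg = mean(values)
    return (max(values) - min(values)) / avg if avg else 0.0


def compare_builds(series_by_build, tolerance=0.10):
    """Report, per statistic, the spread across builds and whether it is within tolerance."""
    summaries = {build: summarize(s) for build, s in series_by_build.items()}
    report = {}
    for stat in ("mean", "max"):
        spread = relative_spread([summary[stat] for summary in summaries.values()])
        report[stat] = {"spread": spread, "consistent": spread <= tolerance}
    return report
```

A call such as `compare_builds({"582": cpu_582, "583": cpu_583, "584": cpu_584})` (with the lists loaded from the stored statistics) would flag a build whose mean or peak usage departs from the others.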

Analysis

[Three comparison tables, each with one column per build (582, 583, 584) and one row each for the CPU and RSS graphs of a monitored binary. The graph images are not preserved.]

Conclusions

I have reviewed the graphs and built comparison tables to see the differences between the three builds. As the tables show, there are no significant differences in monitoring: the graphs follow a similar pattern, and monitoring begins when the pipeline indicates.

For example, these are the times at which monitoring begins according to the pipeline:

| Monitor | First Build | Second Build | Third Build |
| --- | --- | --- | --- |
| Manager | 12:11:46 (10:11:46) | 13:13:05 (11:13:05) | 14:15:08 (12:15:08) |
| Dashboard | 12:11:47 (10:11:47) | 13:13:04 (11:13:04) | 14:15:07 (12:15:07) |
| Indexer | 12:11:46 (10:11:46) | 13:13:04 (11:13:04) | 14:15:08 (12:15:08) |

The monitor start times match the monitoring start times on the graphs.
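The start times in the table above can also be checked numerically. This small sketch computes, for each build, the gap between the earliest and latest monitor start; the dictionary layout and the function name are illustrative, and only the first timestamp of each cell is used.

```python
from datetime import datetime

# Start times copied from the table above (first timestamp of each cell).
START_TIMES = {
    "First Build": {"Manager": "12:11:46", "Dashboard": "12:11:47", "Indexer": "12:11:46"},
    "Second Build": {"Manager": "13:13:05", "Dashboard": "13:13:04", "Indexer": "13:13:04"},
    "Third Build": {"Manager": "14:15:08", "Dashboard": "14:15:07", "Indexer": "14:15:08"},
}


def start_spread_seconds(times_by_monitor):
    """Seconds between the earliest and the latest monitor start in one build."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times_by_monitor.values()]
    return (max(parsed) - min(parsed)).total_seconds()


spreads = {build: start_spread_seconds(times) for build, times in START_TIMES.items()}
```

For these values the spread is 1 second in each of the three builds, so the three monitors start essentially together every time.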

rafabailon commented 1 month ago

Example Builds

I have reviewed the example builds in the issue. First, I checked the parameters used in the pipeline, and they are mostly the same as the ones I used in my tests. Those tests show that, with the same parameters, the results are similar.

I have reviewed the monitoring as well. The timestamps that appear in the build match the timestamps on the graphs. Monitoring is activated correctly and begins its activity immediately. The same behavior can be seen in the tests I carried out.

There does not appear to be a relationship between monitoring and the detected variations.

MARCOSD4 commented 1 month ago

LGTM

juliamagan commented 4 weeks ago

LGTM