wazuh / wazuh-qa

Wazuh - Quality Assurance
GNU General Public License v2.0

Increase E2E vulnerability detection scans timeout #5699

Closed Rebits closed 4 days ago

Rebits commented 2 weeks ago

Description

Several errors were detected in the Release 4.9.0 - RC 1 - Vulnerability Detection E2E tests.

After research (https://github.com/wazuh/wazuh/issues/25363#issuecomment-2309806230), it was concluded that these errors were caused by a regression in the indexer's response times. It is therefore necessary to increase the timeout of the initial scan tests and the per-agent timeout for collecting vulnerabilities in the syscollector case.

Tasks

Validation

Conclusion

https://github.com/wazuh/wazuh-qa/issues/5699#issuecomment-2326993520

Rebits commented 2 weeks ago

Increased timeout to:

PACKAGE_VULNERABILITY_SCAN_TIME = 150
TIMEOUT_PER_AGENT_VULNERABILITY_FIRST_SCAN = PACKAGE_VULNERABILITY_SCAN_TIME * 4
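
To make the effect of these values concrete, here is a minimal sketch, assuming the constants are expressed in seconds (the thread does not state the unit). It shows the resulting per-agent first-scan timeout and a generic deadline-based polling loop. The `wait_until` helper and its parameters are hypothetical and are not part of the wazuh-qa framework.

```python
import time

# Values from the comment above; the unit (seconds) is an assumption.
PACKAGE_VULNERABILITY_SCAN_TIME = 150
TIMEOUT_PER_AGENT_VULNERABILITY_FIRST_SCAN = PACKAGE_VULNERABILITY_SCAN_TIME * 4  # 600


def wait_until(condition, timeout, interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only.
    Returns True if the condition was met, False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test would call something like `wait_until(scan_finished, TIMEOUT_PER_AGENT_VULNERABILITY_FIRST_SCAN)`, where `scan_finished` is whatever check the test uses for a given agent.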

Currently testing the new timeout: https://ci.wazuh.info/job/Test_e2e_system/349/. On hold until the tests finish.

Rebits commented 2 weeks ago

On hold because no macOS instances are available: https://github.com/wazuh/wazuh/issues/25345

Rebits commented 2 weeks ago

Build: https://ci.wazuh.info/job/Test_e2e_system/357/
Report: Test_e2e_system_357_test_vulnerability_detector(1).zip

Analysis

The issue seems to persist after increasing the timeout to the values specified in https://github.com/wazuh/wazuh-qa/issues/5699#issuecomment-2309911774. Contrary to the research done in https://github.com/wazuh/wazuh/issues/25363#issuecomment-2309806230, the test timeout does not appear to be the cause: even with a timeframe of more than 15 minutes, the vulnerabilities are not correctly indexed into the vulnerability states.

To verify whether this is a regression, the plan is to run the same tests on version 4.8.2. At the same time, I will try to debug the environment once the first syscollector scan test fails.

Currently on hold in favor of https://github.com/wazuh/wazuh-jenkins/issues/6910, due to the limitations of the macOS instances.

Rebits commented 2 weeks ago
Rebits commented 1 week ago

According to the results for 4.8.2 and 4.9.0, the issue seems to be present in both versions. However, this error was not detected in 4.8.1: https://github.com/wazuh/wazuh/issues/24594. No change in 4.8.2 can justify this discrepancy.

Based on the following evidence, we can determine:

Currently provisioning an environment with only agent1 to run several analyses of the test and the indexer connector: https://ci.wazuh.info/job/Test_e2e_system/361/

In addition, to fully confirm whether this is a regression, the tests will also be run on 4.8.1: https://ci.wazuh.info/job/Test_e2e_system/362/

Rebits commented 1 week ago

To test this development along with https://github.com/wazuh/wazuh-qa/issues/5698, the branch tmp-fixes-4.9.0 was created, which contains both branches.

I am currently testing over 4.9.0-rc2. Build: https://ci.wazuh.info/job/Test_e2e_system/366/

Rebits commented 1 week ago

Conclusion

These tests were failing because of the indexer's limited result window, which defaults to 10,000. With previous versions of the feeds, fewer than 10,000 vulnerabilities were detected in the environment. As this number grew past that limit, the tests began failing, particularly for agents with a higher number of vulnerabilities (e.g., CentOS 7 agents).

To address this issue, it has been proposed to increase the maximum result window before pulling the vulnerabilities.
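
As an illustration of what raising the window could look like, here is a minimal sketch that assumes the standard `index.max_result_window` index setting, updated through the `PUT /<index>/_settings` endpoint. The indexer address, the index pattern, and the new limit are placeholders chosen for the example, not values taken from the actual fix. The request is only built here, not sent.

```python
import json
import urllib.request


def build_max_result_window_request(base_url, index, max_result_window):
    """Build (but do not send) a PUT <index>/_settings request.

    Hypothetical helper for illustration; not the wazuh-qa implementation.
    """
    body = json.dumps({"index": {"max_result_window": max_result_window}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_max_result_window_request(
    "https://localhost:9200",          # placeholder indexer address
    "wazuh-states-vulnerabilities-*",  # assumed index pattern
    50000,                             # illustrative value only
)
```

Actually sending the request would also require the indexer credentials and TLS settings of the test environment, which are omitted here.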

Build: https://ci.wazuh.info/job/Test_e2e_system/366/

jseg380 commented 1 week ago

Asked some questions in the comments: https://github.com/wazuh/wazuh-qa/pull/5712#discussion_r1743568365

jseg380 commented 1 week ago

Questions resolved successfully. LGTM