pro-akim opened this issue 1 month ago
Running the reported workflow file, the workflow failed with this error:
```
[2024-04-11 11:34:22] [ERROR] [57744] [ThreadPoolExecutor-0_0] [workflow_engine]: [run-agent-linux-ubuntu-18.04-amd64-tests] Task failed with error: Error executing process task
Traceback (most recent call last):
  File "/home/marcelo/wazuh/wazuh-qa/deployability/modules/testing/main.py", line 30, in <module>
    Tester.run(InputPayload(**vars(parse_arguments())))
  File "/home/marcelo/wazuh/wazuh-qa/deployability/modules/testing/testing.py", line 53, in run
    extra_vars['current_user'] = os.getlogin()
OSError: [Errno 6] No such device or address
```
I reproduced the problem in a Python shell:

```python
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> print(os.getlogin())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 6] No such device or address
>>> import getpass
>>> getpass.getuser()
'marcelo'
```
`os.getlogin()` returns the name of the user logged in on the controlling terminal of the process. Processes in the user's session (tty, X session) typically have a controlling terminal, but processes spawned by the workflow do not. The recommended way to obtain the current user is `getpass.getuser()`.

After replacing `os.getlogin` with `getpass.getuser`, I reran the workflow file. This time the workflow did not raise the exception, but it got stuck executing the test for the ubuntu-22.04 agent.
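The replacement can be sketched as follows (a minimal illustration of the standard-library behavior, not the actual patch; the `current_user` helper name is mine):

```python
import getpass
import os

def current_user() -> str:
    """Return the invoking user's name.

    os.getlogin() queries the controlling terminal, so it raises
    OSError (errno 6, ENXIO) in processes spawned without one, such
    as workflow tasks. getpass.getuser() checks the LOGNAME, USER,
    LNAME and USERNAME environment variables and then falls back to
    the password database, so it works in both situations.
    """
    try:
        return os.getlogin()
    except OSError:
        return getpass.getuser()
```

In the testing module, the fix amounts to assigning `extra_vars['current_user'] = getpass.getuser()` instead of calling `os.getlogin()`.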
The workflow log file shows an authentication error. The virtual machine was hanging, but the workflow did not throw an exception. After pressing CTRL-C, the workflow aborted the task and continued with the next one.

```
[2024-04-11 12:35:14,322] [ERROR] [Testing]: Authentication error. Check SSH credentials in ubuntu-22.04
[2024-04-11 13:50:44,198] [ERROR] [81388] [MainThread] [workflow_engine]: User interrupt detected. End process...
```
I modified the workflow file, changing the infrastructure provider from Vagrant to AWS, and could not reproduce the issue reported by @pro-akim. Note that I didn't modify the provision module, nor did I add a delay at the start of the `Provision.run` method.
I've modified the original vagrant test, keeping only two agents in the agent list. I've also turned off the cleanup section to keep the VMs running after finishing the workflow execution.
```
variables:
  agent-os:
    - linux-ubuntu-18.04-amd64
    - linux-ubuntu-20.04-amd64
  manager-os: linux-ubuntu-22.04-amd64
  infra-provider: vagrant
  working-dir: /tmp/dtt1-poc
```
I've reproduced the error reported by @pro-akim. In the workflow.log file, this message shows the provisioning error:
```
[2024-04-12 12:52:51,761] [INFO] [Testing]: Getting status of ubuntu-22.04
[2024-04-12 12:52:52,127] [ERROR] [Testing]: agent-linux-ubuntu-2204-amd64 is not present in agent_control information
[2024-04-12 12:52:52,680] [DEBUG] ANSIBLE: Playbook [{'hosts': 'localhost', 'become': True, 'become_user': 'marcelo', 'tasks': [{'name': 'Test restart for agent', 'command': "python3 -m pytest modules/testing/tests/test_agent/test_restart.py -v --wazuh_version=4.7.3 --wazuh_revision=40714 --component=agent --dependencies='{}' --targets='{wazuh-1: /tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/inventory.yaml, agent: /tmp/dtt1-poc/agent-linux-
```
These entries in the Wazuh manager's ossec.log file show the problem:
```
2024/04/12 17:04:16 wazuh-authd: ERROR: Invalid agent name ubuntu-jammy (same as manager)
2024/04/12 17:05:16 wazuh-authd: INFO: New connection from 192.168.57.4
2024/04/12 17:05:16 wazuh-authd: INFO: Received request for a new agent (ubuntu-jammy) from: 192.168.57.4
2024/04/12 17:05:16 wazuh-authd: ERROR: Invalid agent name ubuntu-jammy (same as manager)
2024/04/12 17:06:16 wazuh-authd: INFO: New connection from 192.168.57.4
2024/04/12 17:06:16 wazuh-authd: INFO: Received request for a new agent (ubuntu-jammy) from: 192.168.57.4
2024/04/12 17:06:16 wazuh-authd: ERROR: Invalid agent name ubuntu-jammy (same as manager)
2024/04/12 17:07:16 wazuh-authd: INFO: New connection from 192.168.57.4
2024/04/12 17:07:16 wazuh-authd: INFO: Received request for a new agent (ubuntu-jammy) from: 192.168.57.4
2024/04/12 17:07:16 wazuh-authd: ERROR: Invalid agent name ubuntu-jammy (same as manager)
```
The provisioning fails because the manager and the agent have the same hostname. The hostname assigned by the allocator is the default hostname of the VM image, so allocating two VMs from the same image duplicates hostnames; this must be avoided.
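One way to avoid the collision (a hypothetical sketch, not the allocator's actual code) is to derive a unique hostname for each allocated VM instead of keeping the image default:

```python
import uuid

def unique_hostname(composite_name: str) -> str:
    """Build a hostname from the composite name plus a short random
    suffix, so two VMs created from the same image never collide.
    Dots are replaced because they act as domain separators, and
    the base is truncated to keep the label within 63 characters."""
    base = composite_name.replace('.', '-')[:54].rstrip('-')
    return f"{base}-{uuid.uuid4().hex[:8]}"
```

Any scheme works as long as each allocation gets a distinct name; the important point is that wazuh-authd rejects an agent whose name equals the manager's.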
It would be useful to know the criteria the @wazuh/devel-devops team will use for naming the agents, so the testing module can apply the same nomenclature.
@mhamra please change the status to blocked until https://github.com/wazuh/wazuh-qa/issues/5214 is completed
@fcaffieri After talking with @davidjiglesias, we will move this issue from High impact bug to DTT2 as a bug (as it depends on the DevOps issue)
Having the same OS on the agent and the manager causes instability in agent provisioning: sometimes the agent is not installed at all, and sometimes it is installed but does not connect to the manager.
Running the restart test (https://github.com/wazuh/wazuh-qa/issues/5125#issuecomment-2042784252) with this YAML file:
```
version: 0.1
description: This workflow is used to test agents deployment for DDT1 PoC
variables:
  agent-os:
    - linux-ubuntu-18.04-amd64
    - linux-ubuntu-20.04-amd64
    - linux-ubuntu-22.04-amd64
    - linux-debian-10-amd64
    - linux-debian-11-amd64
    - linux-debian-12-amd64
    - linux-oracle-9-amd64
  manager-os: linux-ubuntu-22.04-amd64
  infra-provider: vagrant
  working-dir: /tmp/dtt1-poc

tasks:
  # Unique manager allocate task
  - task: "allocate-manager-{manager-os}"
    description: "Allocate resources for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: large
          - composite-name: "{manager-os}"
          - inventory-output: "{working-dir}/manager-{manager-os}/inventory.yaml"
          - track-output: "{working-dir}/manager-{manager-os}/track.yaml"
    cleanup:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: delete
          - track-output: "{working-dir}/manager-{manager-os}/track.yaml"

  # Unique agent allocate task
  - task: "allocate-agent-{agent}"
    description: "Allocate resources for the agent."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: small
          - composite-name: "{agent}"
          - inventory-output: "{working-dir}/agent-{agent}/inventory.yaml"
          - track-output: "{working-dir}/agent-{agent}/track.yaml"
    foreach:
      - variable: agent-os
        as: agent
    cleanup:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: delete
          - track-output: "{working-dir}/agent-{agent}/track.yaml"

  # Unique manager provision task
  - task: "provision-manager-{manager-os}"
    description: "Provision the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/provision/main.py
          - inventory: "{working-dir}/manager-{manager-os}/inventory.yaml"
          - install:
              - component: wazuh-manager
                type: assistant
                version: 4.7.3
                live: True
    depends-on:
      - "allocate-manager-{manager-os}"

  # Generic agent provision task
  - task: "provision-install-{agent}"
    description: "Provision resources for the {agent} agent."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/provision/main.py
          - inventory: "{working-dir}/agent-{agent}/inventory.yaml"
          - dependencies:
              - manager: "{working-dir}/manager-{manager-os}/inventory.yaml"
          - install:
              - component: wazuh-agent
                type: package
                version: 4.7.3
                live: True
    depends-on:
      - "allocate-agent-{agent}"
      - "provision-manager-{manager-os}"
    foreach:
      - variable: agent-os
        as: agent

  # Generic agent test task
  - task: "run-agent-{agent}-tests"
    description: "Run tests install for the agent {agent}."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/testing/main.py
          - targets:
              - wazuh-1: "{working-dir}/manager-{manager-os}/inventory.yaml"
              - agent: "{working-dir}/agent-{agent}/inventory.yaml"
          - tests: "restart"
          - component: "agent"
          - wazuh-version: "4.7.3"
          - wazuh-revision: "40714"
          - live: "True"
    foreach:
      - variable: agent-os
        as: agent
    depends-on:
      - "provision-install-{agent}"
```

The following error was found:
This error happens because, when the agent is not installed (the client.keys file is absent), the test module takes the OS name, removes the dots, and uses the result as the agent's name in the validation.
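The fallback naming described above can be illustrated like this (a sketch reconstructed from the log messages; `expected_agent_name` is a hypothetical helper, not the module's actual code):

```python
def expected_agent_name(os_name: str) -> str:
    """When no registered agent is found, derive the agent name from
    the OS name with the dots removed, matching the workflow log
    entry 'agent-linux-ubuntu-2204-amd64 is not present in
    agent_control information'."""
    return "agent-" + os_name.replace(".", "")
```

For `linux-ubuntu-22.04-amd64` this yields `agent-linux-ubuntu-2204-amd64`, which never matches any name wazuh-authd accepted, so the validation fails.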
On the other hand, running the uninstall test (https://github.com/wazuh/wazuh-qa/issues/5125#issuecomment-2042784252) with this YAML file:
```
version: 0.1
description: This workflow is used to test agents deployment for DDT1 PoC
variables:
  agent-os:
    - linux-ubuntu-18.04-amd64
    - linux-ubuntu-20.04-amd64
    - linux-ubuntu-22.04-amd64
    - linux-debian-10-amd64
    - linux-debian-11-amd64
    - linux-debian-12-amd64
    - linux-oracle-9-amd64
  manager-os: linux-ubuntu-22.04-amd64
  infra-provider: vagrant
  working-dir: /tmp/dtt1-poc

tasks:
  # Unique manager allocate task
  - task: "allocate-manager-{manager-os}"
    description: "Allocate resources for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: large
          - composite-name: "{manager-os}"
          - inventory-output: "{working-dir}/manager-{manager-os}/inventory.yaml"
          - track-output: "{working-dir}/manager-{manager-os}/track.yaml"
    cleanup:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: delete
          - track-output: "{working-dir}/manager-{manager-os}/track.yaml"

  # Unique agent allocate task
  - task: "allocate-agent-{agent}"
    description: "Allocate resources for the agent."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: small
          - composite-name: "{agent}"
          - inventory-output: "{working-dir}/agent-{agent}/inventory.yaml"
          - track-output: "{working-dir}/agent-{agent}/track.yaml"
    foreach:
      - variable: agent-os
        as: agent
    cleanup:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: delete
          - track-output: "{working-dir}/agent-{agent}/track.yaml"

  # Unique manager provision task
  - task: "provision-manager-{manager-os}"
    description: "Provision the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/provision/main.py
          - inventory: "{working-dir}/manager-{manager-os}/inventory.yaml"
          - install:
              - component: wazuh-manager
                type: assistant
                version: 4.7.3
                live: True
    depends-on:
      - "allocate-manager-{manager-os}"

  # Generic agent provision task
  - task: "provision-install-{agent}"
    description: "Provision resources for the {agent} agent."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/provision/main.py
          - inventory: "{working-dir}/agent-{agent}/inventory.yaml"
          - dependencies:
              - manager: "{working-dir}/manager-{manager-os}/inventory.yaml"
          - install:
              - component: wazuh-agent
                type: package
                version: 4.7.3
                live: True
    depends-on:
      - "allocate-agent-{agent}"
      - "provision-manager-{manager-os}"
    foreach:
      - variable: agent-os
        as: agent

  # Generic agent test task
  - task: "run-agent-{agent}-tests"
    description: "Run tests install for the agent {agent}."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/testing/main.py
          - targets:
              - wazuh-1: "{working-dir}/manager-{manager-os}/inventory.yaml"
              - agent: "{working-dir}/agent-{agent}/inventory.yaml"
          - tests: "uninstall"
          - component: "agent"
          - wazuh-version: "4.7.3"
          - wazuh-revision: "40714"
          - live: "True"
    foreach:
      - variable: agent-os
        as: agent
    depends-on:
      - "provision-install-{agent}"
```

it was possible to find the following:
The agent was installed, but it did not connect to the manager. This instability can arise from naming conflicts in the workflow/provision modules or in Wazuh itself when two hosts share the same hostname. Further research should be done.