wazuh / wazuh-qa

Wazuh - Quality Assurance
GNU General Public License v2.0

DTT1-The allocator assigns the same IP address to many instances launching tasks in parallel #5237

Closed mhamra closed 1 week ago

mhamra commented 2 weeks ago
| Target version | Related issue | Related PR/dev branch |
|---|---|---|
| 4.9.0 | | 4495-dtt1-release |

Description

Running this workflow file with the `--threads 3` option:

workflow.yaml

```yaml
version: 0.1
description: This workflow is used to test manager deployment for DDT1 PoC
variables:
  manager-os:
    - linux-ubuntu-20.04-amd64
    - linux-ubuntu-22.04-amd64
    - linux-oracle-9-amd64
    # - linux-amazon-2-amd64
    # - linux-redhat-7-amd64
    # - linux-redhat-8-amd64
    # - linux-redhat-9-amd64
    # - linux-centos-7-amd64
    # - linux-centos-8-amd64
    # - linux-debian-10-amd64
    # - linux-debian-11-amd64
    # - linux-debian-12-amd64
  infra-provider: vagrant
  working-dir: /tmp/dtt1-poc

tasks:
  # Unique manager allocate task
  - task: "allocate-manager-{manager}"
    description: "Allocate resources for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: large
          - composite-name: "{manager}"
          - inventory-output: "{working-dir}/manager-{manager}/inventory.yaml"
          - track-output: "{working-dir}/manager-{manager}/track.yaml"
    on-error: "abort-all"
    foreach:
      - variable: manager-os
        as: manager

  # Generic manager test task
  - task: "run-manager-tests"
    description: "Run tests install for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/testing/main.py
          - targets:
              - wazuh-1: "{working-dir}/manager-linux-ubuntu-20.04-amd64/inventory.yaml"
              - wazuh-2: "{working-dir}/manager-linux-ubuntu-22.04-amd64/inventory.yaml"
              - wazuh-3: "{working-dir}/manager-linux-oracle-9-amd64/inventory.yaml"
              # - wazuh-4: "{working-dir}/manager-linux-centos-7-amd64/inventory.yaml"
              # - wazuh-5: "{working-dir}/manager-linux-amazon-2-amd64/inventory.yaml"
              # - wazuh-6: "{working-dir}/manager-linux-redhat-7-amd64/inventory.yaml"
              # - wazuh-7: "{working-dir}/manager-linux-redhat-8-amd64/inventory.yaml"
              # - wazuh-8: "{working-dir}/manager-linux-redhat-9-amd64/inventory.yaml"
              # - wazuh-9: "{working-dir}/manager-linux-centos-8-amd64/inventory.yaml"
              # - wazuh-10: "{working-dir}/manager-linux-debian-10-amd64/inventory.yaml"
              # - wazuh-11: "{working-dir}/manager-linux-debian-11-amd64/inventory.yaml"
              # - wazuh-12: "{working-dir}/manager-linux-debian-12-amd64/inventory.yaml"
          - tests: "install,restart,stop,uninstall"
          - component: "manager"
          - wazuh-version: "4.7.3"
          - wazuh-revision: "40714"
          - live: "True"
    depends-on:
      - "allocate-manager-linux-ubuntu-20.04-amd64"
      - "allocate-manager-linux-ubuntu-22.04-amd64"
      - "allocate-manager-linux-oracle-9-amd64"
```

The testing module logs show that all the instances have the same IP address, which causes the SSH login to fail.

[2024-04-17 12:17:09,374] [INFO] TESTER: Running tests for 192.168.57.2
[2024-04-17 12:17:09,375] [INFO] TESTER: Running tests for 192.168.57.2
[2024-04-17 12:17:09,375] [INFO] TESTER: Running tests for 192.168.57.2

workflow.log

```
marcelo@marcelo-B460-AORUS-PRO-AC:~/wazuh/wazuh-qa$ tail -f /tmp/workflow.log
[2024-04-17 12:15:59,960] [INFO] [744222] [MainThread] [workflow_engine]: Executing DAG tasks.
[2024-04-17 12:15:59,960] [INFO] [744222] [MainThread] [workflow_engine]: Executing tasks in parallel.
[2024-04-17 12:15:59,960] [INFO] [744222] [ThreadPoolExecutor-0_0] [workflow_engine]: [allocate-manager-linux-ubuntu-20.04-amd64] Starting task.
[2024-04-17 12:15:59,961] [INFO] [744222] [ThreadPoolExecutor-0_1] [workflow_engine]: [allocate-manager-linux-ubuntu-22.04-amd64] Starting task.
[2024-04-17 12:15:59,961] [INFO] [744222] [ThreadPoolExecutor-0_2] [workflow_engine]: [allocate-manager-linux-oracle-9-amd64] Starting task.
[2024-04-17 12:16:00,188] [INFO] ALLOCATOR: Creating instance at /tmp/wazuh-qa
[2024-04-17 12:16:00,189] [DEBUG] ALLOCATOR: No config provided. Generating from payload
[2024-04-17 12:16:00,189] [DEBUG] ALLOCATOR: Generating new key pair
[2024-04-17 12:16:00,192] [INFO] ALLOCATOR: Creating instance at /tmp/wazuh-qa
[2024-04-17 12:16:00,193] [DEBUG] ALLOCATOR: No config provided. Generating from payload
[2024-04-17 12:16:00,193] [DEBUG] ALLOCATOR: Generating new key pair
[2024-04-17 12:16:00,214] [INFO] ALLOCATOR: Creating instance at /tmp/wazuh-qa
[2024-04-17 12:16:00,214] [DEBUG] ALLOCATOR: No config provided. Generating from payload
[2024-04-17 12:16:00,214] [DEBUG] ALLOCATOR: Generating new key pair
[2024-04-17 12:16:03,600] [DEBUG] ALLOCATOR: Vagrantfile created. Creating instance.
[2024-04-17 12:16:03,600] [INFO] ALLOCATOR: Instance VAGRANT-F72E5B3B-11D7-46DE-A8B3-8557E9D5F6F3 created.
[2024-04-17 12:16:03,601] [DEBUG] ALLOCATOR: Vagrantfile created. Creating instance.
[2024-04-17 12:16:03,602] [INFO] ALLOCATOR: Instance VAGRANT-2F9CB804-CC23-4972-BADF-4DC8A2403068 created.
[2024-04-17 12:16:03,902] [DEBUG] ALLOCATOR: Vagrantfile created. Creating instance.
[2024-04-17 12:16:03,902] [INFO] ALLOCATOR: Instance VAGRANT-2504FAA1-F821-46EE-B8A9-86E283308F39 created.
[2024-04-17 12:16:59,745] [INFO] ALLOCATOR: Instance VAGRANT-2504FAA1-F821-46EE-B8A9-86E283308F39 started.
[2024-04-17 12:17:02,008] [INFO] ALLOCATOR: Inventory file generated at /tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/inventory.yaml
[2024-04-17 12:17:02,887] [INFO] ALLOCATOR: Instance VAGRANT-F72E5B3B-11D7-46DE-A8B3-8557E9D5F6F3 started.
[2024-04-17 12:17:04,281] [INFO] ALLOCATOR: Track file generated at /tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/track.yaml
[2024-04-17 12:17:04,330] [INFO] [744222] [ThreadPoolExecutor-0_1] [workflow_engine]: [allocate-manager-linux-ubuntu-22.04-amd64] Finished task in 64.37 seconds.
[2024-04-17 12:17:04,386] [INFO] ALLOCATOR: Instance VAGRANT-2F9CB804-CC23-4972-BADF-4DC8A2403068 started.
[2024-04-17 12:17:05,148] [INFO] ALLOCATOR: Inventory file generated at /tmp/dtt1-poc/manager-linux-ubuntu-20.04-amd64/inventory.yaml
[2024-04-17 12:17:06,746] [INFO] ALLOCATOR: Inventory file generated at /tmp/dtt1-poc/manager-linux-oracle-9-amd64/inventory.yaml
[2024-04-17 12:17:07,390] [INFO] ALLOCATOR: Track file generated at /tmp/dtt1-poc/manager-linux-ubuntu-20.04-amd64/track.yaml
[2024-04-17 12:17:07,446] [INFO] [744222] [ThreadPoolExecutor-0_0] [workflow_engine]: [allocate-manager-linux-ubuntu-20.04-amd64] Finished task in 67.49 seconds.
[2024-04-17 12:17:09,054] [INFO] ALLOCATOR: Track file generated at /tmp/dtt1-poc/manager-linux-oracle-9-amd64/track.yaml
[2024-04-17 12:17:09,101] [INFO] [744222] [ThreadPoolExecutor-0_2] [workflow_engine]: [allocate-manager-linux-oracle-9-amd64] Finished task in 69.13 seconds.
[2024-04-17 12:17:09,116] [INFO] [744222] [ThreadPoolExecutor-0_1] [workflow_engine]: [run-manager-tests] Starting task.
[2024-04-17 12:17:09,374] [INFO] TESTER: Running tests for 192.168.57.2
[2024-04-17 12:17:09,375] [INFO] TESTER: Running tests for 192.168.57.2
[2024-04-17 12:17:09,375] [INFO] TESTER: Running tests for 192.168.57.2
[2024-04-17 12:17:09,375] [DEBUG] TESTER: Using extra vars: {'component': 'manager', 'wazuh_version': '4.7.3', 'wazuh_revision': '40714', 'wazuh_branch': None, 'working_dir': '/tmp/tests', 'live': True, 'hosts_ip': ['192.168.57.2', '192.168.57.2', '192.168.57.2'], 'targets': '{wazuh-1: /tmp/dtt1-poc/manager-linux-ubuntu-20.04-amd64/inventory.yaml, wazuh-2: /tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/inventory.yaml, wazuh-3: /tmp/dtt1-poc/manager-linux-oracle-9-amd64/inventory.yaml}', 'dependencies': '{}', 'local_host_path': '/home/marcelo/wazuh/wazuh-qa/deployability', 'current_user': 'marcelo'}
[2024-04-17 12:17:09,377] [DEBUG] TESTER: Rendering template /home/marcelo/wazuh/wazuh-qa/deployability/modules/testing/playbooks/setup.yml
[2024-04-17 12:17:09,385] [DEBUG] TESTER: Using inventory: {'all': {'hosts': {'192.168.57.2': {'ansible_port': 22, 'ansible_user': 'vagrant', 'ansible_ssh_private_key_file': '/tmp/wazuh-qa/VAGRANT-F72E5B3B-11D7-46DE-A8B3-8557E9D5F6F3/instance_key'}}}}
[2024-04-17 12:17:09,385] [DEBUG] TESTER: Running playbook: [{'hosts': 'localhost', 'become': True, 'become_user': 'marcelo', 'tasks': [{'name': 'Cleaning old key ssh-keygen registries', 'ansible.builtin.command': {'cmd': "ssh-keygen -f /home/marcelo/.ssh/known_hosts -R ''"}, 'loop': ['192.168.57.2', '192.168.57.2', '192.168.57.2']}]}]
[2024-04-17 12:17:10,984] [DEBUG] TESTER: Playbook [{'hosts': 'localhost', 'become': True, 'become_user': 'marcelo', 'tasks': [{'name': 'Cleaning old key ssh-keygen registries', 'ansible.builtin.command': {'cmd': "ssh-keygen -f /home/marcelo/.ssh/known_hosts -R ''"}, 'loop': ['192.168.57.2', '192.168.57.2', '192.168.57.2']}]}] finished with status {'skipped': {}, 'ok': {'localhost': 2}, 'dark': {}, 'failures': {}, 'ignored': {}, 'rescued': {}, 'processed': {'localhost': 1}, 'changed': {'localhost': 1}}
[2024-04-17 12:17:10,986] [DEBUG] TESTER: Rendering template /home/marcelo/wazuh/wazuh-qa/deployability/modules/testing/playbooks/test.yml
[2024-04-17 12:17:10,987] [DEBUG] TESTER: Using inventory: {'all': {'hosts': {'192.168.57.2': {'ansible_port': 22, 'ansible_user': 'vagrant', 'ansible_ssh_private_key_file': '/tmp/wazuh-qa/VAGRANT-F72E5B3B-11D7-46DE-A8B3-8557E9D5F6F3/instance_key'}}}}
[2024-04-17 12:17:10,987] [DEBUG] TESTER: Running playbook: [{'hosts': 'localhost', 'become': True, 'become_user': 'marcelo', 'tasks': [{'name': 'Test install for manager', 'command': "python3 -m pytest modules/testing/tests/test_manager/test_install.py -v --wazuh_version=4.7.3 --wazuh_revision=40714 --component=manager --dependencies='{}' --targets='{wazuh-1: /tmp/dtt1-poc/manager-linux-ubuntu-20.04-amd64/inventory.yaml, wazuh-2: /tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/inventory.yaml, wazuh-3: /tmp/dtt1-poc/manager-linux-oracle-9-amd64/inventory.yaml}' --live=True -s", 'args': {'chdir': '/home/marcelo/wazuh/wazuh-qa/deployability'}}]}]
[2024-04-17 12:17:12,591] [INFO] TESTER: Checking connection to ubuntu-20.04
[2024-04-17 12:17:12,853] [ERROR] TESTER: Authentication error. Check SSH credentials in ubuntu-20.04
```

These are the inventory files generated by the allocator:

inventory.zip

All of them have the same IP address.

c-bordon commented 2 weeks ago

This error only affects local deployments with Vagrant, since IP availability validation is done for this type of deployment. I understand that this was originally done to manage which IPs are assigned to the local machine and then configure it in the inventory.yml file to be able to connect to the VM.

The problem occurs because the Allocator module currently handles the deployment of individual machines. When it runs in 3 threads, each execution checks the availability of the first IP, 192.168.57.2, before any of the others has configured it. Every check reports the IP as free, so each execution proceeds to try to configure it.
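
The race can be reproduced in isolation with a short sketch. This is not the Allocator's code: `first_free_ip`, `allocate` and the in-memory `configured_ips` set are hypothetical stand-ins, and a barrier is used only to force the timing seen in the logs (every check finishes before any address is configured).

```python
import threading

NUM_ALLOCATIONS = 3
configured_ips = set()  # stand-in for the IPs already in use on the host
all_checked = threading.Barrier(NUM_ALLOCATIONS)


def first_free_ip(prefix="192.168.57."):
    """Return the first address in the range that is not configured yet."""
    for last_octet in range(2, 255):
        candidate = f"{prefix}{last_octet}"
        if candidate not in configured_ips:
            return candidate
    raise RuntimeError("no free address in the range")


def allocate(results, index):
    ip = first_free_ip()    # check: nothing is configured yet, so .2 looks free
    all_checked.wait()      # every allocation finishes its check first
    configured_ips.add(ip)  # act: the address is only taken at this point
    results[index] = ip


results = [None] * NUM_ALLOCATIONS
threads = [
    threading.Thread(target=allocate, args=(results, i))
    for i in range(NUM_ALLOCATIONS)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results)
```

With this timing all three entries in `results` are 192.168.57.2, which matches the `hosts_ip` list in the tester log.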

The fix I propose is to remove this method entirely and configure the Vagrant VM network with DHCP, so that Vagrant and VirtualBox are in charge of assigning the IP and the Allocator no longer has that responsibility. We can then read the IP of the VM and write it to the inventory.yml file.

c-bordon commented 1 week ago

Update report

I tested the change of approach we had discussed, letting Vagrant and VirtualBox take care of configuring the IP, but it causes some problems: we still need to obtain that IP to write it to our inventory.yml. The vagrant ssh-config command is not useful for this, because with Vagrant and VirtualBox the HostName it reports is 127.0.0.1. To obtain the IP that belongs to the local network we therefore have to run a command through vagrant ssh, but for some reason this query prompts for a password, which should not happen since the VM is configured with a private key:

cbordon@cbordon-MS-7C88:~/Documents/wazuh/repositorios/wazuh-qa$ python3 deployability/modules/allocation/main.py --provider vagrant --size micro --composite-name linux-ubuntu-20.04-amd64
[2024-04-22 14:40:08] [INFO] ALLOCATOR: Creating instance at /tmp/wazuh-qa
[2024-04-22 14:40:08] [DEBUG] ALLOCATOR: No config provided. Generating from payload
[2024-04-22 14:40:08] [DEBUG] ALLOCATOR: Generating new key pair
[2024-04-22 14:40:09] [DEBUG] ALLOCATOR: Vagrantfile created. Creating instance.
[2024-04-22 14:40:09] [INFO] ALLOCATOR: Instance VAGRANT-74FAB66C-614B-430D-AC0C-8047C9229FE9 created.
[2024-04-22 14:40:50] [INFO] ALLOCATOR: Instance VAGRANT-74FAB66C-614B-430D-AC0C-8047C9229FE9 started.
vagrant@127.0.0.1's password: 
vagrant@127.0.0.1's password: 

I looked for alternatives but could not find a solution. This happens with different boxes.

c-bordon commented 1 week ago

Update report

After validating and testing different alternatives, each of them offers a possible solution but also has various problems. I will try to summarize here all the options that were considered:

Control file

We first tried to solve this with a control file. The idea is to create a file that records the IPs in use, and to have the threads lock it while reading and writing so that the same IP cannot be assigned to two or more machines. The problem with this approach is where to keep the control file. The working directory is configurable and variable. The VM directory is not valid either, since it holds the information of a single VM. The closest option to what we need is the directory where the module itself is located, but nothing prevents copies of the module from living in, and being executed from, different directories, so we would lose control there too. In addition, the execution threads do not belong to the Allocator but to the workflow engine, which prevents the use of Python thread locks.
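
For reference, the control-file idea can be sketched with an OS-level file lock, which works across processes and does not depend on who owns the threads. This is only an illustration under assumptions: `reserve_ip` and the `allocated_ips.txt` file name are hypothetical, `fcntl.flock` is available on Linux and macOS only, and the sketch does not solve the question of where the file should live.

```python
import fcntl
import os
import tempfile


def reserve_ip(control_path, prefix="192.168.57."):
    """Reserve the first address that is not yet listed in the control file."""
    fd = os.open(control_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as control:
        # Blocks until no other process or thread holds the lock on this file.
        fcntl.flock(control, fcntl.LOCK_EX)
        try:
            used = set(control.read().split())
            for last_octet in range(2, 255):
                candidate = f"{prefix}{last_octet}"
                if candidate not in used:
                    control.seek(0, os.SEEK_END)
                    control.write(candidate + "\n")
                    control.flush()
                    return candidate
            raise RuntimeError("no free address left in the range")
        finally:
            fcntl.flock(control, fcntl.LOCK_UN)


# Three consecutive reservations against the same (temporary) control file.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "allocated_ips.txt")
    reserved = [reserve_ip(path) for _ in range(3)]

print(reserved)
```

Because the check and the write happen while the lock is held, each call sees the reservations made by the previous ones. Releasing an address when a VM is destroyed is left out of the sketch.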

The same approach that we use in macStadium

At first we thought the solution was simple: let Vagrant and VirtualBox take care of assigning the IP, and then obtain it with the vagrant ssh-config command. However, we found that the IP this command returns is 127.0.0.1, so it is of no use to us:

cbordon@cbordon-MS-7C88:/tmp/wazuh-qa/VAGRANT-A3AEAF0B-8EDA-4B11-8330-E553AEC34823$ vagrant ssh-config
Host default
  HostName 127.0.0.1
  User vagrant
  Port 22
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /tmp/wazuh-qa/VAGRANT-A3AEAF0B-8EDA-4B11-8330-E553AEC34823/instance_key
  IdentitiesOnly yes
  LogLevel FATAL
  ForwardAgent yes
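
A minimal sketch of reading this output from Python shows why it does not help. The parser below is hypothetical (`parse_ssh_config` is not part of the Allocator) and simply splits each line into a key and a value; the sample text is the output above.

```python
import ipaddress

SSH_CONFIG_OUTPUT = """\
Host default
  HostName 127.0.0.1
  User vagrant
  Port 22
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /tmp/wazuh-qa/VAGRANT-A3AEAF0B-8EDA-4B11-8330-E553AEC34823/instance_key
  IdentitiesOnly yes
  LogLevel FATAL
  ForwardAgent yes
"""


def parse_ssh_config(output):
    """Turn 'Key value' lines into a dictionary (the last value wins)."""
    settings = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            key, value = parts
            settings[key] = value.strip()
    return settings


settings = parse_ssh_config(SSH_CONFIG_OUTPUT)
host_is_loopback = ipaddress.ip_address(settings["HostName"]).is_loopback

print(settings["HostName"], host_is_loopback)
```

The reported host is a loopback address, so it identifies the forwarded port on the host machine and not the VM's address on the private network.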

After detecting this, we tried to obtain the private network IP from inside the VM. We first tried vagrant ssh, without success. Another alternative was to run an SSH command against 127.0.0.1 on the port that Vagrant exposes on the host; this way we connected through SSH and obtained the network information. However, not all boxes provide the ip address or ifconfig commands, so the query has to be tailored to each box. The network interface name is not always the same either, so finding the private IP is not straightforward. Searching for an address of the form 192.168.xxx.xxx is no guarantee, since some users may have a different (custom) private network. This alternative therefore also has several difficulties.

ssh -o 'StrictHostKeyChecking no' -i /tmp/wazuh-qa/VAGRANT-D0EE6388-19D0-45C9-94D3-E53A23CAD916/instance_key -p 2201 vagrant@127.0.0.1 ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 02:60:50:11:b6:59 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
       valid_lft 86393sec preferred_lft 86393sec
    inet6 fe80::60:50ff:fe11:b659/64 scope link 
       valid_lft forever preferred_lft forever
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 08:00:27:60:e3:39 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.40/24 brd 192.168.56.255 scope global dynamic enp0s8
       valid_lft 593sec preferred_lft 593sec
    inet6 fe80::a00:27ff:fe60:e339/64 scope link 
       valid_lft forever preferred_lft forever
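
To make the difficulty concrete, here is a rough sketch that extracts the IPv4 addresses per interface from output like the one above. The function name `ipv4_by_interface` is hypothetical, the sample is trimmed to the interface and `inet` lines, and the regular expressions assume the `ip address` layout shown here.

```python
import re

IP_ADDRESS_OUTPUT = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet 192.168.56.40/24 brd 192.168.56.255 scope global dynamic enp0s8
"""


def ipv4_by_interface(output):
    """Map each interface name to the IPv4 addresses listed under it."""
    addresses = {}
    current = None
    for line in output.splitlines():
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            current = header.group(1)
            continue
        inet = re.match(r"^\s+inet\s+(\d+\.\d+\.\d+\.\d+)/\d+", line)
        if inet and current:
            addresses.setdefault(current, []).append(inet.group(1))
    return addresses


found = ipv4_by_interface(IP_ADDRESS_OUTPUT)
candidates = {name: ips for name, ips in found.items() if name != "lo"}

print(candidates)
```

Even after dropping the loopback interface, two candidates remain (the NAT address on enp0s3 and the private network address on enp0s8), and nothing in the output itself says which one is the right one for every box and every user network.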

Random IP

After discussing it with @fcaffieri and @teddytpc1, we understand that this problem only occurs in local deployments with Vagrant. Given the hardware resources usually available on a workstation, deploying many virtual machines on the same machine is normally not possible (a workstation with 32 GB of RAM generally cannot run more than 5 VMs). We therefore believe that randomly assigning the last octet of the IP address can be a solution. It greatly reduces the chance of two machines getting the same IP, although the possibility still exists. We understand that it is the solution that best fits what we need, considering that we are not going to build many machines locally and that there are 254 possible addresses.
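
As a rough sketch of this approach and of the remaining risk: the helper names below are hypothetical, and the sketch assumes that the last octet is drawn uniformly and independently from 254 values (1 to 254), which may differ from the range the Allocator ends up using.

```python
import random


def random_ip(prefix="192.168.57.", rng=random):
    """Pick the last octet at random among the 254 assumed possibilities."""
    return f"{prefix}{rng.randint(1, 254)}"


def collision_probability(num_vms, pool_size=254):
    """Chance that at least two of num_vms independent picks coincide."""
    p_all_distinct = 1.0
    for already_taken in range(num_vms):
        p_all_distinct *= (pool_size - already_taken) / pool_size
    return 1.0 - p_all_distinct


for vms in (3, 5):
    print(vms, f"{collision_probability(vms):.4f}")
print(random_ip())
```

Under these assumptions the chance of a collision is roughly 1.2% for 3 machines and 3.9% for 5, so a retry or a check after assignment may still be worth considering.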

c-bordon commented 1 week ago

The test is performed using this yaml:

workflow.yaml

```yaml
version: 0.1
description: This workflow is used to test manager deployment for DDT1 PoC
variables:
  manager-os:
    - linux-ubuntu-20.04-amd64
    - linux-ubuntu-22.04-amd64
    - linux-oracle-9-amd64
    # - linux-amazon-2-amd64
    # - linux-redhat-7-amd64
    # - linux-redhat-8-amd64
    # - linux-redhat-9-amd64
    # - linux-centos-7-amd64
    # - linux-centos-8-amd64
    # - linux-debian-10-amd64
    # - linux-debian-11-amd64
    # - linux-debian-12-amd64
  infra-provider: vagrant
  working-dir: /tmp/dtt1-poc

tasks:
  # Unique manager allocate task
  - task: "allocate-manager-{manager}"
    description: "Allocate resources for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: large
          - composite-name: "{manager}"
          - inventory-output: "{working-dir}/manager-{manager}/inventory.yaml"
          - track-output: "{working-dir}/manager-{manager}/track.yaml"
    on-error: "abort-all"
    foreach:
      - variable: manager-os
        as: manager

  # Generic manager test task
  - task: "run-manager-tests"
    description: "Run tests install for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/testing/main.py
          - targets:
              - wazuh-1: "{working-dir}/manager-linux-ubuntu-20.04-amd64/inventory.yaml"
              - wazuh-2: "{working-dir}/manager-linux-ubuntu-22.04-amd64/inventory.yaml"
              - wazuh-3: "{working-dir}/manager-linux-oracle-9-amd64/inventory.yaml"
              # - wazuh-4: "{working-dir}/manager-linux-centos-7-amd64/inventory.yaml"
              # - wazuh-5: "{working-dir}/manager-linux-amazon-2-amd64/inventory.yaml"
              # - wazuh-6: "{working-dir}/manager-linux-redhat-7-amd64/inventory.yaml"
              # - wazuh-7: "{working-dir}/manager-linux-redhat-8-amd64/inventory.yaml"
              # - wazuh-8: "{working-dir}/manager-linux-redhat-9-amd64/inventory.yaml"
              # - wazuh-9: "{working-dir}/manager-linux-centos-8-amd64/inventory.yaml"
              # - wazuh-10: "{working-dir}/manager-linux-debian-10-amd64/inventory.yaml"
              # - wazuh-11: "{working-dir}/manager-linux-debian-11-amd64/inventory.yaml"
              # - wazuh-12: "{working-dir}/manager-linux-debian-12-amd64/inventory.yaml"
          - tests: "install,restart,stop,uninstall"
          - component: "manager"
          - wazuh-version: "4.7.3"
          - wazuh-revision: "40714"
          - live: "True"
    depends-on:
      - "allocate-manager-linux-ubuntu-20.04-amd64"
      - "allocate-manager-linux-ubuntu-22.04-amd64"
      - "allocate-manager-linux-oracle-9-amd64"
```

Result:

cbordon@cbordon-MS-7C88:~/Documents/wazuh/repositorios/wazuh-qa/deployability$ python3 modules/workflow_engine/__main__.py modules/workflow_engine/examples/testing_threats.yaml --threads 3
[2024-04-23 16:14:28] [INFO] [615750] [MainThread] [workflow_engine]: Executing DAG tasks.
[2024-04-23 16:14:28] [INFO] [615750] [MainThread] [workflow_engine]: Executing tasks in parallel.
[2024-04-23 16:14:28] [INFO] [615750] [ThreadPoolExecutor-0_0] [workflow_engine]: [allocate-manager-linux-ubuntu-20.04-amd64] Starting task.
[2024-04-23 16:14:28] [INFO] [615750] [ThreadPoolExecutor-0_1] [workflow_engine]: [allocate-manager-linux-ubuntu-22.04-amd64] Starting task.
[2024-04-23 16:14:28] [INFO] [615750] [ThreadPoolExecutor-0_2] [workflow_engine]: [allocate-manager-linux-oracle-9-amd64] Starting task.
[2024-04-23 16:15:29] [INFO] [615750] [ThreadPoolExecutor-0_0] [workflow_engine]: [allocate-manager-linux-ubuntu-20.04-amd64] Finished task in 61.48 seconds.
[2024-04-23 16:15:33] [INFO] [615750] [ThreadPoolExecutor-0_1] [workflow_engine]: [allocate-manager-linux-ubuntu-22.04-amd64] Finished task in 64.94 seconds.
[2024-04-23 16:15:39] [INFO] [615750] [ThreadPoolExecutor-0_2] [workflow_engine]: [allocate-manager-linux-oracle-9-amd64] Finished task in 71.33 seconds.
[2024-04-23 16:15:39] [INFO] [615750] [ThreadPoolExecutor-0_0] [workflow_engine]: [run-manager-tests] Starting task.
[2024-04-23 16:15:57] [INFO] [615750] [ThreadPoolExecutor-0_0] [workflow_engine]: [run-manager-tests] Finished task in 17.64 seconds.
[2024-04-23 16:15:57] [INFO] [615750] [MainThread] [workflow_engine]: Executing Reverse DAG tasks.
[2024-04-23 16:15:57] [INFO] [615750] [MainThread] [workflow_engine]: Executing tasks in parallel.

Inventories:

cbordon@cbordon-MS-7C88:~$ cat /tmp/dtt1-poc/manager-linux-oracle-9-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.230
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-0EA9AC84-2CF4-483C-991F-A2947209675B/instance_key
ansible_user: vagrant
cbordon@cbordon-MS-7C88:~$ cat /tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.244
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-1AA3392D-9D8B-42AD-83F8-1FE5024A552E/instance_key
ansible_user: vagrant
cbordon@cbordon-MS-7C88:~$ cat /tmp/dtt1-poc/manager-linux-ubuntu-20.04-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.59
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-BB327E15-E57A-4B74-B926-2B3A8E611BE3/instance_key
ansible_user: vagrant
c-bordon commented 1 week ago

New test with 5 threads:

workflow.yaml

```yaml
version: 0.1
description: This workflow is used to test manager deployment for DDT1 PoC
variables:
  manager-os:
    - linux-ubuntu-20.04-amd64
    - linux-ubuntu-22.04-amd64
    - linux-oracle-9-amd64
    - linux-amazon-2-amd64
    - linux-redhat-7-amd64
    # - linux-redhat-8-amd64
    # - linux-redhat-9-amd64
    # - linux-centos-7-amd64
    # - linux-centos-8-amd64
    # - linux-debian-10-amd64
    # - linux-debian-11-amd64
    # - linux-debian-12-amd64
  infra-provider: vagrant
  working-dir: /tmp/dtt1-poc

tasks:
  # Unique manager allocate task
  - task: "allocate-manager-{manager}"
    description: "Allocate resources for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/allocation/main.py
          - action: create
          - provider: "{infra-provider}"
          - size: large
          - composite-name: "{manager}"
          - inventory-output: "{working-dir}/manager-{manager}/inventory.yaml"
          - track-output: "{working-dir}/manager-{manager}/track.yaml"
    on-error: "abort-all"
    foreach:
      - variable: manager-os
        as: manager

  # Generic manager test task
  - task: "run-manager-tests"
    description: "Run tests install for the manager."
    do:
      this: process
      with:
        path: python3
        args:
          - modules/testing/main.py
          - targets:
              - wazuh-1: "{working-dir}/manager-linux-ubuntu-20.04-amd64/inventory.yaml"
              - wazuh-2: "{working-dir}/manager-linux-ubuntu-22.04-amd64/inventory.yaml"
              - wazuh-3: "{working-dir}/manager-linux-oracle-9-amd64/inventory.yaml"
              - wazuh-4: "{working-dir}/manager-linux-amazon-2-amd64/inventory.yaml"
              - wazuh-5: "{working-dir}/manager-linux-redhat-7-amd64/inventory.yaml"
              # - wazuh-6: "{working-dir}/manager-linux-redhat-7-amd64/inventory.yaml"
              # - wazuh-7: "{working-dir}/manager-linux-redhat-8-amd64/inventory.yaml"
              # - wazuh-8: "{working-dir}/manager-linux-redhat-9-amd64/inventory.yaml"
              # - wazuh-9: "{working-dir}/manager-linux-centos-8-amd64/inventory.yaml"
              # - wazuh-10: "{working-dir}/manager-linux-debian-10-amd64/inventory.yaml"
              # - wazuh-11: "{working-dir}/manager-linux-debian-11-amd64/inventory.yaml"
              # - wazuh-12: "{working-dir}/manager-linux-debian-12-amd64/inventory.yaml"
          - tests: "install,restart,stop,uninstall"
          - component: "manager"
          - wazuh-version: "4.7.3"
          - wazuh-revision: "40714"
          - live: "True"
    depends-on:
      - "allocate-manager-linux-ubuntu-20.04-amd64"
      - "allocate-manager-linux-ubuntu-22.04-amd64"
      - "allocate-manager-linux-oracle-9-amd64"
```

cbordon@cbordon-MS-7C88:~/Documents/wazuh/repositorios/wazuh-qa/deployability$ python3 modules/workflow_engine/__main__.py modules/workflow_engine/examples/testing_threats.yaml --threads 5
[2024-04-23 16:31:25] [INFO] [667823] [MainThread] [workflow_engine]: Executing DAG tasks.
[2024-04-23 16:31:25] [INFO] [667823] [MainThread] [workflow_engine]: Executing tasks in parallel.
[2024-04-23 16:31:25] [INFO] [667823] [ThreadPoolExecutor-0_0] [workflow_engine]: [allocate-manager-linux-ubuntu-20.04-amd64] Starting task.
[2024-04-23 16:31:25] [INFO] [667823] [ThreadPoolExecutor-0_1] [workflow_engine]: [allocate-manager-linux-ubuntu-22.04-amd64] Starting task.
[2024-04-23 16:31:25] [INFO] [667823] [ThreadPoolExecutor-0_2] [workflow_engine]: [allocate-manager-linux-oracle-9-amd64] Starting task.
[2024-04-23 16:31:25] [INFO] [667823] [ThreadPoolExecutor-0_3] [workflow_engine]: [allocate-manager-linux-amazon-2-amd64] Starting task.
[2024-04-23 16:31:25] [INFO] [667823] [ThreadPoolExecutor-0_4] [workflow_engine]: [allocate-manager-linux-redhat-7-amd64] Starting task.
[2024-04-23 16:32:35] [INFO] [667823] [ThreadPoolExecutor-0_4] [workflow_engine]: [allocate-manager-linux-redhat-7-amd64] Finished task in 69.69 seconds.
[2024-04-23 16:32:39] [INFO] [667823] [ThreadPoolExecutor-0_1] [workflow_engine]: [allocate-manager-linux-ubuntu-22.04-amd64] Finished task in 73.99 seconds.
[2024-04-23 16:32:40] [INFO] [667823] [ThreadPoolExecutor-0_0] [workflow_engine]: [allocate-manager-linux-ubuntu-20.04-amd64] Finished task in 74.78 seconds.
[2024-04-23 16:32:44] [INFO] [667823] [ThreadPoolExecutor-0_2] [workflow_engine]: [allocate-manager-linux-oracle-9-amd64] Finished task in 78.56 seconds.
[2024-04-23 16:32:44] [INFO] [667823] [ThreadPoolExecutor-0_4] [workflow_engine]: [run-manager-tests] Starting task.
[2024-04-23 16:32:44] [ERROR] [667823] [ThreadPoolExecutor-0_4] [workflow_engine]: [run-manager-tests] Task failed with error: Error executing process task Traceback (most recent call last):
  File "/home/cbordon/Documents/wazuh/repositorios/wazuh-qa/deployability/modules/testing/main.py", line 30, in <module>
    Tester.run(InputPayload(**vars(parse_arguments())))
  File "/home/cbordon/Documents/wazuh/repositorios/wazuh-qa/deployability/modules/testing/testing.py", line 40, in run
    inventory = Inventory(**Utils.load_from_yaml(', '.join(dictionary.values())))
  File "/home/cbordon/Documents/wazuh/repositorios/wazuh-qa/deployability/modules/generic/utils.py", line 52, in load_from_yaml
    raise FileNotFoundError(f'File "{file_path}" not found.')
FileNotFoundError: File "/tmp/dtt1-poc/manager-linux-amazon-2-amd64/inventory.yaml" not found.
.
[2024-04-23 16:33:07] [INFO] [667823] [ThreadPoolExecutor-0_3] [workflow_engine]: [allocate-manager-linux-amazon-2-amd64] Finished task in 101.82 seconds.
[2024-04-23 16:33:07] [INFO] [667823] [MainThread] [workflow_engine]: Executing Reverse DAG tasks.
[2024-04-23 16:33:07] [INFO] [667823] [MainThread] [workflow_engine]: Executing tasks in parallel.

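The error apparently occurs because the test task is launched before one of the machines has finished provisioning: `run-manager-tests` starts at 16:32:44, while `allocate-manager-linux-amazon-2-amd64` does not finish until 16:33:07. The provisioning itself completes correctly, and every inventory file is eventually written.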
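One reading consistent with the log is that `run-manager-tests` started as soon as the three tasks listed in its `depends-on` had finished, while its `targets` also reference the amazon-2 and redhat-7 inventories. The following is a minimal sketch of a check for that mismatch. The helper `missing_allocation_deps` is hypothetical and not part of the workflow engine, and it assumes the `manager-<os>/inventory.yaml` path layout and the `allocate-manager-<os>` task naming used in the workflow above, with `{working-dir}` already expanded.

```python
import re


def missing_allocation_deps(targets, depends_on, prefix="allocate-manager-"):
    """Return the OS names whose allocate task is missing from depends_on.

    Hypothetical helper: assumes each target path ends in
    'manager-<os>/inventory.yaml' and that the matching task is
    named '<prefix><os>'.
    """
    pattern = re.compile(r"/manager-(?P<os>[^/]+)/inventory\.yaml$")
    missing = []
    for path in targets.values():
        match = pattern.search(path)
        if match and f"{prefix}{match.group('os')}" not in depends_on:
            missing.append(match.group("os"))
    return missing


# Values copied from the workflow file, with {working-dir} expanded.
targets = {
    "wazuh-1": "/tmp/dtt1-poc/manager-linux-ubuntu-20.04-amd64/inventory.yaml",
    "wazuh-2": "/tmp/dtt1-poc/manager-linux-ubuntu-22.04-amd64/inventory.yaml",
    "wazuh-3": "/tmp/dtt1-poc/manager-linux-oracle-9-amd64/inventory.yaml",
    "wazuh-4": "/tmp/dtt1-poc/manager-linux-amazon-2-amd64/inventory.yaml",
    "wazuh-5": "/tmp/dtt1-poc/manager-linux-redhat-7-amd64/inventory.yaml",
}
depends_on = [
    "allocate-manager-linux-ubuntu-20.04-amd64",
    "allocate-manager-linux-ubuntu-22.04-amd64",
    "allocate-manager-linux-oracle-9-amd64",
]

print(missing_allocation_deps(targets, depends_on))
# ['linux-amazon-2-amd64', 'linux-redhat-7-amd64']
```

If this reading is right, adding the two reported task names to `depends-on` should make the test task wait for all five allocations.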

Inventories.yml

cbordon@cbordon-MS-7C88:/tmp/dtt1-poc$ cat manager-linux-amazon-2-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.129
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-714825EF-42CD-4B8A-BC13-147FFED5EA15/instance_key
ansible_user: vagrant
cbordon@cbordon-MS-7C88:/tmp/dtt1-poc$ cat manager-linux-oracle-9-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.39
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-3EC1B0AF-5540-451B-AC8B-63B1B95A6021/instance_key
ansible_user: vagrant
cbordon@cbordon-MS-7C88:/tmp/dtt1-poc$ cat manager-linux-redhat-7-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.72
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-017DABF6-327F-43BA-83E8-90AD75FC8CDD/instance_key
ansible_user: vagrant
cbordon@cbordon-MS-7C88:/tmp/dtt1-poc$ cat manager-linux-ubuntu-20.04-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.137
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-FFB50910-A1AA-4F12-BC47-FC6A03E5CC8B/instance_key
ansible_user: vagrant
cbordon@cbordon-MS-7C88:/tmp/dtt1-poc$ cat manager-linux-ubuntu-22.04-amd64/inventory.yaml 
ansible_connection: ssh
ansible_host: 192.168.57.64
ansible_port: 22
ansible_ssh_common_args: -o StrictHostKeyChecking=no
ansible_ssh_private_key_file: /tmp/wazuh-qa/VAGRANT-52293A82-347F-4C19-9EB3-2F6A010BE703/instance_key
ansible_user: vagrant
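
The five inventories above each carry a different `ansible_host`, which is the property this issue is about. As a rough sketch, the comparison can be automated as below. The function `duplicated_hosts` is a hypothetical name, and the inventories are inlined as dictionaries here instead of being loaded from the YAML files.

```python
from collections import defaultdict


def duplicated_hosts(inventories):
    """Map each ansible_host shared by several inventories to their names.

    Hypothetical helper: `inventories` maps an inventory name to its
    parsed content, of which only 'ansible_host' is inspected.
    """
    by_host = defaultdict(list)
    for name, inventory in inventories.items():
        by_host[inventory["ansible_host"]].append(name)
    return {host: names for host, names in by_host.items() if len(names) > 1}


# ansible_host values copied from the inventory files listed above.
inventories = {
    "manager-linux-amazon-2-amd64": {"ansible_host": "192.168.57.129"},
    "manager-linux-oracle-9-amd64": {"ansible_host": "192.168.57.39"},
    "manager-linux-redhat-7-amd64": {"ansible_host": "192.168.57.72"},
    "manager-linux-ubuntu-20.04-amd64": {"ansible_host": "192.168.57.137"},
    "manager-linux-ubuntu-22.04-amd64": {"ansible_host": "192.168.57.64"},
}

print(duplicated_hosts(inventories))  # {} means no address is reused
```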
fcaffieri commented 1 week ago

LGTM