wazuh / wazuh-virtual-machines

Wazuh - Virtual Machines (OVA and AMI)
https://wazuh.com/
GNU General Public License v2.0

Migrate the `Packages_Builder_OVA` pipeline to GHA #20

Closed teddytpc1 closed 1 month ago

teddytpc1 commented 2 months ago
Objective
https://github.com/wazuh/wazuh-packages/issues/2904

Description

As part of the Wazuh packages redesign tier 2 objective, we need to migrate the `Packages_Builder_OVA` pipeline from Jenkins to GitHub Actions (GHA).

Tasks

Related

c-bordon commented 2 months ago

Update report

I've been working on setting up OIDC for the interaction between AWS and GitHub following this documentation:

https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services
https://github.com/aws-actions/configure-aws-credentials#configure-aws-credentials-for-github-actions

I created the role oidc-wazuh-virtual-machines-repository in the wazuh-qa account; for now it has no permissions attached.

The initial process involves provisioning an instance in AWS with a special AMI (ami-0d4bd55523ee67aa4) for the OVA. This AMI will likely need to be added to the allocator so that the instance can be deployed with the Allocator module.

I also asked whether the agent team could share the workflow they built, since much of their development could be reused to speed things up.

c-bordon commented 2 months ago

Update report

I've been working on migrating the process. I found that provisioning the base AMI with the allocator would require several changes, because the base AMI differs in ways that affect how the allocator operates: in particular, the user data cannot be executed. Supporting this specific AMI would mean many changes to the allocator.

For this reason, I think it would be best to start from the Amazon Linux 2 base AMI and have the playbook take care of all the configuration.

To do this, the changes from https://github.com/wazuh/wazuh-packages/issues/1575 and https://github.com/wazuh/wazuh-packages/issues/2744 should be brought in, so that the workflow makes all the configurations needed to build the OVA as before.

c-bordon commented 1 month ago

Update report

I am working on the workflow with the necessary adaptations for it to run.

In this case, I had to start from the Amazon Linux 2 AMI that we have in the allocator and make all the changes on top of it. For this, I created a new playbook that customizes the users and the SSH connection of the launched VM, so that the result is similar to the custom AMI used previously.

I had to run several tests of the workflow because I kept running into different problems.

The workflow is already working, although I still need to keep working on the playbooks to validate that the customization behaves correctly.

c-bordon commented 1 month ago

Update report

Today I continued fixing some of the bugs found, and I finished the playbook that installs the Wazuh core components. To run the test, I had to use the wazuh-packages repository and the 4.9.0 packages.

ansible-playbook .github/workflows/ansible_playbooks/ova_generator.yaml -i /tmp/allocatorvm_ova/inventory --extra-vars "wia_branch=4.9.0 ova_branch=4.10.0 repository=dev  builder_args='-i -d' debug=yes"

PLAY [all] *********************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************************************************
[WARNING]: Platform linux on host ec2-54-164-70-91.compute-1.amazonaws.com is using the discovered Python interpreter at /usr/bin/python3.7, but future installation of another Python interpreter could change the meaning of that path.
See https://docs.ansible.com/ansible-core/2.12/reference_appendices/interpreter_discovery.html for more information.
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Install git] *************************************************************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Make build directory] ****************************************************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Copy ova directory] ******************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Download the Wazuh installation assistant repository] ********************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Set custom hostname] *****************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Build Wazuh installation assistant script] *******************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Copy Wazuh installation assistant script to tmp dir] *********************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Run provision script] ****************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Clean provision files] ***************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Clean Wazuh installation assistant resources] ****************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Clean logs] **************************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Clean history] ***********************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Clean YUM cache] *********************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Remove ec2-user] *********************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Remove ec2-user group] ***************************************************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Remove ec2-instance-connect] *********************************************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Remove amazon-ssm-agent] *************************************************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Change ssh port to 22] ***************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Remove AuthorizedKeysCommand from sshd_config] ***************************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Remove AuthorizedKeysCommandUser from sshd_config] ***********************************************************************************************************************************************************************************
ok: [ec2-54-164-70-91.compute-1.amazonaws.com]

TASK [Restart SSH service] *****************************************************************************************************************************************************************************************************************
changed: [ec2-54-164-70-91.compute-1.amazonaws.com]

PLAY RECAP *********************************************************************************************************************************************************************************************************************************
ec2-54-164-70-91.compute-1.amazonaws.com : ok=22   changed=14   unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

The playbook ran correctly. I'm now working on migrating the instance so it can be converted to OVA format.

c-bordon commented 1 month ago

Update report

I have already migrated the code from the Jenkins pipeline to the GitHub workflow and I am now testing it. At the moment I cannot move forward because the installation wizard is failing during provisioning. This is what the logs show:

Running transaction test Transaction test succeeded Running transaction Installing : wazuh-manager-4.9.0-1.x86_64 1/1 Verifying : wazuh-manager-4.9.0-1.x86_64 1/1 Installed: wazuh-manager.x86_64 0:4.9.0-1 Complete!",
        "29/08/2024 19:51:08 DEBUG: Checking Wazuh installation.",
        "29/08/2024 19:51:08 DEBUG: There are Wazuh remaining files.",
        "29/08/2024 19:51:08 DEBUG: There are Wazuh indexer remaining files.",
        "29/08/2024 19:51:08 INFO: Wazuh manager installation finished.",
        "29/08/2024 19:51:08 DEBUG: Configuring Wazuh manager.",
        "29/08/2024 19:51:08 DEBUG: Setting provisional Wazuh indexer password.",
        "29/08/2024 19:51:08 INFO: Wazuh manager vulnerability detection configuration finished.",
        "29/08/2024 19:51:08 INFO: Starting service wazuh-manager.",
        "Created symlink from /etc/systemd/system/multi-user.target.wants/wazuh-manager.service to /usr/lib/systemd/system/wazuh-manager.service.",
        "29/08/2024 19:51:19 INFO: wazuh-manager service started.",
        "29/08/2024 19:51:19 INFO: Checking Wazuh API connection",
        "29/08/2024 19:51:19 ERROR: Wazuh API connection Error. ",
        "wazuh-clusterd not running...",
        "wazuh-modulesd is running...",
        "wazuh-monitord is running...",
        "wazuh-logcollector is running...",
        "wazuh-remoted is running...",
        "wazuh-syscheckd is running...",
        "wazuh-analysisd is running...",
        "wazuh-maild not running...",
        "wazuh-execd is running...",
        "wazuh-db is running...",
        "wazuh-authd is running...",
        "wazuh-agentlessd not running...",
        "wazuh-integratord not running...",
        "wazuh-dbd not running...",
        "wazuh-csyslogd not running...",
        "wazuh-apid is running...",
        "29/08/2024 19:51:20 INFO: --- Removing existing Wazuh installation ---",
        "29/08/2024 19:51:20 INFO: Removing Wazuh manager.",

I need to run additional tests on the wizard to verify that Wazuh installs correctly from the 4.9.0-testing branch.

c-bordon commented 1 month ago

Update report

The error mentioned above was fixed in the installation wizard. Continuing with the tests, I found an issue that was hard to resolve because of how long the provision.sh script takes to run. The playbook was changed to allow for the longer execution time of that script.

I am now working on exporting the instance to S3, converting the instance into the expected format. I had to adjust permissions on the buckets to be able to perform the task.
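
For context, one way to export an instance to S3 as an OVA is the `aws ec2 create-instance-export-task` command. The sketch below only assembles and prints that command; whether the workflow uses this exact call is an assumption, and the instance ID, bucket name, and prefix are hypothetical placeholders.

```shell
set -eu

# Hypothetical placeholders; none of these values come from the report.
instance_id="i-0123456789abcdef0"
bucket="example-ova-export-bucket"
prefix="ova/"

# Target a VMware-compatible OVA container holding a VMDK disk image.
export_cmd="aws ec2 create-instance-export-task --instance-id ${instance_id} --target-environment vmware --export-to-s3-task DiskImageFormat=VMDK,ContainerFormat=ova,S3Bucket=${bucket},S3Prefix=${prefix}"

# Print the command for review; it needs valid AWS credentials and a bucket
# that grants the VM export service write access before it can actually run.
echo "$export_cmd"
```

The bucket permission requirement is the kind of adjustment mentioned above.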

c-bordon commented 1 month ago

Update report

I hit an error caused by the session duration set for the role. By default the token is valid for 1 hour, and this workflow takes longer than that to complete, so it could not continue. The duration was extended to 3 hours in the workflow code and to 4 hours in the role.
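
The two limits are related: the duration requested by the workflow must not exceed the maximum session duration configured on the role. A minimal sketch of that check, using the 3 hour and 4 hour values from this report (the variable names are hypothetical):

```shell
set -eu

# Values from the report: 3 hours requested by the workflow, 4 hours allowed by the role.
requested_seconds=$((3 * 3600))   # 10800
role_max_seconds=$((4 * 3600))    # 14400

# A session longer than the role maximum is rejected, so check before running.
if [ "$requested_seconds" -gt "$role_max_seconds" ]; then
  echo "requested duration exceeds the role maximum" >&2
  exit 1
fi

# Value for the role-duration-seconds input of aws-actions/configure-aws-credentials.
echo "role-duration-seconds: ${requested_seconds}"

# The role side could be raised with (not executed here, needs IAM permissions):
# aws iam update-role --role-name oidc-wazuh-virtual-machines-repository --max-session-duration "$role_max_seconds"
```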

c-bordon commented 1 month ago

Update report

I'm encountering an error when trying to standardize the OVA. I'm going to continue working on the problem:

+ sed -i 's|ovf:capacity="40"|ovf:capacity="50"|g' ova/wazuh_ovf_template
/home/runner/work/_temp/7258f211-36ff-42dc-a481-f4d0fa3d7bab.sh: line 3: unexpected EOF while looking for matching `"'
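
For reference, the traced substitution is valid on its own when the expression is single-quoted, which suggests (this is an inference, not something confirmed above) that the unbalanced double quote sits on another line of the step script. A minimal sketch, using a hypothetical one-line stand-in for ova/wazuh_ovf_template and assuming GNU sed:

```shell
set -eu

# Hypothetical stand-in for ova/wazuh_ovf_template; only the attribute matters here.
template="$(mktemp)"
printf '%s\n' '<Disk ovf:capacity="40" ovf:diskId="vmdisk1"/>' > "$template"

# Single quotes keep the embedded double quotes away from the shell parser.
sed -i 's|ovf:capacity="40"|ovf:capacity="50"|g' "$template"

result="$(cat "$template")"
echo "$result"
rm -f "$template"
```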

c-bordon commented 1 month ago

Update report

I was working on the cloud-init configuration, because the ec2-user user and the SSH configuration were being modified when the OVA started. To solve it, I disabled cloud-init in the OVA and cleaned the /var/lib/cloud directory.
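
As an illustration only, cloud-init can be disabled by creating the marker file /etc/cloud/cloud-init.disabled; whether the playbook uses this mechanism or another one is an assumption. The sketch below works against a scratch directory by default through a hypothetical ROOT variable, so it can be tried without touching a real system:

```shell
set -eu

# ROOT is a hypothetical knob: it defaults to a scratch directory so the sketch
# can be tried safely. On the real VM it would run as root with ROOT="".
ROOT="${ROOT-$(mktemp -d)}"
mkdir -p "$ROOT/etc/cloud" "$ROOT/var/lib/cloud/instances"   # harmless on a host that already has cloud-init

# cloud-init skips its boot stages when this marker file exists.
touch "$ROOT/etc/cloud/cloud-init.disabled"

# Drop the cached instance state left over from the EC2 boot.
rm -rf "$ROOT/var/lib/cloud"

echo "cloud-init disabled under ${ROOT:-/}"
```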

I also ran into a problem loading the Wazuh image at OVA startup: the image does not load even though the configuration is correct.

I need to keep investigating this point.

c-bordon commented 1 month ago

After analyzing several grub errors and some error messages shown when starting the instance, we found one likely difference: the OVA is normally built from an AMI created on a local machine with Vagrant, whereas the new one was built from an AWS EC2 instance. These differences may be causing the problem.

Screenshot_20240905_084217

Screenshot_20240905_084048

For this reason, we decided to go back to the base AMI that was used in Jenkins. As a result, the allocator is no longer used to provision this VM, and the AWS CLI is used directly.
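
A rough sketch of what launching that instance directly with the AWS CLI could look like. It assumes the base AMI is the one mentioned earlier in this thread (ami-0d4bd55523ee67aa4); the instance type and key pair name are hypothetical placeholders, and the command is only printed, not executed.

```shell
set -eu

# AMI ID quoted earlier in this thread; assumed to be the Jenkins base AMI.
ami_id="ami-0d4bd55523ee67aa4"
# Hypothetical placeholders; not taken from the report.
instance_type="t3.large"
key_name="example-ova-key"

run_cmd="aws ec2 run-instances --image-id ${ami_id} --instance-type ${instance_type} --key-name ${key_name} --count 1"

# Print the command for review; running it requires valid AWS credentials.
echo "$run_cmd"
```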