IBM / zmodstack-deploy

IBM Z & Cloud Modernization Stack deployment tools
Apache License 2.0

OCP / Azure - Issues executing Ansible playbooks present in ocp mono repo #42

Closed midhun6989 closed 1 year ago

midhun6989 commented 1 year ago

Description

When the Ansible playbooks for zoscb, zosconnect and wazi-devspaces in the ocp mono repo are executed in the Azure Bootstrap VM, they fail with errors.

Expected Behaviour

Ansible playbooks for zoscb, zosconnect and wazi-devspaces should be successfully executed in the Azure Bootstrap VM.

Potential Solutions

  1. operator-group.yml doesn't exist. The file name should be changed to operator-group.yaml in the file below. https://github.com/IBM/zmodstack-deploy/blob/dev/ocp/ansible/roles/wazi-devspaces/tasks/wazi-devspaces2.yaml#L15.
  2. ec2-user doesn't exist in the Azure Bootstrap VM. Corrections should be made in the files below. https://github.com/IBM/zmodstack-deploy/blob/dev/ocp/ansible/roles/wazi-devspaces/vars/wazi-devspaces2.yml#L5 https://github.com/IBM/zmodstack-deploy/blob/dev/ocp/ansible/roles/wazi-devspaces/vars/wazi-devspaces3.yml#L5
midhun6989 commented 1 year ago

@nikita-hakari The aforementioned issues are now fixed by your changes in the ce-42 branch. Thank you.

Below are the further issues I am encountering while executing the Ansible playbooks in the Azure Bootstrap VM. Please take a look.

  1. Error while executing the zoscb playbook task in the Azure Bootstrap VM.
    TASK [zoscb : Verify Z/OS CloudBroker instance: 'zoscloudbroker'] ****************************************************************************************************************************************************
    fatal: [localhost]: FAILED! => {"msg": "The conditional check ''ibm-zoscb-manager-zoscloudbroker' in broker_instance_info.resources[0].status.deployment.ready' failed. The error was: error while evaluating conditional ('ibm-zoscb-manager-zoscloudbroker' in broker_instance_info.resources[0].status.deployment.ready): 'dict object' has no attribute 'ready'"}

However, the OpenShift UI shows the instance status as successful.

Screenshot 2023-09-15 at 1 41 22 PM
  2. Error while executing the zosconnect playbook task in the Azure Bootstrap VM.

    TASK [zosconnect : Retrieve Z/os connect subscription info] **********************************************************************************************************************************************************
    FAILED - RETRYING: Retrieve Z/os connect subscription info (30 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (29 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (28 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (27 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (26 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (25 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (24 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (23 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (22 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (21 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (20 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (19 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (18 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (17 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (16 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (15 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (14 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (13 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (12 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (11 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (10 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (9 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (8 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (7 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (6 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (5 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (4 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (3 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (2 retries left).
    FAILED - RETRYING: Retrieve Z/os connect subscription info (1 retries left).
    fatal: [localhost]: FAILED! => {"api_found": true, "attempts": 30, "changed": false, "resources": [{"apiVersion": "operators.coreos.com/v1alpha1", "kind": "Subscription", "metadata": {"creationTimestamp": "2023-09-12T13:04:14Z", "generation": 1, "labels": {"operators.coreos.com/ibm-zcon-zosconnect.openshift-operators": ""}, "managedFields": [{"apiVersion": "operators.coreos.com/v1alpha1", "fieldsType": "FieldsV1", "fieldsV1": {"f:spec": {".": {}, "f:channel": {}, "f:installPlanApproval": {}, "f:name": {}, "f:source": {}, "f:sourceNamespace": {}, "f:startingCSV": {}}}, "manager": "OpenAPI-Generator", "operation": "Update", "time": "2023-09-12T13:04:14Z"}, {"apiVersion": "operators.coreos.com/v1alpha1", "fieldsType": "FieldsV1", "fieldsV1": {"f:metadata": {"f:labels": {".": {}, "f:operators.coreos.com/ibm-zcon-zosconnect.openshift-operators": {}}}}, "manager": "olm", "operation": "Update", "time": "2023-09-12T13:04:14Z"}, {"apiVersion": "operators.coreos.com/v1alpha1", "fieldsType": "FieldsV1", "fieldsV1": {"f:status": {".": {}, "f:catalogHealth": {}, "f:conditions": {}, "f:lastUpdated": {}}}, "manager": "catalog", "operation": "Update", "subresource": "status", "time": "2023-09-12T13:05:57Z"}], "name": "ibm-zcon-zosconnect", "namespace": "openshift-operators", "resourceVersion": "103869", "uid": "9820294f-a47e-4eab-96ac-014572f173cc"}, "spec": {"channel": "v1.0", "installPlanApproval": "Automatic", "name": "ibm-zcon-zosconnect", "source": "ibm-operator-catalog", "sourceNamespace": "openshift-marketplace", "startingCSV": "ibm-zcon-zosconnect.v1.0.6"}, "status": {"catalogHealth": [{"catalogSourceRef": {"apiVersion": "operators.coreos.com/v1alpha1", "kind": "CatalogSource", "name": "certified-operators", "namespace": "openshift-marketplace", "resourceVersion": "102339", "uid": "c74956cd-39c7-42d6-8b90-69d9f6dcd108"}, "healthy": true, "lastUpdated": "2023-09-12T13:04:15Z"}, {"catalogSourceRef": {"apiVersion": "operators.coreos.com/v1alpha1", "kind": "CatalogSource", 
"name": "community-operators", "namespace": "openshift-marketplace", "resourceVersion": "102341", "uid": "cf5e43c0-9ed6-4ef6-ace6-ede5d44e41fd"}, "healthy": true, "lastUpdated": "2023-09-12T13:04:15Z"}, {"catalogSourceRef": {"apiVersion": "operators.coreos.com/v1alpha1", "kind": "CatalogSource", "name": "ibm-operator-catalog", "namespace": "openshift-marketplace", "resourceVersion": "102337", "uid": "2ea9e94b-c012-4d23-9490-381e8b4127ac"}, "healthy": true, "lastUpdated": "2023-09-12T13:04:15Z"}, {"catalogSourceRef": {"apiVersion": "operators.coreos.com/v1alpha1", "kind": "CatalogSource", "name": "ibm-zoscb-registry-zoscloudbroker-ibm-zmodstack-cloudbroker", "namespace": "openshift-marketplace", "resourceVersion": "102510", "uid": "c96fd9f4-937f-45fa-8560-98ad56d8762f"}, "healthy": true, "lastUpdated": "2023-09-12T13:04:15Z"}, {"catalogSourceRef": {"apiVersion": "operators.coreos.com/v1alpha1", "kind": "CatalogSource", "name": "redhat-marketplace", "namespace": "openshift-marketplace", "resourceVersion": "102340", "uid": "fcfe5e12-2e1f-4714-9a8d-59d793911581"}, "healthy": true, "lastUpdated": "2023-09-12T13:04:15Z"}, {"catalogSourceRef": {"apiVersion": "operators.coreos.com/v1alpha1", "kind": "CatalogSource", "name": "redhat-operators", "namespace": "openshift-marketplace", "resourceVersion": "102338", "uid": "146f0fca-b089-4047-90ae-56ba99bf1913"}, "healthy": true, "lastUpdated": "2023-09-12T13:04:15Z"}], "conditions": [{"lastTransitionTime": "2023-09-12T13:04:15Z", "message": "all available catalogsources are healthy", "reason": "AllCatalogSourcesHealthy", "status": "False", "type": "CatalogSourcesUnhealthy"}, {"message": "failed to populate resolver cache from source ibm-zoscb-registry-zoscloudbroker-ibm-zmodstack-cloudbroker/openshift-marketplace: failed to list bundles: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 172.30.146.48:50051: i/o timeout\"", "reason": "ErrorPreventedResolution", "status": 
"True", "type": "ResolutionFailed"}], "lastUpdated": "2023-09-12T13:05:56Z"}}]}
  3. Error while executing the wazi-devspaces playbook task in the Azure Bootstrap VM.

    TASK [wazi-devspaces : Create ansible-workspace directory - IBM Wazi for DevSpaces] **********************************************************************************************************************************
    fatal: [localhost]: FAILED! => {"changed": false, "msg": "There was an issue creating /root/zmodstack as requested: [Errno 13] Permission denied: b'/root/zmodstack'", "path": "/root/zmodstack/ansible-workspace/wazi-devspace/generated_yamls"}

    This error occurs because the Ansible playbooks are executed by a different user, say vmadmin, who doesn't have permission to access the path /root.

midhun6989 commented 1 year ago

@nikita-hakari I had a discussion with @ivandov today regarding the visibility of WaziDevispacesVersion in the CFN / ARM templates. For the user, it should be displayed as either 2.x or 3.x with the default one being 3.x. I believe the wazidevspacesversion has to be updated to 3.x here.

nikita-hakari commented 1 year ago

Added new changes so the directory is no longer a hardcoded absolute path; it is now resolved using the variable "playbook_dir".

FYI @midhun6989
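
A minimal sketch of what resolving the workspace directory from playbook_dir might look like. The sub-path is taken from the error message earlier in this thread; the task name and directory layout are illustrative assumptions, not copied from the repository.

```yaml
# Sketch only: the layout under playbook_dir is an assumption.
- name: Create ansible-workspace directory (illustrative)
  ansible.builtin.file:
    # playbook_dir is a built-in Ansible variable holding the directory
    # of the playbook being run, so no user-specific path is hardcoded.
    path: "{{ playbook_dir }}/ansible-workspace/wazi-devspace/generated_yamls"
    state: directory
    mode: "0755"
```

With this approach the directory is created wherever the executing user checked out the playbooks, so a non-root user such as vmadmin does not need access to /root.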

midhun6989 commented 1 year ago

@nikita-hakari While executing the Ansible playbook for the zoscb role, I am still getting the same error for this task. I discussed this with @ivandov earlier, and the issue seems to be caused by the assert conditions. We are wondering whether the assert condition below would be sufficient on its own for this task:
    broker_instance_info.resources[0].status.phase == "Successful"
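
For illustration, an assert task built around that single phase check might look like the sketch below. It assumes broker_instance_info was registered by an earlier lookup task; the length guard and fail_msg are additions for the example and are not part of the existing role.

```yaml
# Sketch only: assumes broker_instance_info was registered by a prior task.
- name: "Verify z/OS Cloud Broker instance: 'zoscloudbroker' (illustrative)"
  ansible.builtin.assert:
    that:
      # Guard against an empty result before indexing into it.
      - broker_instance_info.resources | length > 0
      # Rely on the phase only, not on status.deployment.ready,
      # which was missing in the failing run.
      - broker_instance_info.resources[0].status.phase == "Successful"
    fail_msg: "z/OS Cloud Broker instance has not reached the Successful phase"
```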

midhun6989 commented 1 year ago

@nikita-hakari The Ansible playbook task Retrieve z/OS Connect subscription info is also failing after 30 retries. Below is the error I could see from the OpenShift console.

Screenshot 2023-09-22 at 12 55 13 PM
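
For reference, a retrying subscription lookup of this kind can be sketched as follows. The resource name and namespace come from the failure output earlier in this thread; the registered variable name, the until condition and the delay value are assumptions for the example, since the role's actual condition is not shown here.

```yaml
# Sketch only: the until condition and delay are assumptions.
- name: Retrieve z/OS Connect subscription info (illustrative)
  kubernetes.core.k8s_info:
    api_version: operators.coreos.com/v1alpha1
    kind: Subscription
    name: ibm-zcon-zosconnect
    namespace: openshift-operators
  register: zosconnect_subscription
  # Retry until OLM has resolved the subscription to an installed CSV.
  until:
    - zosconnect_subscription.resources | length > 0
    - zosconnect_subscription.resources[0].status.installedCSV is defined
  retries: 30
  delay: 10
```

If a condition of type ResolutionFailed is present in status.conditions, as in the output above, no amount of retrying will help until the catalog source problem itself is resolved.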
midhun6989 commented 1 year ago

@nikita-hakari The following Ansible playbook tasks for WaziDevspaces 2.x and 3.x, which retrieve the IBM Wazi for DevSpaces subscription info, are also failing after 30 retries.

  1. Task for WaziDevspaces 2.x
  2. Task for WaziDevspaces 3.x

Below are the errors I could see from the OpenShift console.

Error for WaziDevspaces 2.x

Screenshot 2023-09-22 at 1 27 55 PM

Error for WaziDevspaces 3.x

Screenshot 2023-09-22 at 1 32 16 PM
midhun6989 commented 1 year ago

@nikita-hakari

  1. z/OS Cloud broker Ansible playbooks are executing fine now in the Bootstrap VM of Azure OCP Cluster.

  2. Getting an error while executing the Ansible playbook for z/OS Connect.

    TASK [zosconnect : Pull ZosConnect designer image via Podman] ******************
    fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to pull image icr.io/zosconnect/ibm-zcon-designer:3.0.71"}
  3. pause should be replaced with wait_for to fix the error in the Ansible playbooks for Wazi Devspaces 2 and Wazi Devspaces 3.

TASK [wazi-devspaces : Wait for 50sec for wazi License setup] ********************************************************************************************************************************************************
Pausing for 50 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 25] Inappropriate ioctl for device
fatal: [localhost]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}
midhun6989 commented 1 year ago

@nikita-hakari

  1. z/OS Cloud broker Ansible playbooks are executing fine now in the Bootstrap VM of Azure OCP Cluster.
  2. Getting an error while executing the Ansible playbook for z/OS Connect.
TASK [zosconnect : Pull ZosConnect designer image via Podman] ******************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to pull image icr.io/zosconnect/ibm-zcon-designer:3.0.71"}

This error was due to low space in the /home directory. When the task is executed as the root user, the images are pulled successfully.

  3. pause should be replaced with wait_for to fix the error in the Ansible playbooks for Wazi Devspaces 2 and Wazi Devspaces 3.
TASK [wazi-devspaces : Wait for 50sec for wazi License setup] ********************************************************************************************************************************************************
Pausing for 50 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 25] Inappropriate ioctl for device
fatal: [localhost]: FAILED! => {"msg": "Unexpected failure during module execution.", "stdout": ""}

I didn't encounter this error when the task was executed as the root user.
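
The suggested replacement of pause with wait_for could be sketched as below. When wait_for is given only a timeout, with no host, port or path, it simply sleeps for that many seconds and does not prompt on a terminal. The task name is illustrative.

```yaml
# Sketch only: a non-interactive 50-second delay.
- name: Wait 50 seconds for Wazi license setup (illustrative)
  ansible.builtin.wait_for:
    # With no host, port or path given, wait_for just waits for the timeout.
    timeout: 50
```

This avoids the interactive "continue early / abort" prompt that pause prints, which appears to be what fails when the playbook runs without a controlling terminal.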

nikita-hakari commented 1 year ago

To address the above issues and run the playbooks independently of the user and directory structure across the various architectures, I have used the playbook_dir variable. I have also added enhancements as needed in this PR.

midhun6989 commented 1 year ago

Closing the issue as the changes have been merged into the dev branch.