IBM / cloud-pak-deployer

Configuration-based installation of OpenShift and Cloud Pak for Data/Integration/Watson AIOps on various private and public cloud infrastructure providers. Deployment attempts to achieve the end-state defined in the configuration. If something fails along the way, you only need to restart the process to continue the deployment.
https://ibm.github.io/cloud-pak-deployer/
Apache License 2.0

Running "env download" fails on Windows in WSL2 when STATUS_DIR on Windows mount #629

Closed m-g-k closed 5 months ago

m-g-k commented 5 months ago

The Ansible copy command fails when running env download on Windows in WSL2 if STATUS_DIR points to a Windows mount. Steps to reproduce:

1: Set STATUS_DIR to a Windows mount and run env download:

export STATUS_DIR=/mnt/c/temp/cp-deployer-status
./cp-deploy.sh env download

This is the output showing the failure:

TASK [cp4d-variables : Run list-components command] ****************************
Tuesday 23 January 2024  09:35:30 +0000 (0:00:00.036)       0:00:52.810 *******
TASK [cp4d-variables : Copy file to /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv] ***
Tuesday 23 January 2024  09:36:25 +0000 (0:00:55.509)       0:01:48.320 *******
fatal: [localhost]: FAILED! => {"changed": false, "msg": "failed to copy: /tmp/work/components.csv to /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv", "traceback": "Traceback (most recent call last):\n  File \"/tmp/ansible_ansible.legacy.copy_payload_075t5tnf/ansible_ansible.legacy.copy_payload.zip/ansible/modules/copy.py\", line 684, in main\n  File \"/tmp/ansible_ansible.legacy.copy_payload_075t5tnf/ansible_ansible.legacy.copy_payload.zip/ansible/module_utils/basic.py\", line 2469, in atomic_move\n    os.chmod(b_dest, DEFAULT_PERM & ~umask)\nFileNotFoundError: [Errno 2] No such file or directory: b'/mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv'\n"}
PLAY RECAP *********************************************************************
localhost                  : ok=178  changed=12   unreachable=0    failed=1    skipped=68   rescued=0    ignored=0

If you now look in the /mnt/c/temp/cp-deployer-status/cp4d folder, the cpd-4.8.0-components.csv file IS actually present, even though the copy task reported a failure.

2: Run ./cp-deploy.sh env download again with no changes to any files. This is the second failure output:

TASK [cp4d-variables : Run list-components command] ****************************
Tuesday 23 January 2024  09:47:02 +0000 (0:00:00.045)       0:00:49.880 *******
TASK [cp4d-variables : Copy file to /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv] ***
Tuesday 23 January 2024  09:47:02 +0000 (0:00:00.035)       0:00:49.915 *******
TASK [cp4d-variables : Check if file /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv exists] ***
Tuesday 23 January 2024  09:47:02 +0000 (0:00:00.036)       0:00:49.951 *******
TASK [cp4d-variables : Fail if /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv does not exist] ***
Tuesday 23 January 2024  09:47:02 +0000 (0:00:00.323)       0:00:50.274 *******
TASK [cp4d-variables : Get column headers] *************************************
Tuesday 23 January 2024  09:47:02 +0000 (0:00:00.031)       0:00:50.306 *******
TASK [cp4d-variables : Copy file to _cp4d_components_file] *********************
Tuesday 23 January 2024  09:47:02 +0000 (0:00:00.300)       0:00:50.607 *******
[WARNING]: File '/mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components-no-
header.csv' created with default permissions '666'. The previous default was
'666'. Specify 'mode' to avoid this warning.
fatal: [localhost]: FAILED! => {"changed": false, "msg": "failed to copy: /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components.csv to /mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components-no-header.csv", "traceback": "Traceback (most recent call last):\n  File \"/tmp/ansible_ansible.legacy.copy_payload_pt_jv9zq/ansible_ansible.legacy.copy_payload.zip/ansible/modules/copy.py\", line 684, in main\n  File \"/tmp/ansible_ansible.legacy.copy_payload_pt_jv9zq/ansible_ansible.legacy.copy_payload.zip/ansible/module_utils/basic.py\", line 2469, in atomic_move\n    os.chmod(b_dest, DEFAULT_PERM & ~umask)\nFileNotFoundError: [Errno 2] No such file or directory: b'/mnt/c/temp/cp-deployer-status/cp4d/cpd-4.8.0-components-no-header.csv'\n"}
PLAY RECAP *********************************************************************
localhost                  : ok=177  changed=12   unreachable=0    failed=1    skipped=73   rescued=0    ignored=0

If you take another look in the /mnt/c/temp/cp-deployer-status/cp4d folder, the cpd-4.8.0-components-no-header.csv file IS actually present, even though the task failed with an error saying the file did not exist.

3: Run the command a third time. This time env download succeeds: it downloads all required images and completes.

Possible fix / workaround

Given that the pattern seems to be that the first copy of each file fails and the second succeeds, we can wrap each copy in a block with a rescue section that tries the copy again on failure. This is what the change would look like:

For the first file cpd-4.8.0-components.csv:

- block:
    - name: Copy file to {{ _cp4d_components_file }}
      copy:
        src: /tmp/work/components.csv
        dest: "{{ _cp4d_components_file }}"
        remote_src: True
        force: True
        mode: u+rwx,g+rwx,o+rwx
  # rescue must be at the same indentation level as block
  rescue:
    - name: Rescue copy file to {{ _cp4d_components_file }}
      copy:
        src: /tmp/work/components.csv
        dest: "{{ _cp4d_components_file }}"

For the second file cpd-4.8.0-components-no-header.csv:

- block:
  - name: Copy file to {{ _cp4d_components_no_header_file }}
    copy:
      src: "{{ _cp4d_components_file }}"
      dest: "{{ _cp4d_components_no_header_file }}"
      remote_src: True
  rescue:
    - name: Rescue copy file to {{ _cp4d_components_no_header_file }}
      copy:
        src: "{{ _cp4d_components_file }}"
        dest: "{{ _cp4d_components_no_header_file }}"

With these changes in place, env download runs and completes successfully as expected, and the output shows that both rescue blocks are executed.
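
A possible alternative to the block/rescue approach is to let Ansible retry the same copy task using the standard register / until / retries / delay task keywords. The following is only a sketch and has not been tested against this issue: the variable name _copy_components_result and the retry count and delay values are assumptions chosen for illustration, and it assumes that re-running the identical copy task succeeds in the same way the rescue copy does.

```yaml
# Sketch only: retry the copy instead of duplicating it in a rescue section.
# _copy_components_result is a hypothetical variable name; retries and delay
# are illustrative values, not tuned for WSL2.
- name: Copy file to {{ _cp4d_components_file }}
  copy:
    src: /tmp/work/components.csv
    dest: "{{ _cp4d_components_file }}"
    remote_src: True
    force: True
    mode: u+rwx,g+rwx,o+rwx
  register: _copy_components_result
  # Re-run the task until the copy reports success
  until: _copy_components_result is succeeded
  retries: 3
  # Seconds to wait between attempts
  delay: 2
```

The same keywords could be added to the copy task for the no-header file. Compared with block/rescue, this keeps a single definition of each copy, so the retry always uses the same parameters as the first attempt.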