ansible-collections / ibm_zos_core

Red Hat Ansible Certified Content for IBM Z

[Enabler] [zos_mvs_raw] Explore options towards a solution for the UTF-8 warning for module zos_mvs_raw #1532

Closed ketankelkar closed 3 weeks ago

ketankelkar commented 3 weeks ago

This item is a sub-task for:

I have identified 3 use cases and set up playbooks to replicate those use cases with output going through dd_output, dd_data_set, and dd_unix in order to identify any gaps.

The 3 use cases are:

  1. bpxbatch SH uptime - calls uptime on the system and writes the output to the stdout dd
  2. bpxbatch SH echo abcd - calls 'echo abcd' and writes the output to the stdout dd
  3. cat files - creates tagged and untagged files encoded in ebcdic and utf-8, then cats each one to the stdout dd
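
Use case 2 has the same shape as the uptime playbook shown further down in this issue. A minimal sketch, assuming the same bpxbatch invocation with the stdout DD returned as text; the task names and the register name echo_output are illustrative only:

```yaml
# Sketch of use case 2: run 'echo abcd' under bpxbatch and return stdout.
- name: mvs raw - echo abcd - output to stdout
  zos_mvs_raw:
    verbose: true
    program_name: bpxbatch
    parm: "SH echo abcd"
    dds:
      - dd_output:
          dd_name: stdout
          return_content:
            type: text
  register: echo_output  # illustrative name

- name: print output
  debug:
    var: echo_output
```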

The journey and some notes:

I started off by experimenting with the module to recreate the warning.

Initially, I used the return_content module option, specifically setting its response_encoding sub-option to ibm-1047.

- name: uptime into dd_output
  tags: a, dd_output
  block:
    - name: mvs raw - uptime - output to stdout
      zos_mvs_raw:
        verbose: true
        program_name: bpxbatch
        parm: "SH uptime"
        dds:
        - dd_output:
            dd_name: stdout
            return_content:
              type: text
              response_encoding: ibm-1047
      register: output
    - name: print output
      debug:
        var: output
TASK [print output] ****************************************************************************************************************************************************************************************************************************************************************
[DEPRECATION WARNING]: Non UTF-8 encoded data replaced with "?" while displaying text to stdout/stderr, this is temporary and will become an error. This feature will be removed in version 2.18. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
ok: [ec33013] => {
    "output": {
        "backups": [],
        "changed": true,
        "dd_names": [
            {
                "byte_count": 5,
                "content": [
                    "\u0015?\u0015"
                ],
                "dd_name": "stdout",
                "name": "OMVSADM.P6777481.T0573437.C0000000",
                "record_count": 1
            }
        ],
        "failed": false,
        "ret_code": {
            "code": 0
        }
    }
}

This triggered the warning. While exploring ways around it, I concluded that setting response_encoding to an EBCDIC value should probably be considered a "user error". Treating it that way does not seem to reduce any functionality on the z/OS side, since the DDs are already read and written in the correct encoding to perform the z/OS task before the get_dd_output function is called to return the results to the Ansible controller.
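
For contrast, here is a sketch of the same uptime task with response_encoding set to a code set the controller can display. It assumes the conversion from EBCDIC happens on the z/OS side before the content is returned, as described above. iso8859-1 is the value used by the cat playbook later in this issue, and the register name is illustrative:

```yaml
# Same task as before, but the returned content is requested in iso8859-1
# rather than ibm-1047, so the controller should receive displayable text.
- name: mvs raw - uptime - output to stdout
  zos_mvs_raw:
    verbose: true
    program_name: bpxbatch
    parm: "SH uptime"
    dds:
      - dd_output:
          dd_name: stdout
          return_content:
            type: text
            response_encoding: iso8859-1
  register: readable_output  # illustrative name

- name: print output
  debug:
    var: readable_output
```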

Next, I tried to create output with an arbitrary encoding at the source (on z/OS) that might break the get_dd_output helper function. z/OS data sets containing text are typically encoded in EBCDIC. When the content is binary data (or utf-8, which is treated as binary), it cannot be "read" in ISPF, but for text, cat and dcat seem to retrieve output to the Ansible controller reliably (after passing it through iconv -f ebcdic -t utf8).

For z/OS UNIX files, there are 4 options to consider: tagged and untagged files, each encoded in EBCDIC or UTF-8. I discovered that only the untagged UTF-8-encoded file does not pass through properly to the Ansible controller. There is, however, a work-around: adding -Wfilecodeset=iso8859-1 to the cat command makes it interpret the source explicitly as ISO8859-1, which is the code set the "utf-8" test files below are created in.

- name: Set facts
  tags: always
  ansible.builtin.set_fact:
    some_ds: "{{ ansible_user | upper }}.ANSIBLE.MVSRAW.SOMEDS"

    files_dir: '/home/{{ ansible_user }}/mvsraw/files'
    file_names:
      - "tagged-ibm1047.txt"
      - "tagged-utf8.txt"
      - "untagged-ibm1047.txt"
      - "untagged-utf8.txt"

    data_set_names:
      - "{{ ansible_user | upper }}.ANSIBLE.MVSRAW.TAGGED.IBM1047"
      - "{{ ansible_user | upper }}.ANSIBLE.MVSRAW.UNTAGGED.IBM1047"
      - "{{ ansible_user | upper }}.ANSIBLE.MVSRAW.TAGGED.UTF8"
      - "{{ ansible_user | upper }}.ANSIBLE.MVSRAW.UNTAGGED.UTF8"

- name: set up files.
  tags: always
  ansible.builtin.raw: |
    mkdir -p {{ files_dir }} ;
    echo 'abcd' > {{ files_dir }}/untagged-ibm1047.txt ;
    chtag -r {{ files_dir }}/untagged-ibm1047.txt ;

    cp {{ files_dir }}/untagged-ibm1047.txt {{ files_dir }}/tagged-ibm1047.txt ;
    chtag -tc ibm-1047 {{ files_dir }}/tagged-ibm1047.txt

    iconv -f ibm-1047 -t iso8859-1 {{ files_dir }}/untagged-ibm1047.txt > {{ files_dir }}/untagged-utf8.txt
    chtag -r {{ files_dir }}/untagged-utf8.txt;

    cp {{ files_dir }}/untagged-utf8.txt {{ files_dir }}/tagged-utf8.txt ;
    chtag -tc iso8859-1 {{ files_dir }}/tagged-utf8.txt
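
Before running the cat tasks, it can help to confirm that the four files ended up tagged as intended. A minimal sketch, assuming the z/OS UNIX ls command's -T option (which lists file tag information) is available on the target; the register name tag_listing is illustrative:

```yaml
# List the tag information for each test file. The tagged files would be
# expected to show their code set and the untagged files to show as untagged.
- name: list file tags to confirm the setup
  tags: always
  ansible.builtin.raw: "ls -T {{ files_dir }}"
  register: tag_listing  # illustrative name

- name: print tag listing
  tags: always
  ansible.builtin.debug:
    var: tag_listing.stdout_lines
```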

- name: cat files into dd_output.
  tags: a, dd_output
  block:
    - name: mvs raw - cat files - output to dd_output.
      zos_mvs_raw:
        verbose: true
        program_name: bpxbatch
        parm: "SH cat {{files_dir}}/{{ item }}"
        dds:
        - dd_output:
            dd_name: stdout
            return_content:
              type: text
              response_encoding: "iso8859-1"
      register: output
      loop: "{{ file_names }}"
      loop_control:
        index_var: my_idx
    - name: print only content from dd_names.
      debug: 
        msg: 
          - "string: {{ output.results[my_idx].dd_names[0].content }}"
          # - "string: {{ output.results[my_idx].dd_names[0].content[7] }}"
      loop: "{{ file_names }}"
      loop_control:
        index_var: my_idx
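
The -Wfilecodeset work-around for the untagged utf-8 file could be sketched as a single task along these lines. It reuses the files_dir fact from the setup and the flag exactly as described earlier in this issue; the task names and the register name are illustrative:

```yaml
# Tell cat explicitly which code set the untagged file is in, so the
# content is converted correctly on its way to the stdout DD.
- name: mvs raw - cat untagged utf-8 file with an explicit file code set
  zos_mvs_raw:
    verbose: true
    program_name: bpxbatch
    parm: "SH cat -Wfilecodeset=iso8859-1 {{ files_dir }}/untagged-utf8.txt"
    dds:
      - dd_output:
          dd_name: stdout
          return_content:
            type: text
            response_encoding: "iso8859-1"
  register: workaround_output  # illustrative name

- name: print content returned with the work-around
  debug:
    msg: "string: {{ workaround_output.dd_names[0].content }}"
```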

I observed similar behavior when I used dd_data_set and dd_unix instead of dd_output.
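
For reference, the dd_unix and dd_data_set variants might look roughly like the sketch below. It is not the exact playbook used for these tests. The output path is hypothetical, some_ds is the fact defined in the setup, and allocation details such as space, record format, or file mode are omitted here and may be required on a real system:

```yaml
# Variant 1: route stdout to a z/OS UNIX file and return its content.
- name: mvs raw - uptime - output to dd_unix
  tags: b, dd_unix
  zos_mvs_raw:
    verbose: true
    program_name: bpxbatch
    parm: "SH uptime"
    dds:
      - dd_unix:
          dd_name: stdout
          path: "{{ files_dir }}/stdout.txt"  # hypothetical output path
          return_content:
            type: text
            response_encoding: "iso8859-1"
  register: unix_output  # illustrative name

# Variant 2: route stdout to a sequential data set and return its content.
- name: mvs raw - uptime - output to dd_data_set
  tags: c, dd_data_set
  zos_mvs_raw:
    verbose: true
    program_name: bpxbatch
    parm: "SH uptime"
    dds:
      - dd_data_set:
          dd_name: stdout
          data_set_name: "{{ some_ds }}"
          disposition: new
          reuse: true
          type: seq
          return_content:
            type: text
            response_encoding: "iso8859-1"
  register: ds_output  # illustrative name
```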