ansible-collections / ibm_zos_core

Red Hat Ansible Certified Content for IBM Z

[Bug] [zos_job_submit] Non-printable UTF-8 characters were present in this output #1512

Open gngrossi opened 1 month ago

gngrossi commented 1 month ago

Is there an existing issue for this?

Bug description

"Non-printable UTF-8 characters were present in this output. Please access it from the job log."

IBM z/OS Ansible core Version

v1.10.0-beta.1

IBM Z Open Automation Utilities

v1.3.1

IBM Enterprise Python

v3.12.x

ansible-version

v2.16.x (default)

z/OS version

v3.1 (unsupported)

Ansible module

zos_job_submit

Playbook verbosity output.

(screenshots of the playbook verbosity output attached)

Ansible configuration.

No response

Contents of the inventory

No response

Contents of group_vars or host_vars

No response

ddimatos commented 1 month ago

@gngrossi - thank you for opening an issue. To assist you, we would need more than the message the module returned in the description.

I will assume at this point that you encountered this message and are wondering why.

The issue originated when non-printable UTF-8 characters entered the Python I/O stream: they would cause a UnicodeDecodeError (search our GH issues to see those) and fail the module, even though the job would still submit. Most of these jobs involved machine characters embedded in the job output; compile jobs were another common source of the issue.

To avoid the module failure, we caught the UnicodeDecodeError and inserted the message you are seeing. We then worked with ZOAU to fix this in 1.3.0 and backport it to 1.2.x. At that time I noticed that (only in 1.3.0) the new JSON interface would occasionally fail to escape special characters, causing one of JSONDecodeError, TypeError, or KeyError, so I decided to catch those as well.
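
To make the two failure classes concrete, here is a minimal, self-contained Python illustration (not the module's actual code path): a byte sequence that is not valid UTF-8 fails to decode, and an unescaped control character inside a JSON string is rejected by the parser.

import json

# Illustration only: a byte that cannot complete a UTF-8 sequence raises
# UnicodeDecodeError, the error class the module originally surfaced.
try:
    b"\x00\x15\xdd".decode("utf-8")
except UnicodeDecodeError as err:
    print(f"decode failure: {err}")

# Illustration only: an unescaped control character (0x15) inside a JSON
# string is rejected by the parser, the class of error seen once the newer
# JSON interface was involved.
try:
    json.loads('{"stdout": "line one\x15line two"}')
except json.JSONDecodeError as err:
    print(f"parse failure: {err}")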

The JSONDecodeError (and presumably the TypeError and KeyError) appears to be fixed but not yet delivered by ZOAU; we have not validated this yet.

I think the development team needs to validate JIRA 10456, then update this code to be a bit more verbose and share which type of error we encountered, since there are 4 types being caught here.

If you have ZOAU S&S, you could engage them on the JIRA; otherwise it's not accessible.

This would be a Q3 validation work item for this team. Once we know which version of ZOAU the fix will be in, we can share that, and you could advance on this sooner than Q3 by installing that version of ZOAU.

I assumed your reason for opening an issue; let me know if this is a correct assumption. Ansible team, see internal discussion: archives/C037EFBNPAN/p1711772489760769?thread_ts=1710271347.644869&cid=C037EFBNPAN

gngrossi commented 1 month ago

@ddimatos Thanks for the summary...much appreciated. This is working with collection ibm-ibm_zos_core-1.9.0, zoau 1.2.5.8, and python 3.11.5. I just installed the 1.10 beta and switched to zoau 1.3.1.1 and python 3.12.1. Currently I'm in a POC for zoau and python, re-testing the playbooks previously tested after making the documented changes for the new releases. For now, I'm using pax file installs for zoau and python without support. We are a RH AAP subscriber but have yet to begin testing with it instead of the local ansible engine running on my Vagrant VM. Our production implementation is to start with z/OS 3.1, collection 1.10 with zoau 1.3, and python 3.12 with support.

ddimatos commented 1 month ago

Related to: https://github.com/ansible-collections/ibm_zos_core/pull/1288

ddimatos commented 1 month ago

@gngrossi - you will be fine with ZOAU 1.2.5.6 and onwards for this issue; I linked the issue that represents that work. I expect the next release of ZOAU will have this fix for 1.3.x; I will report back here and use this issue for the validation.

Since you mention z/OS 3.1, do note that we don't plan to test it until the end of this quarter, which goes out with ibm_zos_core version 1.11.0-beta.1. If we don't encounter any issues, we may be able to certify the ibm_zos_core 1.10.0 collection. The concern I have with z/OS 3.1 is the new zsh shell and compatibility with things like tagging pipes, which we have to provide solutions for in the z/OS shell. If we don't encounter any issues, we can probably do a regression on 1.10.0 and pass it when we GA. The team has been pretty occupied this quarter with ZOAU 1.3.0 support and then GDG and special character support, so we have not yet gotten to z/OS 3.1.

Should you want to test it yourself fully, the test I created, which will be committed to our next release, is:

C:

#include <stdio.h>

int main()
{
    /* Generate and print all EBCDIC characters to stdout to
     * ensure non-printable chars can be handled by Python.
     * This will include the non-printables from DBB docs:
     * nl=0x15, cr=0x0D, lf=0x25, shiftOut=0x0E, shiftIn=0x0F
     */
    for (int i = 0; i <= 255; i++) {
        printf("Hex 0x%X is character: (%c)\n", i, (char)(i));
    }

    return 0;
}

Compile it:

export _CEE_RUNOPTS="FILETAG(AUTOCVT,AUTOTAG) POSIX(ON)";
export STEPLIB=CBC.SCCNCMP:$STEPLIB;
xlc -F xlc.cfg -o ebcdic_hex ebcdic_hex.c 

JCL Driver:

//NOEBCDIC    JOB (T043JM,JM00,1,0,0,0),'NOEBCDIC - JRM',
//             MSGCLASS=X,MSGLEVEL=1,NOTIFY=&SYSUID
//NOPRINT  EXEC PGM=BPXBATCH
//STDPARM DD *
SH (
cd /tmp;
./ebcdic_hex;
exit 0;
)
//STDIN  DD DUMMY
//STDOUT DD SYSOUT=*
//STDERR DD SYSOUT=*
// 

This is full spectrum testing for the entire code page.
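
For a quick controller-side companion to that full-spectrum idea, the sketch below (illustrative only, not part of the collection's test suite) walks every single-byte value and reports how many a strict UTF-8 decode rejects, which is exactly the class of bytes that used to break the Python I/O stream.

# Walk all 256 single-byte values and count which ones a strict UTF-8 decode
# rejects. Illustrative only; not part of the collection's test suite.
decodable = []
rejected = []
for value in range(256):
    raw = bytes([value])
    try:
        decodable.append((value, raw.decode("utf-8")))
    except UnicodeDecodeError:
        rejected.append(value)

print(f"{len(decodable)} single-byte values decode as UTF-8, "
      f"{len(rejected)} are rejected (0x{rejected[0]:02X}..0x{rejected[-1]:02X}).")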

gngrossi commented 1 month ago

@ddimatos Looking forward to the updates in June. Regarding 3.1, we are continuing to use /bin/sh as it's a standard for us defined in the RACF OMVS segment. thanks

gngrossi commented 1 month ago

@ddimatos Compiled and ran the code. thanks

ddimatos commented 1 month ago

ZOAU 1.3.2 will contain the fix for this issue. The non-printable UTF-8 chars themselves are not the problem; the issue was how they were escaped in the JSON parser, so the escaping will be fixed. This is a validation work item to ensure ZOAU 1.3.2 actually fixed the issue.
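
As a small illustration of what correct escaping looks like on the Python side (this is not ZOAU's implementation, just the behavior the fix should restore): once a control character is emitted as a \uXXXX escape, the document parses and round-trips cleanly.

import json

# 0x15 is the EBCDIC newline called out in the C test above. Once it is
# emitted as a \u0015 escape, the JSON parses and round-trips cleanly.
# Illustration of the expected behavior only, not ZOAU's implementation.
record = {"stdout": "line one\x15line two"}
escaped = json.dumps(record)          # -> {"stdout": "line one\u0015line two"}
assert json.loads(escaped) == record  # no JSONDecodeError
print(escaped)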

In addition to the validation, we will put in a change before GA'ing 1.10.x such that the statement shares the exception, so we can troubleshoot should this ever happen again.

                            except (UnicodeDecodeError, JSONDecodeError, TypeError, KeyError) as e:
                                tmpcont = (
                                    "Non-printable UTF-8 characters were present in this output. "
                                    "Please access it from the job log."
                                )
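
A condensed, runnable sketch of what such a change could look like (not the shipped module code) would name the exception class in the message:

from json import JSONDecodeError

# Sketch of the more verbose handler: name the exception class so the four
# error types caught here can be told apart. Not the shipped module code.
def describe_parse_failure(exc):
    return (
        "Non-printable UTF-8 characters were present in this output, "
        f"a {exc.__class__.__name__} error has occurred. "
        "Please access the content from the job log."
    )

try:
    raise KeyError("ret_code")  # stand-in for a failed parse of the job output
except (UnicodeDecodeError, JSONDecodeError, TypeError, KeyError) as e:
    tmpcont = describe_parse_failure(e)
    print(tmpcont)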

Lastly, the test case from the 1.9.0 collection will be added with a gate so that it only runs when ZOAU is >= 1.3.2 (a minimal sketch of such a gate follows the test case below).

That test case is below; it's not linked because it is in a staging branch that is currently short lived.

C_SRC_INVALID_UTF8 = """#include <stdio.h>
int main()
{
    /* Generate and print all EBCDIC characters to stdout to
     * ensure non-printable chars can be handled by Python.
     * This will include the non-printable hex from DBB docs:
     * nl=0x15, cr=0x0D, lf=0x25, shiftOut=0x0E, shiftIn=0x0F
     */

    for (int i = 0; i <= 255; i++) {
        printf("Hex 0x%X is character: (%c)\\n",i,(char)(i));
    }

    return 0;
}
"""
# This test case is related to the following GitHub issues:
# - https://github.com/ansible-collections/ibm_zos_core/issues/677
# - https://github.com/ansible-collections/ibm_zos_core/issues/972
# - https://github.com/ansible-collections/ibm_zos_core/issues/1160
# - https://github.com/ansible-collections/ibm_zos_core/issues/1255
def test_zoau_bugfix_invalid_utf8_chars(ansible_zos_module):
    try:
        hosts = ansible_zos_module
        # Copy C source and compile it.
        hosts.all.file(path=TEMP_PATH, state="directory")
        hosts.all.shell(
            cmd="echo {0} > {1}/noprint.c".format(quote(C_SRC_INVALID_UTF8), TEMP_PATH)
        )
        hosts.all.shell(cmd="xlc -o {0}/noprint {0}/noprint.c".format(TEMP_PATH))
        # Create local JCL and submit it.
        tmp_file = tempfile.NamedTemporaryFile(delete=True)
        with open(tmp_file.name, "w") as f:
            f.write(JCL_INVALID_UTF8_CHARS_EXC.format(TEMP_PATH))

        results = hosts.all.zos_job_submit(
            src=tmp_file.name,
            location="LOCAL",
            wait_time_s=15
        )

        for result in results.contacted.values():
            print(result)
            # We shouldn't get an error now that ZOAU handles invalid/unprintable
            # UTF-8 chars correctly.
            assert result.get("jobs")[0].get("ret_code").get("msg_code") == "0000"
            assert result.get("jobs")[0].get("ret_code").get("code") == 0
            assert result.get("changed") is True
    finally:
        hosts.all.file(path=TEMP_PATH, state="absent")
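
The ZOAU version gate mentioned above could look something like the sketch below. get_zoau_version() is a hypothetical stand-in for however the suite reads the ZOAU level on the target, and the real gate in the collection may differ; this only shows the shape of the skip condition.

import pytest

def get_zoau_version():
    # Hypothetical helper: replace with however the suite reads the ZOAU
    # level on the target host.
    return "1.3.2"

def zoau_at_least(current, required="1.3.2"):
    # Compare dotted version strings numerically, e.g. "1.3.10" >= "1.3.2".
    as_tuple = lambda version: tuple(int(part) for part in version.split("."))
    return as_tuple(current) >= as_tuple(required)

@pytest.mark.skipif(
    not zoau_at_least(get_zoau_version()),
    reason="Requires the ZOAU >= 1.3.2 escaping fix for non-printable UTF-8.",
)
def test_gate_placeholder():
    # The real body would be test_zoau_bugfix_invalid_utf8_chars above.
    assert True
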
ddimatos commented 1 month ago

@gngrossi - Hi, I noted above but wanted to point out that ZOAU 1.3.2 will include this fix for the JSON escaping. We plan to validate a development build of 1.3.2 and make the above changes before we GA IBM z/OS core 1.10.0, around the end of this quarter or early next quarter.

If 1.3.2 releases before we get to it, you can also validate this if you wish just by upgrading ZOAU.

ddimatos commented 2 weeks ago

Since this is a bit difficult to recreate as a functional test, I have prototyped the code changes and the message change. Although this will only impact users who are not on the upcoming 1.3.2 or later, we will have it placed in our development branch to absorb any future regressions. json-test.txt

Rename json-test.txt to json-test.py and run it from a terminal; see the output below for all 4 error types caught.

/tmp $: python3 json-test.py
Non-printable UTF-8 characters were present in this output, a JSONDecodeError error has occurred. Please access the content from the job log.
Non-printable UTF-8 characters were present in this output, a TypeError error has occurred. Please access the content from the job log.
Non-printable UTF-8 characters were present in this output, a KeyError error has occurred. Please access the content from the job log.
Non-printable UTF-8 characters were present in this output, a UnicodeDecodeError error has occurred. Please access the content from the job log.

Source attached in file:

import json
from json import JSONDecodeError

# Force a JSONDecodeError (note the missing ':' after "equip_id").
json_decode_error = '[{"name": "Laura Harper","equip_id" "309"}]'

try:
    data = json.loads(json_decode_error)
    print(data)
except UnicodeDecodeError as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)
except (JSONDecodeError, TypeError, KeyError) as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)

# Force a TypeError (sum() over a list that mixes int and str).
json_type_error_data = '{ "numbers": [1, 2, "3"] }'

try:
    data = json.loads(json_type_error_data)
    numbers = data["numbers"]
    total = sum(numbers)
except UnicodeDecodeError as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)
except (JSONDecodeError, TypeError, KeyError) as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)

# Force a KeyError (the parsed document has no "notAkey" key).
json_key_error_data = '{"key": {"status": {"id": 1, "salary": 100}}}'

try:
    data = json.loads(json_key_error_data)
    numbers = data["notAkey"]
except UnicodeDecodeError as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)
except (JSONDecodeError, TypeError, KeyError) as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)

# Force a UnicodeDecodeError.
# ASCII only allows code points 0 to 127; hex F1 (decimal 241) is outside that range.

try:
    print(b"\xF1".decode("ascii"))
except UnicodeDecodeError as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)
except (JSONDecodeError, TypeError, KeyError) as e:
    error_type = e.__class__.__name__
    tmpcont = (
        f"Non-printable UTF-8 characters were present in this output, a {error_type} error has occurred. "
        "Please access the content from the job log."
    )
    print(tmpcont)

gngrossi commented 1 day ago

Using python 3.12.3, zoau 1.3.2.0, ibm_zos_core 1.10.0 and ansible 2.16 with z/OS 3.1, I am still receiving "Non-printable UTF-8 characters were present in this output. Please access it from the job log."


ddimatos commented 2 hours ago

@gngrossi - Hi, thanks for updating. The source attached here and in this PR will merge into IBM z/OS core 1.11.0-beta.1 in the coming month; this will show us exactly what type of error you are seeing. As it is coded today, we can't tell if it's really a UnicodeDecodeError, a JSON KeyError, or a JSON TypeError.

While I am surprised, because I saw the ZOAU test case pass when I originally uncovered this, I don't believe the UnicodeDecodeError is coming through: we back-ported the fix to ZOAU 1.2.x and it is working fine there, and in 1.3.x a JSON parser was introduced that causes either a KeyError or a TypeError. As I mentioned, I saw the test case pass in ZOAU, so I am really interested in knowing what the upcoming 1.11.0-beta.1 will show us, because as it is now the error is being masked by a generic message. The main difference between ZOAU 1.2.x and 1.3.x for this condition is that 1.3.x has a new JSON parser, which is what is triggering your error.

There are some workarounds in the meantime; if you are interested in them, I can point them out.
