IBM / k8s-storage-perf

This git repo will host the playbooks for collecting performance metrics for Kubernetes persistent storage for IBM Cloud Paks
Apache License 2.0

fails to execute with either method #21

Closed elfner closed 1 year ago

elfner commented 1 year ago

Hi - I'm running on a RHEL 8 system, against an OCP 4.12 cluster, and both launch methods fail:

Playbook native

$ ansible-playbook main.yml --extra-vars "@./params.yml"
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

PLAY [localhost] *******************************************************************************************************************

TASK [ocp login using creds] *******************************************************************************************************
skipping: [localhost]

TASK [ocp login using token] *******************************************************************************************************
skipping: [localhost]

TASK [ocp login using apikey] ******************************************************************************************************
skipping: [localhost]

TASK [debug] ***********************************************************************************************************************
skipping: [localhost]

TASK [debug] ***********************************************************************************************************************
skipping: [localhost]

TASK [debug] ***********************************************************************************************************************
skipping: [localhost]

TASK [Storage Performance] *********************************************************************************************************

TASK [storage-perf-test : Setup PVCs to run test on] *******************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4ca8779550>: Failed to establish a new connection: [Errno 111] Connection refused',))"}

PLAY RECAP *************************************************************************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=1    skipped=6    rescued=0    ignored=0   

Playbook in container

$ ${dockerexe} run --name ${container_name} -d -v /tmp/k8s_storage_perf/work-dir:/tmp/work-dir ${docker_image}
6d0eed8856746d554987c9a7c6bd14ca9f5f8329183fd4642b4cfa96ae16d9fe
$ run_k8s_storage_perf

PLAY [localhost] ***************************************************************

TASK [ocp login using creds] ***************************************************
skipping: [localhost]

TASK [ocp login using token] ***************************************************
skipping: [localhost]

TASK [debug] *******************************************************************
skipping: [localhost]

TASK [debug] *******************************************************************
skipping: [localhost]

TASK [Storage Performance] *****************************************************
[DEPRECATION WARNING]: community.kubernetes.k8s has been deprecated. The 
community.kubernetes collection is being renamed to kubernetes.core. Please 
update your FQCNs to kubernetes.core instead. This feature will be removed from
 community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by
 setting deprecation_warnings=False in ansible.cfg.
[DEPRECATION WARNING]: community.kubernetes.k8s_log has been deprecated. The 
community.kubernetes collection is being renamed to kubernetes.core. Please 
update your FQCNs to kubernetes.core instead. This feature will be removed from
 community.kubernetes in version 3.0.0. Deprecation warnings can be disabled by
 setting deprecation_warnings=False in ansible.cfg.

TASK [storage-perf-test : Setup PVCs to run test on] ***************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to load kubeconfig due to Invalid kube-config file. No configuration found."}

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=1    skipped=4    rescued=0    ignored=0
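
Both failures are consistent with the Kubernetes client not picking up a usable cluster configuration: the container run reports "No configuration found", and the native run tries `localhost:80` instead of the cluster API. One thing worth ruling out is whether a kubeconfig file is present at all. The following is a minimal sketch, assuming the conventional lookup order (the `KUBECONFIG` environment variable, then `~/.kube/config`); the function names are hypothetical and are not part of the playbook.

```python
import os
from pathlib import Path


def kubeconfig_candidates(env, home):
    """Return kubeconfig paths in the conventional lookup order (assumed)."""
    raw = env.get("KUBECONFIG", "")
    # KUBECONFIG may list several files separated by the OS path separator.
    paths = [Path(p) for p in raw.split(os.pathsep) if p]
    if not paths:
        paths = [Path(home) / ".kube" / "config"]
    return paths


def usable_kubeconfigs(env, home):
    """Keep only the candidates that exist and are non-empty files."""
    return [p for p in kubeconfig_candidates(env, home)
            if p.is_file() and p.stat().st_size > 0]


if __name__ == "__main__":
    found = usable_kubeconfigs(os.environ, Path.home())
    if found:
        print("kubeconfig candidates:", ", ".join(str(p) for p in found))
    else:
        print("no kubeconfig file found for this user")
```

Run it as the same user, and inside the same container, that runs `ansible-playbook`. A file being present does not prove the module can parse it, so this only narrows the search.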
bxu1999 commented 1 year ago

Hi @elfner did you modify the params.yml file to configure one of the three OCP login methods? From the log segment below,

......
TASK [ocp login using creds] *******************************************************************************************************
skipping: [localhost]

TASK [ocp login using token] *******************************************************************************************************
skipping: [localhost]

TASK [ocp login using apikey] ******************************************************************************************************
skipping: [localhost]
......

All three ocp logins were skipped!

Can you please provide your params.yml contents here? Thanks.
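
The check described above (did any of the login tasks actually run?) can be automated against a saved log such as `output.log`. This is a rough sketch that assumes the default Ansible stdout format shown in this thread; the helper names are made up for illustration.

```python
import re

TASK_RE = re.compile(r"^TASK \[(?P<name>[^\]]+)\]")
RESULT_RE = re.compile(r"^(?P<status>skipping|ok|changed|fatal): \[")


def task_statuses(log_text):
    """Map each task name to the first status line that follows its header.

    Tasks sharing a name (for example several 'debug' tasks) keep only
    their first status, which is enough for the uniquely named login tasks.
    """
    statuses = {}
    current = None
    for line in log_text.splitlines():
        header = TASK_RE.match(line)
        if header:
            current = header.group("name")
            continue
        result = RESULT_RE.match(line)
        if result and current is not None and current not in statuses:
            statuses[current] = result.group("status")
    return statuses


def login_ran(log_text):
    """True if any 'ocp login' task did something other than skip."""
    return any(status != "skipping"
               for name, status in task_statuses(log_text).items()
               if name.startswith("ocp login"))


sample = """TASK [ocp login using creds] ****
skipping: [localhost]

TASK [ocp login using token] ****
skipping: [localhost]

TASK [ocp login using apikey] ****
skipping: [localhost]
"""
print(login_ran(sample))  # False
```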

elfner commented 1 year ago
# OCP Parameters
ocp_url: https://api.[redacted]:6443
ocp_username: <username>  # a cluster admin user
ocp_password: <password>
ocp_token: sha256~cSV6yTt[redacted]HaRl8Es   #<required if user/password not available>

storageClass_ReadWriteOnce:  "ocs-storagecluster-ceph-rbd"
storageClass_ReadWriteMany: "ocs-storagecluster-cephfs"

run_storage_perf: true

arch: amd64  # amd64, ppc64le

############################ STORAGE PERFORMANCE PARAMETERS START #######################
# vars file for roles/storage-perf-test
storage_perf_namespace: storageperf2  # openshift namespace/project where jobs will be executed, it will be created by playbook if not already existing.
logfolder: '.logs'
imageurl: quay.io/ibm-cp4d-public/xsysbench:1.1

cluster_infrastructure: scg_azure # optional labels eg ibmcloud, aws, azure, vmware
cluster_name: client_azure # optional labels 
storage_type: ocs # optional label eg portworx, ocs, <storage vendor>

# To run the performace jobs on a dedicated compute nodes, set the node label which meet the criteria.
# The idea is to gather performance data when the jobs are running remotely from a storage node.
# A cluster administrator can label a node by running this query with appropriate label key/value: 
# oc label node <node name> "<label_key>=<label_value>" --overwrite
dedicated_compute_node:
   label_key: "<optional>"
   label_value: "<optional>"

rwx_storagesize: 10Gi
rwo_storagesize: 10Gi

#sysbench random read
sysbench_random_read: false
rread_threads: 8            # 1,4,8,16
rread_fileTotalSize: 128m
rread_fileNum: 128
rread_fileBlockSize: 4k     # 4k,8k,16k

#sysbench random write
sysbench_random_write: true
rwrite_threads: 8           # 1,4,8,16
rwrite_fileTotalSize: 4096m
rwrite_fileNum: 4
rwrite_fileBlockSize: 4k    # 4k,8k,16k

#sysbench sequential read
sysbench_sequential_read: false
sread_threads: 2            # 1,2
sread_fileTotalSize: 4096m
sread_fileNum: 4
sread_fileBlockSize: 1g     # 512m,1g

#sysbench sequential write
sysbench_sequential_write: true
swrite_threads: 2           # 1,2
swrite_fileTotalSize: 4096m
swrite_fileNum: 4
swrite_fileBlockSize: 1g    # 512m,1g

############################ STORAGE PERFORMANCE PARAMETERS END #########################
bxu1999 commented 1 year ago
  1. First refresh this project on your host. From the contents above, you are using an old version of the project.
  2. Second, for the login section below:
    # OCP Parameters
    ocp_url: https://<required>:6443
    ocp_username: <required>  # a cluster admin user
    ocp_password: <required>
    ocp_token: <required if user/password not available>
    ocp_apikey: <required if neither user/password or token not available>

    If you choose the "ocp_token" method, just modify that line with your token. Do NOT modify any other login lines.

Try the above again.
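
To illustrate why the untouched `<required>` lines matter, here is a small sketch of one way a login method could be selected from params.yml. It is only an assumed model: the playbook's real `when:` conditions are not shown in this thread, and the rule "a value wrapped in angle brackets counts as unset" is a guess made for illustration.

```python
import re

# Assumption: template placeholders look like <required> or <required if ...>.
PLACEHOLDER_RE = re.compile(r"^<.*>$")


def is_configured(value):
    """Treat None, empty strings, and <...> template placeholders as unset."""
    if value is None:
        return False
    text = str(value).strip()
    return bool(text) and not PLACEHOLDER_RE.match(text)


def pick_login_method(params):
    """Return 'creds', 'token', 'apikey', or None, checked in that order."""
    if is_configured(params.get("ocp_username")) and is_configured(params.get("ocp_password")):
        return "creds"
    if is_configured(params.get("ocp_token")):
        return "token"
    if is_configured(params.get("ocp_apikey")):
        return "apikey"
    return None


params = {
    "ocp_username": "<required>",
    "ocp_password": "<required>",
    "ocp_token": "sha256~example",  # hypothetical token value
}
print(pick_login_method(params))  # token
```

Under this model, editing the username or password placeholders would change which method is selected, which is one plausible reason to leave the other login lines alone.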

elfner commented 1 year ago

It still fails in the same manner:

$ cat params.yml
# OCP Parameters
ocp_url: https://api.[redacted]:6443
ocp_username: <required>  # a cluster admin user
ocp_password: <required>
ocp_token: sha256~yfslGQ[redacted]CbH5HAkM

run_storage_readiness: true

############################ STORAGE VALIDATION PARAMETERS START ########################

# REQUIRED PARAMETERS
storageClass_ReadWriteOnce: ocs-storagecluster-ceph-rbd # eg "ocs-storagecluster-ceph-rbd"
storageClass_ReadWriteMany: ocs-storagecluster-cephfs  # eg "ocs-storagecluster-cephfs"
storage_validation_namespace: axtest

# OPTIONAL PARAMETERS
prefix: "readiness"
storageSize: "1Gi"
options: ""
backoffLimit: 5

arch: amd64  # amd64, ppc64le

docker_registry: "quay.io"

############################ STORAGE VALIDATION PARAMETERS END ##########################

$ ansible-playbook main.yml --extra-vars "@./params.yml" | tee output.log
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

PLAY [localhost] ***

TASK [ocp login using creds] ***
skipping: [localhost]

TASK [ocp login using token] ***
changed: [localhost]

TASK [debug] ***
skipping: [localhost]

TASK [debug] ***
ok: [localhost] => { "login_token.stdout_lines": [ "WARNING: Using insecure TLS client config. Setting this option is not supported!", "", "Logged into \"https://api.[redacted]:6443\" as \"aelfner\" using the token provided.", "", "You have access to 82 projects, the list has been suppressed. You can list all projects with 'oc projects'", "", "Using project \"axtest\".", "", "You are accessing a U.S. Government (USG) Information System (IS) that is", "provided for USG-authorized use only. By using this IS (which includes any", "device attached to this IS), you consent to the following conditions: The", "USG routinely intercepts and monitors communications on this IS for purposes", "including, but not limited to, penetration testing, COMSEC monitoring, network", "operations and defense, personnel misconduct (PM), law enforcement (LE), and", "counterintelligence (CI) investigations. At any time, the USG may inspect and", "seize data stored on this IS. Communications using, or data stored on, this IS", "are not private, are subject to routine monitoring, interception, and search,", "and may be disclosed or used for any USG-authorized purpose. This IS includes", "security measures (e.g., authentication and access controls) to protect USG", "interests -- not for your personal benefit or privacy. Notwithstanding the", "above, using this IS does not constitute consent to PM, LE or CI investigative", "searching or monitoring of the content of privileged communications, or", "work product, related to personal representation or services by attorneys,", "psychotherapists, or clergy, and their assistants. Such communications and", "work product are private and confidential. See User Agreement for details." ] }

TASK [Storage Readiness] ***

TASK [storage-readiness : Create namespace axtest if not present] **
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3efc646470>: Failed to establish a new connection: [Errno 111] Connection refused',))"}

PLAY RECAP *****
localhost : ok=2 changed=1 unreachable=0 failed=1 skipped=2 rescued=0 ignored=0

bxu1999 commented 1 year ago

No. Now the OCP login step worked fine as shown below:

...
TASK [debug] *******************************************************************
ok: [localhost] => {
"login_token.stdout_lines": [
"WARNING: Using insecure TLS client config. Setting this option is not supported!",
"",
"Logged into \"https://api.[redacted]:6443\" as \"aelfner\" using the token provided.",
"",
"You have access to 82 projects, the list has been suppressed. You can list all projects with 'oc projects'",
"",
"Using project \"axtest\".",
....

What failed was the namespace creation step, as shown below:

......
TASK [storage-readiness : Create namespace axtest if not present] **************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f3efc646470>: Failed to establish a new connection: [Errno 111] Connection refused',))"}

Try to manually run the namespace creation command to check if your oc client and k8s context are correctly set up on the host.

cat << EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: axtest
spec: {}
EOF

Since axtest already exists, you can use a different namespace name. If this does not work, it means your oc client and k8s context are not set up correctly.

elfner commented 1 year ago

Hi. I created the target namespace ahead of time because it is referenced in the configuration file and the instructions didn't say it should not exist. In any case, removing the namespace and rerunning also fails:

$ ansible-playbook main.yml --extra-vars "@./params.yml" | tee output.log
[WARNING]: provided hosts list is empty, only localhost is available. Note that
the implicit localhost does not match 'all'

PLAY [localhost] ***************************************************************

TASK [ocp login using creds] ***************************************************
skipping: [localhost]

TASK [ocp login using token] ***************************************************
changed: [localhost]

TASK [debug] *******************************************************************
skipping: [localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "login_token.stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.[redacted]:6443\" as \"aelfner\" using the token provided.",
        "",
        "You have access to 85 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"default\".",
        "",
        "You are accessing a U.S. Government (USG) Information System (IS) that is",
        "provided for USG-authorized use only. By using this IS (which includes any",
        "device attached to this IS), you consent to the following conditions: The",
        "USG routinely intercepts and monitors communications on this IS for purposes",
        "including, but not limited to, penetration testing, COMSEC monitoring, network",
        "operations and defense, personnel misconduct (PM), law enforcement (LE), and",
        "counterintelligence (CI) investigations. At any time, the USG may inspect and",
        "seize data stored on this IS. Communications using, or data stored on, this IS",
        "are not private, are subject to routine monitoring, interception, and search,",
        "and may be disclosed or used for any USG-authorized purpose. This IS includes",
        "security measures (e.g., authentication and access controls) to protect USG",
        "interests -- not for your personal benefit or privacy. Notwithstanding the",
        "above, using this IS does not constitute consent to PM, LE or CI investigative",
        "searching or monitoring of the content of privileged communications, or",
        "work product, related to personal representation or services by attorneys,",
        "psychotherapists, or clergy, and their assistants. Such communications and",
        "work product are private and confidential. See User Agreement for details."
    ]
}

TASK [Storage Readiness] *******************************************************

TASK [storage-readiness : Create namespace axtest if not present] **************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f04d939f710>: Failed to establish a new connection: [Errno 111] Connection refused',))"}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0

My id has cluster admin authority, so creating a namespace with your example is not an issue:

$ cat << EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: axtest
spec: {}
EOF

namespace/axtest created
elfner commented 1 year ago

Looks like it's failing somewhere in the Python Ansible module. Running ansible-playbook with '-vvv':

ansible-playbook 2.10.17
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/elfner/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
  executable location = /usr/local/bin/ansible-playbook
  python version = 3.6.8 (default, May 31 2023, 10:28:59) [GCC 8.5.0 20210514 (Red Hat 8.5.0-18)]
Using /etc/ansible/ansible.cfg as config file
host_list declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
script declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
auto declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Parsed /etc/ansible/hosts inventory source with ini plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.

PLAYBOOK: main.yml *************************************************************
1 plays in main.yml

PLAY [localhost] ***************************************************************
META: ran handlers

TASK [ocp login using creds] ***************************************************
task path: /home/elfner/tmp.work/IPS4GRO/customers/[redacted]/[redacted]/[redacted]-storage-testing/k8s-storage-tests/main.yml:8
skipping: [localhost] => {
    "changed": false,
    "skip_reason": "Conditional result was False"
}

TASK [ocp login using token] ***************************************************
task path: /home/elfner/tmp.work/IPS4GRO/customers/[redacted]/[redacted]/[redacted]-storage-testing/k8s-storage-tests/main.yml:13
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: elfner
<127.0.0.1> EXEC /bin/sh -c 'echo ~elfner && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/elfner/.ansible/tmp `"&& mkdir "` echo /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774 `" && echo ansible-tmp-1696283787.7559903-596614-114520488674774="` echo /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774 `" ) && sleep 0'
Using module file /usr/local/lib/python3.6/site-packages/ansible/modules/command.py
<127.0.0.1> PUT /home/elfner/.ansible/tmp/ansible-local-596605nq2fsp_8/tmpyus6l5h9 TO /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774/AnsiballZ_command.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774/ /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774/AnsiballZ_command.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3.6 /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774/AnsiballZ_command.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /home/elfner/.ansible/tmp/ansible-tmp-1696283787.7559903-596614-114520488674774/ > /dev/null 2>&1 && sleep 0'
changed: [localhost] => {
    "changed": true,
    "cmd": "oc login --server=https://api.zocp[redacted]001.[redacted].tenant:6443 --token=sha256~psYOMjbqsw3--[redacted] --insecure-skip-tls-verify=true",
    "delta": "0:00:01.259278",
    "end": "2023-10-02 14:56:29.223941",
    "invocation": {
        "module_args": {
            "_raw_params": "oc login --server=https://api.zocp[redacted]001.[redacted].tenant:6443 --token=sha256~psYOMjbqsw3--[redacted] --insecure-skip-tls-verify=true",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true,
            "warn": true
        }
    },
    "msg": "",
    "rc": 0,
    "start": "2023-10-02 14:56:27.964663",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "WARNING: Using insecure TLS client config. Setting this option is not supported!\n\nLogged into \"https://api.zocp[redacted]001.[redacted].tenant:6443\" as \"aelfner\" using the token provided.\n\nYou have access to 88 projects, the list has been suppressed. You can list all projects with 'oc projects'\n\nUsing project \"openshift-network-diagnostics\".\n\nYou are accessing a U.S. Government (USG) Information System (IS) that is\nprovided for USG-authorized use only. By using this IS (which includes any\ndevice attached to this IS), you consent to the following conditions: The\nUSG routinely intercepts and monitors communications on this IS for purposes\nincluding, but not limited to, penetration testing, COMSEC monitoring, network\noperations and defense, personnel misconduct (PM), law enforcement (LE), and\ncounterintelligence (CI) investigations. At any time, the USG may inspect and\nseize data stored on this IS. Communications using, or data stored on, this IS\nare not private, are subject to routine monitoring, interception, and search,\nand may be disclosed or used for any USG-authorized purpose. This IS includes\nsecurity measures (e.g., authentication and access controls) to protect USG\ninterests -- not for your personal benefit or privacy. Notwithstanding the\nabove, using this IS does not constitute consent to PM, LE or CI investigative\nsearching or monitoring of the content of privileged communications, or\nwork product, related to personal representation or services by attorneys,\npsychotherapists, or clergy, and their assistants. Such communications and\nwork product are private and confidential. See User Agreement for details.",
    "stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.zocp[redacted]001.[redacted].tenant:6443\" as \"aelfner\" using the token provided.",
        "",
        "You have access to 88 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"openshift-network-diagnostics\".",
        "",
        "You are accessing a U.S. Government (USG) Information System (IS) that is",
        "provided for USG-authorized use only. By using this IS (which includes any",
        "device attached to this IS), you consent to the following conditions: The",
        "USG routinely intercepts and monitors communications on this IS for purposes",
        "including, but not limited to, penetration testing, COMSEC monitoring, network",
        "operations and defense, personnel misconduct (PM), law enforcement (LE), and",
        "counterintelligence (CI) investigations. At any time, the USG may inspect and",
        "seize data stored on this IS. Communications using, or data stored on, this IS",
        "are not private, are subject to routine monitoring, interception, and search,",
        "and may be disclosed or used for any USG-authorized purpose. This IS includes",
        "security measures (e.g., authentication and access controls) to protect USG",
        "interests -- not for your personal benefit or privacy. Notwithstanding the",
        "above, using this IS does not constitute consent to PM, LE or CI investigative",
        "searching or monitoring of the content of privileged communications, or",
        "work product, related to personal representation or services by attorneys,",
        "psychotherapists, or clergy, and their assistants. Such communications and",
        "work product are private and confidential. See User Agreement for details."
    ]
}

TASK [debug] *******************************************************************
task path: /home/elfner/tmp.work/IPS4GRO/customers/[redacted]/[redacted]/[redacted]-storage-testing/k8s-storage-tests/main.yml:18
skipping: [localhost] => {}

TASK [debug] *******************************************************************
task path: /home/elfner/tmp.work/IPS4GRO/customers/[redacted]/[redacted]/[redacted]-storage-testing/k8s-storage-tests/main.yml:22
ok: [localhost] => {
    "login_token.stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.zocp[redacted]001.[redacted].tenant:6443\" as \"aelfner\" using the token provided.",
        "",
        "You have access to 88 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"openshift-network-diagnostics\".",
        "",
        "You are accessing a U.S. Government (USG) Information System (IS) that is",
        "provided for USG-authorized use only. By using this IS (which includes any",
        "device attached to this IS), you consent to the following conditions: The",
        "USG routinely intercepts and monitors communications on this IS for purposes",
        "including, but not limited to, penetration testing, COMSEC monitoring, network",
        "operations and defense, personnel misconduct (PM), law enforcement (LE), and",
        "counterintelligence (CI) investigations. At any time, the USG may inspect and",
        "seize data stored on this IS. Communications using, or data stored on, this IS",
        "are not private, are subject to routine monitoring, interception, and search,",
        "and may be disclosed or used for any USG-authorized purpose. This IS includes",
        "security measures (e.g., authentication and access controls) to protect USG",
        "interests -- not for your personal benefit or privacy. Notwithstanding the",
        "above, using this IS does not constitute consent to PM, LE or CI investigative",
        "searching or monitoring of the content of privileged communications, or",
        "work product, related to personal representation or services by attorneys,",
        "psychotherapists, or clergy, and their assistants. Such communications and",
        "work product are private and confidential. See User Agreement for details."
    ]
}

TASK [Storage Readiness] *******************************************************
task path: /home/elfner/tmp.work/IPS4GRO/customers/[redacted]/[redacted]/[redacted]-storage-testing/k8s-storage-tests/main.yml:27
redirecting (type: modules) ansible.builtin.k8s to community.kubernetes.k8s

TASK [storage-readiness : Create namespace axtest if not present] **************
task path: /home/elfner/tmp.work/IPS4GRO/customers/[redacted]/[redacted]/[redacted]-storage-testing/k8s-storage-tests/roles/storage-readiness/tasks/main.yml:1
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: elfner
<127.0.0.1> EXEC /bin/sh -c 'echo ~elfner && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/elfner/.ansible/tmp `"&& mkdir "` echo /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228 `" && echo ansible-tmp-1696283789.3888855-596652-7944568623228="` echo /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228 `" ) && sleep 0'
redirecting (type: modules) ansible.builtin.k8s to community.kubernetes.k8s
Using module file /usr/local/lib/python3.6/site-packages/ansible_collections/community/kubernetes/plugins/modules/k8s.py
<127.0.0.1> PUT /home/elfner/.ansible/tmp/ansible-local-596605nq2fsp_8/tmp27racymx TO /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228/AnsiballZ_k8s.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228/ /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228/AnsiballZ_k8s.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3.6 /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228/AnsiballZ_k8s.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /home/elfner/.ansible/tmp/ansible-tmp-1696283789.3888855-596652-7944568623228/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
  File "/tmp/ansible_k8s_payload_cb6ancce/ansible_k8s_payload.zip/ansible_collections/community/kubernetes/plugins/module_utils/common.py", line 265, in get_api_client
    return DynamicClient(kubernetes.client.ApiClient(configuration))
  File "/usr/local/lib/python3.6/site-packages/openshift/dynamic/client.py", line 40, in __init__
    K8sDynamicClient.__init__(self, client, cache_file=cache_file, discoverer=discoverer)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/dynamic/client.py", line 84, in __init__
    self.__discoverer = discoverer(self, cache_file)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/dynamic/discovery.py", line 228, in __init__
    Discoverer.__init__(self, client, cache_file)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/dynamic/discovery.py", line 54, in __init__
    self.__init_cache()
  File "/usr/local/lib/python3.6/site-packages/kubernetes/dynamic/discovery.py", line 70, in __init_cache
    self._load_server_info()
  File "/usr/local/lib/python3.6/site-packages/openshift/dynamic/discovery.py", line 98, in _load_server_info
    'kubernetes': self.client.request('get', '/version', serializer=just_json)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/dynamic/client.py", line 55, in inner
    resp = func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/dynamic/client.py", line 283, in request
    _request_timeout=params.get('_request_timeout')
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 353, in call_api
    _preload_content, _request_timeout, _host)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 184, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 377, in request
    headers=headers)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 248, in GET
    query_params=query_params)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/client/rest.py", line 221, in request
    headers=headers)
  File "/usr/lib/python3.6/site-packages/urllib3/request.py", line 68, in request
    **urlopen_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/request.py", line 89, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/poolmanager.py", line 324, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen
    **response_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen
    **response_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 667, in urlopen
    **response_kw)
  File "/usr/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3.6/site-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "append_hash": false,
            "apply": false,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "force": false,
            "host": null,
            "kind": "Namespace",
            "kubeconfig": null,
            "merge_type": null,
            "name": "axtest",
            "namespace": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "resource_definition": null,
            "src": null,
            "state": "present",
            "template": null,
            "username": null,
            "validate": null,
            "validate_certs": null,
            "wait": false,
            "wait_condition": null,
            "wait_sleep": 5,
            "wait_timeout": 120
        }
    },
    "msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1ee392c400>: Failed to establish a new connection: [Errno 111] Connection refused',))"
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0  
bxu1999 commented 1 year ago

Creating the namespace first is not a problem: if it already exists, the task simply returns a message saying that the namespace already exists. The real issue is that, when ansible-playbook runs, it can't get a client to run the command. Also, why is it using port 80?

...
"msg": "Failed to get client due to HTTPConnectionPool(host='localhost', port=80): Max retries exceeded with url: /version (Caused by NewConnectionError
...
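
On the port 80 question: one plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the Kubernetes Python client found no kubeconfig for the user running the playbook and fell back to its default host, a plain `http://localhost` address. The sketch below uses only the standard library to list the kubeconfig paths that are normally consulted (the `KUBECONFIG` environment variable, otherwise `~/.kube/config`) and report which ones exist. The function name `candidate_kubeconfigs` is hypothetical.

```python
import os
from pathlib import Path


def candidate_kubeconfigs(env=None, home=None):
    """Return (path, exists) pairs for the kubeconfig files usually consulted."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    raw = env.get("KUBECONFIG", "")
    # KUBECONFIG may hold several paths separated by os.pathsep (":" on Linux).
    paths = [Path(p) for p in raw.split(os.pathsep) if p]
    if not paths:
        # Fall back to the conventional per-user location.
        paths = [home / ".kube" / "config"]
    return [(str(p), p.is_file()) for p in paths]


if __name__ == "__main__":
    for path, exists in candidate_kubeconfigs():
        print(f"{path}: {'found' if exists else 'MISSING'}")
```

Running it as the same user that runs ansible-playbook shows whether that user has a kubeconfig at all; a missing file would be consistent with the `localhost` fallback, though it does not prove it.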

Can you check the following?

The step that failed creates two PVCs. Does the user elfner have permission to do that? You can apply the following manifest manually to verify (replace the template variables with the values from the params.yml file):

---
apiVersion: v1
kind: Namespace
metadata:
  name: {{ storage_perf_namespace }}
spec: {}
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-sysbench-rwx
  namespace: "{{ storage_perf_namespace }}"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: {{ rwx_storagesize }}
  storageClassName: {{ storageClass_ReadWriteMany }}
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-sysbench-rwo
  namespace: "{{ storage_perf_namespace }}"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: {{ rwo_storagesize }}
  storageClassName: {{ storageClass_ReadWriteOnce }}
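
The substitution step can be sketched as follows. This is a minimal illustration, assuming the values have already been copied from params.yml into a dictionary; the helper name `render` and the sample values are hypothetical, and the regex replacement is a simple stand-in for real Jinja2 templating, not what the playbook itself does.

```python
import re

# Matches placeholders such as {{ storage_perf_namespace }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder with values[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            # Fail loudly instead of leaving a placeholder in the manifest.
            raise KeyError(f"no value supplied for {name!r}")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Sample values only; take the real ones from params.yml.
    sample = {
        "storage_perf_namespace": "perf-test",
        "rwo_storagesize": "10Gi",
        "storageClass_ReadWriteOnce": "example-rwo",
    }
    snippet = (
        'namespace: "{{ storage_perf_namespace }}"\n'
        "storage: {{ rwo_storagesize }}\n"
        "storageClassName: {{ storageClass_ReadWriteOnce }}\n"
    )
    print(render(snippet, sample))
```

The rendered text can then be saved to a file and applied with `oc apply -f <file>` as the user in question; a permission error at that point would answer the question above.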
elfner commented 1 year ago

oc client version 4.12.32 (at /usr/local/bin)
Python 3.6.8
pip 21.3.1

I'm running from a RHEL 8.8 laptop, connecting to the cluster via the oc client. My laptop is not part of the remote cluster.

bxu1999 commented 1 year ago

Okay, I believe you submitted this issue to the wrong project. This project is for performance testing, but from your logs you were running the storage readiness tests, right? If so, we are talking about two different projects.

So, which project are you actually working with?

https://github.com/IBM/k8s-storage-tests # for storage readiness tests
https://github.com/IBM/k8s-storage-perf # for storage performance

elfner commented 1 year ago

Yes, you are correct - I'm trying to run the performance tests. I apologize for my mistake.

bxu1999 commented 1 year ago
  1. Please run a perf test with the -vvv option and attach the logs.
  2. I noticed one difference from our runs. In our runs, we have:

    <127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root

    In your run:

    <127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: elfner

So can you sudo to root on your RHEL host and then run the test suite again?
elfner commented 1 year ago

As root, it seems it can't find the kubernetes Python library:

ansible-playbook [core 2.14.2]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.11/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /bin/ansible-playbook
  python version = 3.11.2 (main, Jun  6 2023, 07:39:01) [GCC 8.5.0 20210514 (Red Hat 8.5.0-18)] (/usr/bin/python3.11)
  jinja version = 3.1.2
  libyaml = True
Using /etc/ansible/ansible.cfg as config file
host_list declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
script declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
auto declined parsing /etc/ansible/hosts as it did not pass its verify_file() method
Parsed /etc/ansible/hosts inventory source with ini plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.

PLAYBOOK: main.yml *************************************************************
1 plays in main.yml

PLAY [localhost] ***************************************************************

TASK [ocp login using creds] ***************************************************
task path: /home/elfner/k8s-storage-tests/main.yml:8
skipping: [localhost] => {
    "changed": false,
    "skip_reason": "Conditional result was False"
}

TASK [ocp login using token] ***************************************************
task path: /home/elfner/k8s-storage-tests/main.yml:13
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603 `" && echo ansible-tmp-1696294049.3557808-619274-150561043835603="` echo /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603 `" ) && sleep 0'
Using module file /usr/lib/python3.11/site-packages/ansible/modules/command.py
<127.0.0.1> PUT /root/.ansible/tmp/ansible-local-619267jazixy8i/tmppvi6ilvd TO /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603/AnsiballZ_command.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603/ /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603/AnsiballZ_command.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3.11 /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603/AnsiballZ_command.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1696294049.3557808-619274-150561043835603/ > /dev/null 2>&1 && sleep 0'
changed: [localhost] => {
    "changed": true,
    "cmd": "oc login --server=https://api.[redacted]:6443 --token=sha256~YJOyx_pmbuqrCWd13RehreK86hxky8tni8rZEl5vc_c --insecure-skip-tls-verify=true",
    "delta": "0:00:01.183219",
    "end": "2023-10-02 17:47:30.724897",
    "invocation": {
        "module_args": {
            "_raw_params": "oc login --server=https://api.[redacted]:6443 --token=sha256~YJOyx_pmbuqrCWd13RehreK86hxky8tni8rZEl5vc_c --insecure-skip-tls-verify=true",
            "_uses_shell": true,
            "argv": null,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "stdin_add_newline": true,
            "strip_empty_ends": true
        }
    },
    "msg": "",
    "rc": 0,
    "start": "2023-10-02 17:47:29.541678",
    "stderr": "",
    "stderr_lines": [],
    "stdout": "WARNING: Using insecure TLS client config. Setting this option is not supported!\n\nLogged into \"https://api.[redacted]:6443\" as \"aelfner\" using the token provided.\n\nYou have access to 88 projects, the list has been suppressed. You can list all projects with 'oc projects'\n\nUsing project \"default\".\n\nYou are accessing a U.S. Government (USG) Information System (IS) that is\nprovided for USG-authorized use only. By using this IS (which includes any\ndevice attached to this IS), you consent to the following conditions: The\nUSG routinely intercepts and monitors communications on this IS for purposes\nincluding, but not limited to, penetration testing, COMSEC monitoring, network\noperations and defense, personnel misconduct (PM), law enforcement (LE), and\ncounterintelligence (CI) investigations. At any time, the USG may inspect and\nseize data stored on this IS. Communications using, or data stored on, this IS\nare not private, are subject to routine monitoring, interception, and search,\nand may be disclosed or used for any USG-authorized purpose. This IS includes\nsecurity measures (e.g., authentication and access controls) to protect USG\ninterests -- not for your personal benefit or privacy. Notwithstanding the\nabove, using this IS does not constitute consent to PM, LE or CI investigative\nsearching or monitoring of the content of privileged communications, or\nwork product, related to personal representation or services by attorneys,\npsychotherapists, or clergy, and their assistants. Such communications and\nwork product are private and confidential. See User Agreement for details.\nWelcome! See 'oc help' to get started.",
    "stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.[redacted]:6443\" as \"aelfner\" using the token provided.",
        "",
        "You have access to 88 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"default\".",
        "",
        "You are accessing a U.S. Government (USG) Information System (IS) that is",
        "provided for USG-authorized use only. By using this IS (which includes any",
        "device attached to this IS), you consent to the following conditions: The",
        "USG routinely intercepts and monitors communications on this IS for purposes",
        "including, but not limited to, penetration testing, COMSEC monitoring, network",
        "operations and defense, personnel misconduct (PM), law enforcement (LE), and",
        "counterintelligence (CI) investigations. At any time, the USG may inspect and",
        "seize data stored on this IS. Communications using, or data stored on, this IS",
        "are not private, are subject to routine monitoring, interception, and search,",
        "and may be disclosed or used for any USG-authorized purpose. This IS includes",
        "security measures (e.g., authentication and access controls) to protect USG",
        "interests -- not for your personal benefit or privacy. Notwithstanding the",
        "above, using this IS does not constitute consent to PM, LE or CI investigative",
        "searching or monitoring of the content of privileged communications, or",
        "work product, related to personal representation or services by attorneys,",
        "psychotherapists, or clergy, and their assistants. Such communications and",
        "work product are private and confidential. See User Agreement for details.",
        "Welcome! See 'oc help' to get started."
    ]
}

TASK [debug] *******************************************************************
task path: /home/elfner/k8s-storage-tests/main.yml:18
skipping: [localhost] => {}

TASK [debug] *******************************************************************
task path: /home/elfner/k8s-storage-tests/main.yml:22
ok: [localhost] => {
    "login_token.stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.[redacted]:6443\" as \"aelfner\" using the token provided.",
        "",
        "You have access to 88 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"default\".",
        "",
        "You are accessing a U.S. Government (USG) Information System (IS) that is",
        "provided for USG-authorized use only. By using this IS (which includes any",
        "device attached to this IS), you consent to the following conditions: The",
        "USG routinely intercepts and monitors communications on this IS for purposes",
        "including, but not limited to, penetration testing, COMSEC monitoring, network",
        "operations and defense, personnel misconduct (PM), law enforcement (LE), and",
        "counterintelligence (CI) investigations. At any time, the USG may inspect and",
        "seize data stored on this IS. Communications using, or data stored on, this IS",
        "are not private, are subject to routine monitoring, interception, and search,",
        "and may be disclosed or used for any USG-authorized purpose. This IS includes",
        "security measures (e.g., authentication and access controls) to protect USG",
        "interests -- not for your personal benefit or privacy. Notwithstanding the",
        "above, using this IS does not constitute consent to PM, LE or CI investigative",
        "searching or monitoring of the content of privileged communications, or",
        "work product, related to personal representation or services by attorneys,",
        "psychotherapists, or clergy, and their assistants. Such communications and",
        "work product are private and confidential. See User Agreement for details.",
        "Welcome! See 'oc help' to get started."
    ]
}

TASK [Storage Readiness] *******************************************************
task path: /home/elfner/k8s-storage-tests/main.yml:27
redirecting (type: modules) ansible.builtin.k8s to kubernetes.core.k8s

TASK [storage-readiness : Create namespace axtest if not present] **************
task path: /home/elfner/k8s-storage-tests/roles/storage-readiness/tasks/main.yml:1
redirecting (type: modules) ansible.builtin.k8s to kubernetes.core.k8s
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197 `" && echo ansible-tmp-1696294050.861668-619308-115447879420197="` echo /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197 `" ) && sleep 0'
redirecting (type: modules) ansible.builtin.k8s to kubernetes.core.k8s
Using module file /root/.ansible/collections/ansible_collections/kubernetes/core/plugins/modules/k8s.py
<127.0.0.1> PUT /root/.ansible/tmp/ansible-local-619267jazixy8i/tmp2szvdg6k TO /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197/AnsiballZ_k8s.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197/ /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197/AnsiballZ_k8s.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3.11 /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197/AnsiballZ_k8s.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1696294050.861668-619308-115447879420197/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
  File "/tmp/ansible_k8s_payload_yrfrpe85/ansible_k8s_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/core.py", line 107, in requires
    requires(dependency, minimum, reason=reason)
  File "/tmp/ansible_k8s_payload_yrfrpe85/ansible_k8s_payload.zip/ansible_collections/kubernetes/core/plugins/module_utils/k8s/core.py", line 175, in requires
    raise Exception(missing_required_lib(lib, reason=reason))
fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "api_version": "v1",
            "append_hash": false,
            "apply": false,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "context": null,
            "continue_on_error": false,
            "delete_options": null,
            "force": false,
            "generate_name": null,
            "host": null,
            "impersonate_groups": null,
            "impersonate_user": null,
            "kind": "Namespace",
            "kubeconfig": null,
            "label_selectors": null,
            "merge_type": null,
            "name": "axtest",
            "namespace": null,
            "no_proxy": null,
            "password": null,
            "persist_config": null,
            "proxy": null,
            "proxy_headers": null,
            "resource_definition": null,
            "server_side_apply": null,
            "src": null,
            "state": "present",
            "template": null,
            "username": null,
            "validate": null,
            "validate_certs": null,
            "wait": false,
            "wait_condition": null,
            "wait_sleep": 5,
            "wait_timeout": 120
        }
    },
    "msg": "Failed to import the required Python library (kubernetes) on li-6bb8b34c-2459-11b2-a85c-9fced5441c5b.ibm.com's Python /usr/bin/python3.11. Please read the module documentation and install it in the appropriate location. If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter"
}

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0   
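
The error message above names `/usr/bin/python3.11` as the interpreter Ansible used, while the earlier traceback shows the kubernetes package under a Python 3.6 site-packages directory. A quick way to confirm which interpreter can actually import the library is sketched below. The helper name `can_import` is hypothetical, and `/usr/bin/python3.6` is an assumed path for the older interpreter; adjust the list to the interpreters present on the host.

```python
import subprocess
import sys


def can_import(interpreter: str, module: str = "kubernetes") -> bool:
    """Return True if `interpreter` can import `module`."""
    try:
        result = subprocess.run(
            [interpreter, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter is missing or not executable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # /usr/bin/python3.11 comes from the log; the 3.6 path is an assumption.
    for interp in ["/usr/bin/python3.11", "/usr/bin/python3.6", sys.executable]:
        print(f"{interp}: kubernetes importable = {can_import(interp)}")
```

If only the older interpreter reports `True`, two options follow from the error text: install the library for the interpreter Ansible uses (for example `python3.11 -m pip install kubernetes`), or point Ansible at the other interpreter through `ansible_python_interpreter`.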
bxu1999 commented 1 year ago

Okay. After "sudo" to root, follow the README to perform all the steps.

bxu1999 commented 1 year ago

I figured out the root cause of this problem, and the solution is to update the overall project to use the latest Ansible collection - kubernetes.core.k8s.

Need some time to complete the updates as well as testing and validation. Thanks.

bxu1999 commented 1 year ago

Okay. With https://github.com/IBM/k8s-storage-perf/issues/24 resolved, this issue is fixed. We have tested running the perf tests on a RHEL client host, and it works fine now.

Please refresh the project and try again.