IBM / k8s-storage-perf

This git repo will host the playbooks for collecting performance metrics for Kubernetes persistent storage for IBM Cloud Paks
Apache License 2.0

Problems launching test in CP4DS cluster #11

Closed cwiering closed 12 months ago

cwiering commented 1 year ago

We seem to have a problem launching the storage test: it fails right at the start, when trying to log in to the OCP cluster. Logging in to the cluster manually works, but the Ansible script fails. (IDs and passwords are suppressed here.)

Manual login:

[root@e1n1 ~]# oc login https://api.localcluster.fbond:6443/ -u kubeadmin -p ... --insecure-skip-tls-verify=true
Login successful.

You have access to 117 projects, the list has been suppressed. You can list all projects with 'oc projects'

Using project "zen".
[root@e1n1 ~]#

Ansible:

[root@e1n1 ~]# run_k8s_storage_perf

PLAY [localhost] ***

TASK [ocp login using creds] ***
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "oc login https://api.localcluster.fbond:6443/ -u kubeadmin -p ... --insecure-skip-tls-verify=true", "delta": "0:00:00.099080", "end": "2023-01-10 17:45:23.952566", "msg": "non-zero return code", "rc": 1, "start": "2023-01-10 17:45:23.853486", "stderr": "error: dial tcp: lookup api.localcluster.fbond on x.x.x.x:53: server misbehaving - verify you have provided the correct host and port and that the server is currently running.", "stderr_lines": ["error: dial tcp: lookup api.localcluster.fbond on x.x.x.x:53: server misbehaving - verify you have provided the correct host and port and that the server is currently running."], "stdout": "", "stdout_lines": []}

PLAY RECAP *****
localhost : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

[root@e1n1 ~]#
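The failure detail sits in the JSON payload that follows `=>` on the `fatal:` line. As a minimal sketch, assuming that line has been captured as a string, the following Python pulls out `rc` and `stderr`. The helper name `parse_fatal` is hypothetical, and the sample is a shortened copy of the output above with most fields omitted.

```python
import json


def parse_fatal(line: str) -> dict:
    """Return the JSON payload that follows '=> ' on an Ansible 'fatal:' line."""
    _, sep, payload = line.partition("=> ")
    if not sep:
        raise ValueError("no '=> ' separator found")
    return json.loads(payload)


# Shortened copy of the fatal line from the run above (most fields omitted).
sample = (
    'fatal: [localhost]: FAILED! => {"changed": true, "msg": "non-zero return code", '
    '"rc": 1, "stderr": "error: dial tcp: lookup api.localcluster.fbond on x.x.x.x:53: '
    'server misbehaving - verify you have provided the correct host and port and that '
    'the server is currently running."}'
)

result = parse_fatal(sample)
print(result["rc"])      # return code of the failed 'oc login' command
print(result["stderr"])  # error text reported by 'oc'
```

Read this way, the `stderr` text shows `oc login` failing at the DNS lookup of `api.localcluster.fbond`, before any credentials are checked.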

We have a "disconnected environment", as Red Hat calls it. It is an 8-node cluster with 3 master and 5 worker nodes. We have CP4DS 2.0.2.1 with CP4D 4.5.3, with ODF storage running on the worker nodes.
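Since the reported error is a failed name lookup, one quick check is whether the API hostname resolves from the environment where the playbook actually runs. The following Python is only a sketch: the helper names `api_host` and `can_resolve` are hypothetical, and it checks name resolution only, not whether port 6443 is reachable.

```python
import socket
from urllib.parse import urlparse


def api_host(url: str) -> str:
    """Return the hostname part of an API URL such as https://host:6443/."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no hostname in {url!r}")
    return host


def can_resolve(host: str) -> bool:
    """Return True if the local resolver yields at least one address for host."""
    try:
        return len(socket.getaddrinfo(host, None)) > 0
    except socket.gaierror:
        # Raised when the name lookup itself fails.
        return False
```

Calling `can_resolve(api_host("https://api.localcluster.fbond:6443/"))` both in the shell where the manual login works and in the place where the playbook executes would show whether the two are using different resolvers.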

cwiering commented 1 year ago

This is what it looked like when the container was running (no port?!):

[root@e1n1 ~]# podman ps -a
CONTAINER ID  IMAGE                                      COMMAND               CREATED       STATUS           PORTS                   NAMES
f3d93dd535c2  docker.io/library/registry:2.7             /etc/docker/regis...  7 weeks ago   Up 7 weeks ago   0.0.0.0:5000->5000/tcp  mirror-registry
02c79883b06c  icr.io/cpopen/cpd/k8s-storage-perf:v1.0.0                        17 hours ago  Up 17 hours ago                          k8s-storage-perf

bxu1999 commented 1 year ago

  1. Is this run performed in an airgap OCP cluster?
  2. Is the run kicked off from a pipeline job?
  3. Can you follow the README file and run it manually to see whether it works?

cwiering commented 1 year ago

  1. "disconnected" = airgap, so YES
  2. Because I'm not 100% sure what a "pipeline job" is, I would say no :o|
  3. Could you point me to the README file, please?

bxu1999 commented 12 months ago

Hi @cwiering, so this is a CPDS/Yosemite cluster, right? As far as I know, those clusters use a special OCP login method, and we are not sure that the OCP login methods this project currently supports will work with it. We have to test it first. Thanks.

bxu1999 commented 12 months ago

Using the token method to log in to OCP on a CPDS cluster should work. I tested it, and it works fine, as shown below.