red-hat-storage / ocs-ci

https://ocs-ci.readthedocs.io/en/latest/
MIT License

test_rgw_host_node_failure failing on IBM Power #8369

Open Pooja-Soni78 opened 1 year ago

Pooja-Soni78 commented 1 year ago

tests/manage/rgw/test_host_node_failure.py::TestRGWAndNoobaaDBHostNodeFailure::test_rgw_host_node_failure fails on IBM Power with the following error:

Error is Error from server (NotFound): pods "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf" not found

Pooja-Soni78 commented 1 year ago
cmd = ['oc', '--kubeconfig', '/root/openstack-upi/auth/kubeconfig', '-n', 'openshift-storage', 'describe', ...]
secrets = None, timeout = 600, ignore_error = False, threading_lock = None
silent = False, use_shell = False
cluster_config = <ocs_ci.framework.MultiClusterConfig object at 0x7fffa54a6a90>
kwargs = {}
masked_cmd = 'oc -n openshift-storage describe Pod rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf'
kubepath = '/root/openstack-upi/auth/kubeconfig'
completed_process = CompletedProcess(args=['oc', '--kubeconfig', '/root/openstack-upi/auth/kubeconfig', '-n', 'openshift-storage', 'descri...rr=b'Error from server (NotFound): pods "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf" not found\n')
masked_stdout = ''
masked_stderr = 'Error from server (NotFound): pods "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf" not found\n'

    def exec_cmd(
        cmd,
        secrets=None,
        timeout=600,
        ignore_error=False,
        threading_lock=None,
        silent=False,
        use_shell=False,
        cluster_config=None,
        **kwargs,
    ):
        """
        Run an arbitrary command locally

        If the command is grep and matching pattern is not found, then this function
        returns "command terminated with exit code 1" in stderr.

        Args:
            cmd (str): command to run
            secrets (list): A list of secrets to be masked with asterisks
                This kwarg is popped in order to not interfere with
                subprocess.run(``**kwargs``)
            timeout (int): Timeout for the command, defaults to 600 seconds.
            ignore_error (bool): True if ignore non zero return code and do not
                raise the exception.
            threading_lock (threading.Lock): threading.Lock object that is used
                for handling concurrent oc commands
            silent (bool): If True will silent errors from the server, default false
            use_shell (bool): If True will pass the cmd without splitting
            cluster_config (MultiClusterConfig): In case of multicluster environment this object
                    will be non-null

        Raises:
            CommandFailed: In case the command execution fails

        Returns:
            (CompletedProcess) A CompletedProcess object of the command that was executed
            CompletedProcess attributes:
            args: The list or str args passed to run().
            returncode (str): The exit code of the process, negative for signals.
            stdout     (str): The standard output (None if not captured).
            stderr     (str): The standard error (None if not captured).

        """
        masked_cmd = mask_secrets(cmd, secrets)
        log.info(f"Executing command: {masked_cmd}")
        if isinstance(cmd, str) and not kwargs.get("shell"):
            cmd = shlex.split(cmd)
        if cluster_config and cmd[0] == "oc" and "--kubeconfig" not in cmd:
            kubepath = cluster_config.RUN["kubeconfig"]
            cmd = list_insert_at_position(cmd, 1, ["--kubeconfig"])
            cmd = list_insert_at_position(cmd, 2, [kubepath])
        if threading_lock and cmd[0] == "oc":
            threading_lock.acquire()
        completed_process = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            stdin=subprocess.PIPE,
            timeout=timeout,
            **kwargs,
        )
        if threading_lock and cmd[0] == "oc":
            threading_lock.release()
        masked_stdout = mask_secrets(completed_process.stdout.decode(), secrets)
        if len(completed_process.stdout) > 0:
            log.debug(f"Command stdout: {masked_stdout}")
        else:
            log.debug("Command stdout is empty")

        masked_stderr = mask_secrets(completed_process.stderr.decode(), secrets)
        if len(completed_process.stderr) > 0:
            if not silent:
                log.warning(f"Command stderr: {masked_stderr}")
        else:
            log.debug("Command stderr is empty")
        log.debug(f"Command return code: {completed_process.returncode}")
        if completed_process.returncode and not ignore_error:
            if (
                "grep" in masked_cmd
                and b"command terminated with exit code 1" in completed_process.stderr
            ):
                log.info(f"No results found for grep command: {masked_cmd}")
            else:
>               raise CommandFailed(
                    f"Error during execution of command: {masked_cmd}."
                    f"\nError is {masked_stderr}"
                )
E               ocs_ci.ocs.exceptions.CommandFailed: Error during execution of command: oc -n openshift-storage describe Pod rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf.
E               Error is Error from server (NotFound): pods "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf" not found

ocs_ci/utility/utils.py:659: CommandFailed
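
For readers following the traceback, the final branch of `exec_cmd` can be restated as a small predicate. This is only a sketch of the logic quoted above: `should_raise` is a hypothetical helper name that does not exist in ocs-ci, and it omits the logging and secret masking.

```python
def should_raise(returncode: int, ignore_error: bool, masked_cmd: str, stderr: bytes) -> bool:
    """Hypothetical helper mirroring the error branch of exec_cmd quoted above."""
    # A zero return code, or ignore_error=True, never raises.
    if not returncode or ignore_error:
        return False
    # A grep that found nothing is logged as "no results" instead of raising.
    grep_no_match = (
        "grep" in masked_cmd
        and b"command terminated with exit code 1" in stderr
    )
    return not grep_no_match


# Values taken from the locals shown in the traceback.
cmd = (
    "oc -n openshift-storage describe Pod "
    "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf"
)
err = (
    b'Error from server (NotFound): pods '
    b'"rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf" not found\n'
)
raised = should_raise(1, False, cmd, err)
tolerated = should_raise(1, True, cmd, err)
print(raised, tolerated)
```

With the values from this run (`ignore_error = False`, a non-zero exit from `oc describe`, no `grep` in the command), the predicate is true, which matches the `CommandFailed` raised at `utils.py:659`.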
=============================== warnings summary ===============================
tests/manage/rgw/test_host_node_failure.py::TestRGWAndNoobaaDBHostNodeFailure::test_rgw_host_node_failure
tests/manage/rgw/test_host_node_failure.py::TestRGWAndNoobaaDBHostNodeFailure::test_rgw_host_node_failure
  /home/pooja/venv/lib64/python3.9/site-packages/urllib3/connectionpool.py:981: InsecureRequestWarning: Unverified HTTPS request is being made to host 'prometheus-k8s-openshift-monitoring.apps.pooj-414-odf.ibm.com'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=========================== short test summary info ============================
FAILED tests/manage/rgw/test_host_node_failure.py::TestRGWAndNoobaaDBHostNodeFailure::test_rgw_host_node_failure
ERROR tests/manage/rgw/test_host_node_failure.py::TestRGWAndNoobaaDBHostNodeFailure::test_rgw_host_node_failure
============= 1 failed, 2 warnings, 1 error in 7776.42s (2:09:36) ==============
# oc get pods
NAME                                                              READY   STATUS    RESTARTS       AGE
csi-addons-controller-manager-598bc65cb6-s6sxl                    2/2     Running   0              14m
csi-cephfsplugin-6pvnm                                            2/2     Running   0              4d1h
csi-cephfsplugin-b42qb                                            2/2     Running   4              4d1h
csi-cephfsplugin-provisioner-7845bdfc95-7q86s                     5/5     Running   0              27h
csi-cephfsplugin-provisioner-7845bdfc95-9svrf                     5/5     Running   0              125m
csi-cephfsplugin-w6zvf                                            2/2     Running   0              4d1h
csi-rbdplugin-6p4nd                                               3/3     Running   0              4d1h
csi-rbdplugin-6tztx                                               3/3     Running   6              4d1h
csi-rbdplugin-gz9q6                                               3/3     Running   0              4d1h
csi-rbdplugin-provisioner-84f8ffbd45-7tchh                        6/6     Running   0              125m
csi-rbdplugin-provisioner-84f8ffbd45-8lnnc                        6/6     Running   0              27h
noobaa-core-0                                                     1/1     Running   0              130m
noobaa-db-pg-0                                                    1/1     Running   0              130m
noobaa-endpoint-84fff5796-2m8gb                                   1/1     Running   0              130m
noobaa-endpoint-84fff5796-p8zv2                                   1/1     Running   0              28h
noobaa-operator-6f4fcf46c7-rb7rk                                  2/2     Running   0              28h
ocs-metrics-exporter-f5cd5965c-dzvd4                              1/1     Running   0              125m
ocs-operator-5b648dbd4b-57wzv                                     1/1     Running   2 (124m ago)   125m
odf-console-6bf6f6cf5-gp98x                                       1/1     Running   0              27h
odf-operator-controller-manager-58664cf578-j6nfk                  2/2     Running   0              27h
rook-ceph-crashcollector-syd05-worker-0.pooj-414-odf.ibm.cb8tmj   1/1     Running   0              16m
rook-ceph-crashcollector-syd05-worker-1.pooj-414-odf.ibm.cnpchn   1/1     Running   0              130m
rook-ceph-crashcollector-syd05-worker-2.pooj-414-odf.ibm.cvrwzh   1/1     Running   0              27h
rook-ceph-exporter-syd05-worker-0.pooj-414-odf.ibm.com-dbd6v6kz   1/1     Running   0              16m
rook-ceph-exporter-syd05-worker-1.pooj-414-odf.ibm.com-75765xsp   1/1     Running   0              130m
rook-ceph-exporter-syd05-worker-2.pooj-414-odf.ibm.com-6697qzll   1/1     Running   1 (27h ago)    27h
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-69c8f778brfpn   2/2     Running   0              130m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-69c75bb4cfd7x   2/2     Running   0              27h
rook-ceph-mgr-a-7f8955589d-h78wt                                  2/2     Running   0              130m
rook-ceph-mon-h-66bc4f587c-jll9r                                  2/2     Running   0              22h
rook-ceph-mon-i-857b8579bf-dwtpk                                  2/2     Running   0              29h
rook-ceph-mon-j-74c4984fcb-q2tfm                                  2/2     Running   0              16m
rook-ceph-operator-5dc4494ff5-c89gp                               1/1     Running   0              22h
rook-ceph-osd-0-bb8444ffd-d4hf2                                   2/2     Running   0              29h
rook-ceph-osd-1-5dfc78b679-rvq5x                                  2/2     Running   0              125m
rook-ceph-osd-2-84556f8bbf-wlbgh                                  2/2     Running   0              22h
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7hhtj5   2/2     Running   0              130m
rook-ceph-tools-9f6fc4cb7-26vbb                                   1/1     Running   0              125m
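
One observation from the listing above: the pod named in the error (`...-5dfbcb7d4wdf`) is absent, while an RGW pod with the same prefix and a different suffix (`...-5dfbcb7hhtj5`) has been running for 130m. That would be consistent with the test describing a pod name captured before the node failure, although the thread does not confirm this. The sketch below only illustrates the name comparison; `find_replacement_pod` is a hypothetical helper, not part of ocs-ci, and it relies on the 5-character random suffix that Kubernetes appends to generated pod names.

```python
from typing import Iterable, Optional

# Kubernetes appends a 5-character random suffix to generated pod names.
POD_SUFFIX_LEN = 5


def find_replacement_pod(stale_name: str, current_names: Iterable[str]) -> Optional[str]:
    """Hypothetical helper: find a pod sharing the stale pod's name prefix."""
    prefix = stale_name[:-POD_SUFFIX_LEN]
    for name in current_names:
        same_shape = name.startswith(prefix) and len(name) == len(stale_name)
        if same_shape and name != stale_name:
            return name
    return None


# Pod name from the error message and a few names from the listing above.
stale = "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7d4wdf"
running = [
    "rook-ceph-osd-2-84556f8bbf-wlbgh",
    "rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-5dfbcb7hhtj5",
    "rook-ceph-tools-9f6fc4cb7-26vbb",
]
replacement = find_replacement_pod(stale, running)
print(replacement)
```

In a real test, looking the pod up again by label selector after the node recovers would be more robust than matching on names; the name matching here is for illustration only.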

logs - test_rgw_host_node_failure-1.log

github-actions[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.

github-actions[bot] commented 8 months ago

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

Pooja-Soni78 commented 7 months ago

This test case failed in 4.15 as well.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.

Shilpi-Das1 commented 3 months ago

This test case failed again on 4.16 with the same error.

tests_functional_object_rgw_test_host_node_failure.py.log

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.