red-hat-storage / ocs-ci

https://ocs-ci.readthedocs.io/en/latest/
MIT License

test_unidirectional_bucket_replication failed with AssertionError: azure-ns-store-250e85500a5a44d28ca60ade7 did not reach a healthy state within 180 seconds. #6737

Status: Closed (closed by ebenahar 1 year ago)

ebenahar commented 1 year ago

Run details:

URL: https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#OCS/launches/362/7364/320363/320416/320418/log
Run ID: 1670509372
Test Case: test_unidirectional_bucket_replication
ODF Build: 4.12.0-130
OCP Version: 4.12
Job name: IBM Cloud IPI 3AZ RHCOS 3M 3W tier1
Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/6343/
Logs URL: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-005ici3c33-t1/j-005ici3c33-t1_20221208T130227/logs/

Failure Details:

Message: AssertionError: azure-ns-store-250e85500a5a44d28ca60ade7 did not reach a healthy state within 180 seconds.
Type: None

Text:
self = <ocs_ci.ocs.resources.namespacestore.NamespaceStore object at 0x7f91e231c1f0>
timeout = 180, interval = 5

    def verify_health(self, timeout=180, interval=5):
        """
        Health verification function that tries to verify
        a namespacestores's health until a given time limit is reached

        Args:
            timeout (int): Timeout for the check, in seconds
            interval (int): Interval to wait between checks, in seconds

        Returns:
            (bool): True if the bucket is healthy, False otherwise

        """
        log.info(f"Waiting for {self.name} to be healthy")
        try:
>           for health_check in TimeoutSampler(
                timeout, interval, getattr(self, f"{self.method}_verify_health")
            ):

ocs_ci/ocs/resources/namespacestore.py:174: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ocs_ci.utility.utils.TimeoutSampler object at 0x7f91e231cd60>

    def __iter__(self):
        if self.start_time is None:
            self.start_time = time.time()
        while True:
            self.last_sample_time = time.time()
            if self.timeout <= (self.last_sample_time - self.start_time):
>               raise self.timeout_exc_cls(*self.timeout_exc_args)
E               ocs_ci.ocs.exceptions.TimeoutExpiredError: Timed out after 180s running oc_verify_health()

ocs_ci/utility/utils.py:1173: TimeoutExpiredError

During handling of the above exception, another exception occurred:

self = <tests.manage.mcg.test_bucket_replication.TestReplication object at 0x7f91e5ec6ca0>
awscli_pod_session = <ocs_ci.ocs.resources.pod.Pod object at 0x7f91e61dca30>
mcg_obj_session = <ocs_ci.ocs.resources.mcg.MCG object at 0x7f91e0e37910>
bucket_factory = <function bucket_factory_fixture.<locals>._create_buckets at 0x7f92023b9820>
source_bucketclass = {'interface': 'OC', 'namespace_policy_dict': {'namespacestore_dict': {'azure': [(1, None)]}, 'type': 'Single'}}
target_bucketclass = {'backingstore_dict': {'gcp': [(1, None)]}, 'interface': 'CLI'}

    @pytest.mark.parametrize(
        argnames=["source_bucketclass", "target_bucketclass"],
        argvalues=[
            pytest.param(
                {
                    "interface": "OC",
                    "backingstore_dict": {"aws": [(1, "eu-central-1")]},
                },
                {"interface": "OC", "backingstore_dict": {"azure": [(1, None)]}},
                marks=[tier1, pytest.mark.polarion_id("OCS-2678")],
            ),
            pytest.param(
                {
                    "interface": "OC",
                    "backingstore_dict": {"gcp": [(1, None)]},
                },
                {
                    "interface": "OC",
                    "backingstore_dict": {"aws": [(1, "eu-central-1")]},
                },
                marks=[tier2],
            ),
            pytest.param(
                {
                    "interface": "CLI",
                    "backingstore_dict": {"azure": [(1, None)]},
                },
                {"interface": "CLI", "backingstore_dict": {"gcp": [(1, None)]}},
                marks=[tier2],
            ),
            pytest.param(
                {
                    "interface": "CLI",
                    "backingstore_dict": {"aws": [(1, "eu-central-1")]},
                },
                {"interface": "CLI", "backingstore_dict": {"azure": [(1, None)]}},
                marks=[tier1, pytest.mark.polarion_id("OCS-2679")],
            ),
            pytest.param(
                {
                    "interface": "OC",
                    "namespace_policy_dict": {
                        "type": "Single",
                        "namespacestore_dict": {"aws": [(1, "eu-central-1")]},
                    },
                },
                {
                    "interface": "OC",
                    "namespace_policy_dict": {
                        "type": "Single",
                        "namespacestore_dict": {"azure": [(1, None)]},
                    },
                },
                marks=[tier2],
            ),
            pytest.param(
                {
                    "interface": "OC",
                    "namespace_policy_dict": {
                        "type": "Single",
                        "namespacestore_dict": {"azure": [(1, None)]},
                    },
                },
                {
                    "interface": "CLI",
                    "backingstore_dict": {"gcp": [(1, None)]},
                },
                marks=[tier1],
            ),
        ],
        ids=[
            "AWStoAZURE-BS-OC",
            "GCPtoAWS-BS-OC",
            "AZUREtoCGP-BS-CLI",
            "AWStoAZURE-BS-CLI",
            "AWStoAZURE-NS-OC",
            "AZUREtoGCP-NS-Hybrid",
        ],
    )
    def test_unidirectional_bucket_replication(
        self,
        awscli_pod_session,
        mcg_obj_session,
        bucket_factory,
        source_bucketclass,
        target_bucketclass,
    ):
        """
        Test unidirectional bucket replication using CLI and YAML by adding objects
        to a backingstore- and namespacestore-backed buckets

        """
        target_bucket_name = bucket_factory(bucketclass=target_bucketclass)[0].name
        replication_policy = ("basic-replication-rule", target_bucket_name, None)
>       source_bucket_name = bucket_factory(
            1, bucketclass=source_bucketclass, replication_policy=replication_policy
        )[0].name

tests/manage/mcg/test_bucket_replication.py:123: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/conftest.py:2557: in _create_buckets
    bucketclass if bucketclass is None else bucket_class_factory(bucketclass)
ocs_ci/ocs/resources/bucketclass.py:129: in _create_bucket_class
    namespacestores = namespace_store_factory(interface, nss_dict)
ocs_ci/ocs/resources/namespacestore.py:437: in _create_nss
    nss_obj.verify_health()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <ocs_ci.ocs.resources.namespacestore.NamespaceStore object at 0x7f91e231c1f0>
timeout = 180, interval = 5

    def verify_health(self, timeout=180, interval=5):
        """
        Health verification function that tries to verify
        a namespacestores's health until a given time limit is reached

        Args:
            timeout (int): Timeout for the check, in seconds
            interval (int): Interval to wait between checks, in seconds

        Returns:
            (bool): True if the bucket is healthy, False otherwise

        """
        log.info(f"Waiting for {self.name} to be healthy")
        try:
            for health_check in TimeoutSampler(
                timeout, interval, getattr(self, f"{self.method}_verify_health")
            ):
                if health_check:
                    log.info(f"{self.name} is healthy")
                    return True
                else:
                    log.info(f"{self.name} is unhealthy. Rechecking.")
        except TimeoutExpiredError:
            log.error(
                f"{self.name} did not reach a healthy state within {timeout} seconds."
            )
>           assert (
                False
            ), f"{self.name} did not reach a healthy state within {timeout} seconds."
E           AssertionError: azure-ns-store-250e85500a5a44d28ca60ade7 did not reach a healthy state within 180 seconds.

ocs_ci/ocs/resources/namespacestore.py:186: AssertionError
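
For readers unfamiliar with the pattern in the traceback above, the following is a minimal, self-contained sketch of a poll-until-timeout loop that turns a timeout into an assertion failure. It is not the ocs-ci implementation: `PollTimeoutError`, `sample_until_timeout`, `wait_for_health`, and the injectable `clock`/`sleep` parameters are hypothetical names chosen for illustration, and the real `TimeoutSampler` may differ in details.

```python
import time


class PollTimeoutError(Exception):
    """Raised when the sampled function never succeeds in time (hypothetical name)."""


def sample_until_timeout(func, timeout, interval, clock=time.monotonic, sleep=time.sleep):
    """Yield func() results every `interval` seconds until `timeout` seconds elapse."""
    start = clock()
    while True:
        # Check the deadline before every sample, as the traceback's __iter__ does.
        if clock() - start >= timeout:
            name = getattr(func, "__name__", repr(func))
            raise PollTimeoutError(f"Timed out after {timeout}s running {name}()")
        yield func()
        sleep(interval)


def wait_for_health(name, check, timeout=180, interval=5, **sampler_kwargs):
    """Return True once check() is truthy; raise AssertionError on timeout."""
    try:
        for healthy in sample_until_timeout(check, timeout, interval, **sampler_kwargs):
            if healthy:
                return True
    except PollTimeoutError:
        # Mirrors the message format seen in the failure above.
        raise AssertionError(
            f"{name} did not reach a healthy state within {timeout} seconds."
        ) from None
```

In this sketch, a check that never returns a truthy value is sampled repeatedly until the deadline passes, and the timeout exception is then converted into the `AssertionError` that pytest reports. The resource's last observed state is not part of that message, so the cause of the unhealthy state has to be found in the collected logs.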

ebenahar commented 1 year ago

This issue was encountered once again. Run details:

URL: https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#OCS/launches/362/7015/292448/292491/292496/log
Run ID: 1670007275
Test Case: test_unidirectional_bucket_replication
ODF Build: 4.12.0-120
OCP Version: 4.12
Job name: AZURE IPI 3AZ RHCOS 3M 3W tier1 or tier_after_upgrade post upgrade
Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/6280/
Logs URL: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-004zi3c33-uba/j-004zi3c33-uba_20221202T083924/logs/
Error Message: AssertionError: azure-ns-store-bba03dee2c7444969f52df093 did not reach a healthy state within 180 seconds.

ebenahar commented 1 year ago

This issue was encountered once again. Run details:

URL: https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#OCS/launches/362/6998/291593/291640/291644/log
Run ID: 1669980540
Test Case: test_unidirectional_bucket_replication
ODF Build: 4.12.0-122
OCP Version: 4.12
Job name: BAREMETAL UPI 1AZ RHCOS NVME INTEL COMPACT MODE 3M 0W tier2
Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/6270/
Logs URL: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-003bu1cni30-t2/j-003bu1cni30-t2_20221202T102339/logs/
Error Message: AssertionError: azure-ns-store-1a04824dfce8498798e43e3e0 did not reach a healthy state within 180 seconds.

sagihirshfeld commented 1 year ago

These failures were classified as non-automation issues in ReportPortal.