red-hat-storage / ocs-ci

https://ocs-ci.readthedocs.io/en/latest/
MIT License

Setting up csi-kms-connection-details configmap failed on VSPHERE6-UPI-Proxy-1AZ-RHCOS-VSAN-3M-3W #9479

Closed: vkathole closed this issue 2 months ago

vkathole commented 7 months ago

Failing test cases:

- test_encrypted_pvc_clone[v1] - https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/632/18271/891112/891212/log
- test_encrypted_pvc_clone[v2] - https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/632/18271/891112/891214/log
- test_encrypted_pvc_snapshot[v1] - https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/632/18271/891112/891218/log
- test_encrypted_pvc_snapshot[v2] - https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/632/18271/891112/891220/log
- test_encrypted_compressed_sc_and_support_snap_clone[v1-3-aggressive] - https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/632/18271/891112/891246/log
- test_encrypted_compressed_sc_and_support_snap_clone[v2-3-aggressive] - https://reportportal-ocs4.apps.ocp-c1.prod.psi.redhat.com/ui/#ocs/launches/632/18271/891112/891248/log

vkathole commented 7 months ago

Log Message:

```
self = <test_compressed_sc_and_support_snap_clone.TestCompressedSCAndSupportSnapClone object at 0x7f4827d9b460>
kv_version = 'v1', replica = 3, compression = 'aggressive'
pv_encryption_kms_setup_factory = <function pv_encryption_vault_setup_factory.<locals>.factory at 0x7f48553d2e50>
storageclass_factory = <function storageclass_factory_fixture.<locals>.factory at 0x7f485b1e54c0>
pgsql_factory_fixture = <function pgsql_factory_fixture.<locals>.factory at 0x7f4855088040>
multi_snapshot_factory = <function multi_snapshot_factory.<locals>.factory at 0x7f4835ffe3a0>
multi_snapshot_restore_factory = <function multi_snapshot_restore_factory.<locals>.factory at 0x7f4835ffedc0>
multi_pvc_clone_factory = <function multi_pvc_clone_factory.<locals>.factory at 0x7f4833097dc0>
pgsql_teardown = None

@skipif_external_mode
@skipif_ocs_version("<4.9")
@skipif_ocp_version("<4.9")
@pytest.mark.parametrize(
    argnames=["kv_version", "replica", "compression"],
    argvalues=[
        pytest.param(
            "v1", 3, "aggressive", marks=pytest.mark.polarion_id("OCS-2707")
        ),
        pytest.param(
            "v2", 3, "aggressive", marks=pytest.mark.polarion_id("OCS-2712")
        ),
    ],
)
def test_encrypted_compressed_sc_and_support_snap_clone(
    self,
    kv_version,
    replica,
    compression,
    pv_encryption_kms_setup_factory,
    storageclass_factory,
    pgsql_factory_fixture,
    multi_snapshot_factory,
    multi_snapshot_restore_factory,
    multi_pvc_clone_factory,
    pgsql_teardown,
):
    """
    1. Create new sc with compression and encryption enabled
    2. Deploy PGSQL workload using those new sc created
    3. Take a snapshot of the pgsql PVC.
    4. Create a new PVC out of that snapshot or restore snapshot
    5. Attach a new pgsql pod to it.
    6. Resize the new PVC
    7. Clone pgsql PVC and attach a new pgsql pod to it
    8. Resize cloned PVC
    """
    pgsql_teardown

    log.info("Setting up csi-kms-connection-details configmap")

    self.vault = pv_encryption_kms_setup_factory(kv_version)

tests/functional/workloads/pvc_snapshot_and_clone/test_compressed_sc_and_support_snap_clone.py:206:

tests/conftest.py:4939: in factory
    vault.vault_create_backend_path(
ocs_ci/utility/kms.py:504: in vault_create_backend_path
    if self.vault_backend_path_exists(self.vault_backend_path):
ocs_ci/utility/kms.py:244: in vault_backend_path_exists
    out = subprocess.check_output(shlex.split(cmd))
/usr/lib64/python3.8/subprocess.py:415: in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,

input = None, capture_output = False, timeout = None, check = True
popenargs = (['vault', 'secrets', 'list', '--format=json'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f48351e1130>
stdout = b'', stderr = None, retcode = 2

def run(*popenargs, input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
>           raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)

E           subprocess.CalledProcessError: Command '['vault', 'secrets', 'list', '--format=json']' returned non-zero exit status 2.

/usr/lib64/python3.8/subprocess.py:516: CalledProcessError
```
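
The failure happens before the test proper even starts: `vault_backend_path_exists` (ocs_ci/utility/kms.py:244) runs `vault secrets list --format=json` through `subprocess.check_output`, and the vault CLI exits with status 2, which it typically returns when the request to the Vault server itself fails (connectivity, token, or permission problems). Because only stdout is captured (`kwargs = {'stdout': -1}` in the locals above, and `stdout = b''`), the CLI's stderr diagnostic never reaches the test log. A minimal sketch of the same check with stderr captured, assuming only the `vault` binary on PATH and `VAULT_ADDR`/`VAULT_TOKEN` in the environment (the body is illustrative, not the ocs-ci implementation):

```python
import json
import shlex
import subprocess


def vault_backend_path_exists(backend_path):
    """Return True if `backend_path` is mounted as a Vault secrets engine."""
    cmd = "vault secrets list --format=json"
    result = subprocess.run(
        shlex.split(cmd),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # capture the CLI's diagnostic as well
        text=True,
    )
    if result.returncode != 0:
        # Surface the vault CLI's own error text instead of a bare
        # CalledProcessError with an empty stdout
        raise RuntimeError(
            f"'{cmd}' failed (rc={result.returncode}): {result.stderr.strip()}"
        )
    # The JSON output is an object keyed by mount points such as "secret/"
    mounts = json.loads(result.stdout)
    return any(mount.rstrip("/") == backend_path for mount in mounts)
```

With stderr surfaced this way, a rerun of the failing tests would show whether the proxy setup on this VSPHERE6-UPI-Proxy cluster is blocking the Vault server connection, rather than just reporting exit status 2.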

vkathole commented 7 months ago

The same test cases are also failing with the following message:

```
self = <tests.functional.pv.pvc_snapshot.test_pgsql_pvc_snapshot.TestPvcSnapshotOfWorkloads object at 0x7f20b4156250>
kv_version = 'v1'
pv_encryption_kms_setup_factory = <function pv_encryption_vault_setup_factory.<locals>.factory at 0x7f20e60f6550>
storageclass_factory = <function storageclass_factory_fixture.<locals>.factory at 0x7f20bc199dc0>
pgsql_factory_fixture = <function pgsql_factory_fixture.<locals>.factory at 0x7f20bc199430>
snapshot_factory = <function snapshot_factory_fixture.<locals>.factory at 0x7f20e5d90e50>
snapshot_restore_factory = <function snapshot_restore_factory_fixture.<locals>.factory at 0x7f20e5d90af0>
pgsql_teardown = None

@pytest.mark.parametrize(
    argnames=["kv_version"],
    argvalues=[
        pytest.param("v1", marks=pytest.mark.polarion_id("OCS-2713")),
        pytest.param("v2", marks=pytest.mark.polarion_id("OCS-2714")),
    ],
)
@skipif_ocs_version("<4.8")
@skipif_ocp_version("<4.8")
@skipif_hci_provider_and_client
def test_encrypted_pvc_snapshot(
    self,
    kv_version,
    pv_encryption_kms_setup_factory,
    storageclass_factory,
    pgsql_factory_fixture,
    snapshot_factory,
    snapshot_restore_factory,
    pgsql_teardown,
):
    """
    1. Create encrypted storage class
    2. Deploy PGSQL workload using created sc
    3. Take a snapshot of the pgsql PVC.
    4. Create a new PVC out of that snapshot or restore snapshot
    5. Attach a new pgsql pod to it.
    6. Create pgbench benchmark to new pgsql pod
    7. Verify if key is created
    """
    pgsql_teardown

    log.info("Setting up csi-kms-connection-details configmap") self.vault = pv_encryption_kms_setup_factory(kv_version) log.info("csi-kms-connection-details setup successful")

    Create an encryption enabled storageclass for RBD

    self.sc_obj = storageclass_factory( interface=CEPHBLOCKPOOL, encrypted=True, encryption_kms_id=self.vault.kmsid, )

    Create ceph-csi-kms-token in the tenant namespace

    self.vault.vault_path_token = self.vault.generate_vault_token() self.vault.create_vault_csi_kms_token(namespace=BMO_NAME)

    Deploy PGSQL workload

    log.info("Deploying pgsql workloads")

    pgsql = pgsql_factory_fixture(replicas=1, sc_name=self.sc_obj.name)

tests/functional/pv/pvc_snapshot/test_pgsql_pvc_snapshot.py:167:

tests/conftest.py:3317: in factory
    pgsql.setup_postgresql(replicas=replicas, sc_name=sc_name)
ocs_ci/ocs/pgsql.py:92: in setup_postgresql
    self.pod_obj.wait_for_resource(
ocs_ci/ocs/ocp.py:809: in wait_for_resource
    raise (ex)
ocs_ci/ocs/ocp.py:700: in wait_for_resource
    for sample in TimeoutSampler(

self = <ocs_ci.utility.utils.TimeoutSampler object at 0x7f20e5391cd0>

def __iter__(self):
    if self.start_time is None:
        self.start_time = time.time()
    while True:
        self.last_sample_time = time.time()
        if self.timeout <= (self.last_sample_time - self.start_time):
>           raise self.timeout_exc_cls(*self.timeout_exc_args)
E           ocs_ci.ocs.exceptions.TimeoutExpiredError: Timed out after 3600s running get("", True, "app=postgres")

ocs_ci/utility/utils.py:1310: TimeoutExpiredError
```
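
This second failure mode is different: the KMS setup succeeds, but `wait_for_resource` then polls for the postgres pods via `TimeoutSampler` and gives up after 3600s because no pod matching `app=postgres` ever reaches the expected state. A minimal sketch of the sampler pattern the traceback shows (class and argument names are simplified for illustration; see ocs_ci/utility/utils.py for the real implementation):

```python
import time


class TimeoutExpiredError(Exception):
    pass


class TimeoutSampler:
    """Yield func(*args, **kwargs) repeatedly until `timeout` seconds elapse."""

    def __init__(self, timeout, sleep, func, *args, **kwargs):
        self.timeout = timeout
        self.sleep = sleep
        self.func = func
        self.args = args
        self.kwargs = kwargs
        self.start_time = None

    def __iter__(self):
        if self.start_time is None:
            self.start_time = time.time()
        while True:
            last_sample_time = time.time()
            # Same exit path as the traceback above: raise once the
            # elapsed time reaches the timeout
            if self.timeout <= (last_sample_time - self.start_time):
                raise TimeoutExpiredError(
                    f"Timed out after {self.timeout}s running {self.func.__name__}"
                )
            yield self.func(*self.args, **self.kwargs)
            time.sleep(self.sleep)
```

The caller loops over such a sampler and breaks out as soon as a sample shows the expected pod state; when the PGSQL pods never schedule (for example, if the encrypted storage class cannot provision a PVC), the loop runs the full 3600s and raises as seen above.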

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs.

vkathole commented 2 months ago

Closing this as I didn't find this issue in recent runs. Will reopen if required.