389ds / 389-ds-base

The enterprise-class Open Source LDAP server for Linux
https://www.port389.org/

Test replication/changelog_test.py fails with LMDB intermittently #6212

Open vashirov opened 3 months ago

vashirov commented 3 months ago

Issue Description

=========================== short test summary info ============================
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_dsconf_dump_changelog_files_removed
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_verify_changelog
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_verify_changelog_online_backup
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_verify_changelog_offline_backup
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_changelog_maxage
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_ticket47669_changelog_triminterval
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_retrochangelog_maxage
ERROR dirsrvtests/tests/suites/replication/changelog_test.py::test_retrochangelog_trimming_crash
=================== 3 skipped, 1 warning, 8 errors in 5.63s ====================

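The tests error out during setup: instance creation fails because systemctl cannot start the newly created instance (dirsrv@supplier2 in the traceback below):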

__________ ERROR at setup of test_dsconf_dump_changelog_files_removed __________

>       lambda: ihook(item=item, **kwds), when=when, reraise=reraise
    )

/usr/local/lib/python3.12/site-packages/flaky/flaky_pytest_plugin.py:146: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib/python3.12/site-packages/lib389/topologies.py:488: in topology_m2
    topology = create_topology({ReplicaRole.SUPPLIER: 2}, request=request)
/usr/lib/python3.12/site-packages/lib389/topologies.py:186: in create_topology
    topo = _create_instances(topo_dict, suffix)
/usr/lib/python3.12/site-packages/lib389/topologies.py:121: in _create_instances
    instance.create()
/usr/lib/python3.12/site-packages/lib389/__init__.py:860: in create
    self._createDirsrv(version)
/usr/lib/python3.12/site-packages/lib389/__init__.py:830: in _createDirsrv
    sds.create_from_args(general, slapd, backends, None)
/usr/lib/python3.12/site-packages/lib389/instance/setup.py:758: in create_from_args
    self._install_ds(general, slapd, backends)
/usr/lib/python3.12/site-packages/lib389/instance/setup.py:1037: in _install_ds
    ds_instance.start(timeout=60)
/usr/lib/python3.12/site-packages/lib389/__init__.py:1117: in start
    raise e from None
/usr/lib/python3.12/site-packages/lib389/__init__.py:1112: in start
    subprocess.check_output(["systemctl", "start", "dirsrv@%s" % self.serverid], stderr=subprocess.STDOUT)
/usr/lib64/python3.12/subprocess.py:466: in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input = None, capture_output = False, timeout = None, check = True
popenargs = (['systemctl', 'start', 'dirsrv@supplier2'],)
kwargs = {'stderr': -2, 'stdout': -1}
process = <Popen: returncode: 1 args: ['systemctl', 'start', 'dirsrv@supplier2']>
stdout = b'Job for dirsrv@supplier2.service failed because the control process exited with error code.\nSee "systemctl status dirsrv@supplier2.service" and "journalctl -xeu dirsrv@supplier2.service" for details.\n'
stderr = None, retcode = 1

    def run(*popenargs,
            input=None, capture_output=False, timeout=None, check=False, **kwargs):
        """Run command with arguments and return a CompletedProcess instance.

        The returned instance will have attributes args, returncode, stdout and
        stderr. By default, stdout and stderr are not captured, and those attributes
        will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them,
        or pass capture_output=True to capture both.

        If check is True and the exit code was non-zero, it raises a
        CalledProcessError. The CalledProcessError object will have the return code
        in the returncode attribute, and output & stderr attributes if those streams
        were captured.

        If timeout is given, and the process takes too long, a TimeoutExpired
        exception will be raised.

        There is an optional argument "input", allowing you to
        pass bytes or a string to the subprocess's stdin.  If you use this argument
        you may not also use the Popen constructor's "stdin" argument, as
        it will be used internally.

        By default, all communication is in bytes, and therefore any "input" should
        be bytes, and the stdout and stderr will be bytes. If in text mode, any
        "input" should be a string, and stdout and stderr will be strings decoded
        according to locale encoding, or by "encoding" if set. Text mode is
        triggered by setting any of text, encoding, errors or universal_newlines.

        The other arguments are the same as for the Popen constructor.
        """
        if input is not None:
            if kwargs.get('stdin') is not None:
                raise ValueError('stdin and input arguments may not both be used.')
            kwargs['stdin'] = PIPE

        if capture_output:
            if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
                raise ValueError('stdout and stderr arguments may not be used '
                                 'with capture_output.')
            kwargs['stdout'] = PIPE
            kwargs['stderr'] = PIPE

        with Popen(*popenargs, **kwargs) as process:
            try:
                stdout, stderr = process.communicate(input, timeout=timeout)
            except TimeoutExpired as exc:
                process.kill()
                if _mswindows:
                    # Windows accumulates the output in a single blocking
                    # read() call run on child threads, with the timeout
                    # being done in a join() on those threads.  communicate()
                    # _after_ kill() is required to collect that and add it
                    # to the exception.
                    exc.stdout, exc.stderr = process.communicate()
                else:
                    # POSIX _communicate already populated the output so
                    # far into the TimeoutExpired exception.
                    process.wait()
                raise
            except:  # Including KeyboardInterrupt, communicate handled that.
                process.kill()
                # We don't call process.wait() as .__exit__ does that for us.
                raise
            retcode = process.poll()
            if check and retcode:
>               raise CalledProcessError(retcode, process.args,
                                         output=stdout, stderr=stderr)
E               subprocess.CalledProcessError: Command '['systemctl', 'start', 'dirsrv@supplier2']' returned non-zero exit status 1.

/usr/lib64/python3.12/subprocess.py:571: CalledProcessError
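
The systemctl output above points to "systemctl status" and "journalctl -xeu" for the real startup error, which the test run does not capture. A minimal sketch of how those logs could be collected when the start fails is shown below. The helper names diagnostic_commands and collect_diagnostics are hypothetical and not part of lib389; only the two commands themselves come from the error message.

```python
import subprocess


def diagnostic_commands(serverid: str) -> list[list[str]]:
    """Build the two commands that the systemctl error message recommends.

    Hypothetical helper: not part of lib389.
    """
    unit = f"dirsrv@{serverid}.service"
    return [
        ["systemctl", "status", "--no-pager", unit],
        ["journalctl", "--no-pager", "-xeu", unit],
    ]


def collect_diagnostics(serverid: str, runner=subprocess.run) -> str:
    """Run each diagnostic command and return the combined output as text.

    `runner` is injectable so the function can be exercised without systemd.
    check=False because `systemctl status` exits non-zero for a failed unit.
    """
    reports = []
    for cmd in diagnostic_commands(serverid):
        result = runner(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            check=False,
        )
        reports.append(f"$ {' '.join(cmd)}\n{result.stdout}")
    return "\n".join(reports)
```

Calling collect_diagnostics("supplier2") from an exception handler around the failing start would attach the service log to the test report, which may help tell an LMDB-specific startup failure apart from an unrelated one.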

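There is also a SyntaxWarning that should be fixed: the test compares a string with "is" (object identity) instead of "==" (value equality), so the result depends on string interning rather than on the actual value: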

dirsrvtests/tests/suites/replication/changelog_test.py:52
  /workspace/dirsrvtests/tests/suites/replication/changelog_test.py:52: SyntaxWarning: "is" with 'str' literal. Did you mean "=="?
    if instance.get_db_lib() is 'bdb':
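
A short sketch of why the comparison is unreliable and what the fix looks like. The is_bdb helper is hypothetical, and the assumption that get_db_lib() returns a plain string such as 'bdb' or 'mdb' is inferred from the warning, not confirmed here.

```python
def is_bdb(db_lib: str) -> bool:
    """Compare by value, which is what the test intends."""
    return db_lib == 'bdb'


# Build the string at runtime, as a function return value would be.
runtime_value = ''.join(['b', 'd', 'b'])
literal = 'bdb'

# Value equality always holds for equal strings.
print(runtime_value == literal)

# Identity depends on whether the interpreter happens to share one object
# for both strings, so this result is implementation-dependent and must
# not be relied on.
print(runtime_value is literal)

print(is_bdb(runtime_value))
print(is_bdb('mdb'))
```

In the test itself the change would be a one-word replacement of "is" with "==" on line 52 of changelog_test.py.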

vashirov commented 3 months ago

Another test fails in the same way:

ERROR dirsrvtests/tests/suites/password/password_TPR_policy_test.py::test_TPR_replication_entry

https://github.com/389ds/389-ds-base/actions/runs/9444641584/job/26011110324#step:7:639
