microsoft / Qcodes

Modular data acquisition framework
http://microsoft.github.io/Qcodes/
MIT License

Random failure in do_nd test #5551

Closed: jenshnielsen closed this issue 8 months ago

jenshnielsen commented 1 year ago

Looks like some test may be leaking an unclosed AWG instrument:
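If that is the cause, one way to guard against it is an autouse fixture that force-closes whatever a test left behind, so the deferred "Closing VISA handle ..." message can never fire inside another test's caplog. A minimal sketch (the fixture name is made up here; the real conftest may already do something equivalent):

import pytest

from qcodes.instrument import Instrument


@pytest.fixture(autouse=True)
def ensure_instruments_closed():
    # Run the test, then close any Instrument the test created but did
    # not close itself. Instrument.close_all() walks all registered
    # instrument instances, so nothing survives into the next test.
    yield
    Instrument.close_all()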

=================================== FAILURES ===================================
______________________________ test_datasaver_1d _______________________________
[gw1] linux -- Python 3.10.13 /opt/hostedtoolcache/Python/3.10.13/x64/bin/python

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw1/test_datasaver_1d0/temp.db
--------------------...1-dummy_dac_ch1,dummy_dmm_v1-31
12-results-12-dummy_dac_ch1,dummy_dmm_v1-76
13-results-13-dummy_dac_ch1,dummy_dmm_v1-76
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7f1ce0b3beb0>
n_points = 76

    @given(n_points=hst.integers(min_value=1, max_value=100))
    @example(n_points=5)
    @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
    def test_datasaver_1d(
        experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
    ) -> None:
        meas = Measurement()
        meas.register_parameter(DAC.ch1)
        meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))

        n_points_expected = 5

        meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})

        with meas.run() as datasaver:

            for set_v in np.linspace(0, 1, n_points):
                DAC.ch1()
                datasaver.add_result((DAC.ch1, set_v),
                                     (DMM.v1, DMM.v1()))

        ds = datasaver.dataset
        caplog.clear()
        data = ds.get_parameter_data()

        for dataarray in data[DMM.v1.full_name].values():
            assert dataarray.shape == (n_points,)

        if n_points == n_points_expected:
            assert len(caplog.record_tuples) == 0
        elif n_points > n_points_expected:
>           assert len(caplog.record_tuples) == 2
E           AssertionError: assert 4 == 2
E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to awg_sim as there are no non weak references to the instrument....a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 76 and 5')])
E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to awg_sim as there are no non weak references to the instrument....a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 76 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7f1ce0b3beb0>.record_tuples

tests/dataset/measurement/test_shapes.py:42: AssertionError

The above exception was the direct cause of the following exception:
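The count-based assertion is what makes this noise fatal: caplog picked up the two "Closing VISA handle to awg_sim ..." records from the leaked instrument on top of the two expected shape-mismatch warnings, hence 4 != 2. A sketch of a more leak-tolerant check, filtering by logger name before counting (the logger name is taken from the failure output above):

def count_non_visa_records(record_tuples):
    # record_tuples entries are (logger_name, level, message); drop the
    # close messages emitted by qcodes.instrument.visa when a leaked
    # instrument is finalized mid-test.
    return sum(
        1 for name, _level, _msg in record_tuples
        if name != "qcodes.instrument.visa"
    )

# in the test body, instead of the bare count:
# assert count_non_visa_records(caplog.record_tuples) == 2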
jenshnielsen commented 1 year ago

Similar error here:

______________________________ test_datasaver_1d _______________________________
[gw0] linux -- Python 3.12.0 /opt/hostedtoolcache/Python/3.12.0/x64/bin/python

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-70
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
n_points = 70

    @given(n_points=hst.integers(min_value=1, max_value=100))
    @example(n_points=5)
    @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
    def test_datasaver_1d(
        experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
    ) -> None:
        meas = Measurement()
        meas.register_parameter(DAC.ch1)
        meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))

        n_points_expected = 5

        meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})

        with meas.run() as datasaver:

            for set_v in np.linspace(0, 1, n_points):
                DAC.ch1()
                datasaver.add_result((DAC.ch1, set_v),
                                     (DMM.v1, DMM.v1()))

        ds = datasaver.dataset
        caplog.clear()
        data = ds.get_parameter_data()

        for dataarray in data[DMM.v1.full_name].values():
            assert dataarray.shape == (n_points,)

        if n_points == n_points_expected:
            assert len(caplog.record_tuples) == 0
        elif n_points > n_points_expected:
>           assert len(caplog.record_tuples) == 2
E           AssertionError: assert 4 == 2
E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples

tests/dataset/measurement/test_shapes.py:42: AssertionError

The above exception was the direct cause of the following exception:

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-70
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>

    @given(n_points=hst.integers(min_value=1, max_value=100))
>   @example(n_points=5)

tests/dataset/measurement/test_shapes.py:12: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <hypothesis.core.StateForActualGivenExecution object at 0x7fe10b2ef260>
data = ConjectureData(VALID, 1 bytes, frozen)

    def execute_once(
        self,
        data,
        *,
        print_example=False,
        is_final=False,
        expected_failure=None,
        example_kwargs=None,
    ):
        """Run the test function once, using ``data`` as input.

        If the test raises an exception, it will propagate through to the
        caller of this method. Depending on its type, this could represent
        an ordinary test failure, or a fatal error, or a control exception.

        If this method returns normally, the test might have passed, or
        it might have placed ``data`` in an unsuccessful state and then
        swallowed the corresponding control exception.
        """

        self.ever_executed = True
        data.is_find = self.is_find

        text_repr = None
        if self.settings.deadline is None:
            test = self.test
        else:

            @proxies(self.test)
            def test(*args, **kwargs):
                self.__test_runtime = None
                initial_draws = len(data.draw_times)
                start = time.perf_counter()
                result = self.test(*args, **kwargs)
                finish = time.perf_counter()
                internal_draw_time = sum(data.draw_times[initial_draws:])
                runtime = datetime.timedelta(
                    seconds=finish - start - internal_draw_time
                )
                self.__test_runtime = runtime
                current_deadline = self.settings.deadline
                if not is_final:
                    current_deadline = (current_deadline // 4) * 5
                if runtime >= current_deadline:
                    raise DeadlineExceeded(runtime, self.settings.deadline)
                return result

        def run(data):
            # Set up dynamic context needed by a single test run.
            if self.stuff.selfy is not None:
                data.hypothesis_runner = self.stuff.selfy
            # Generate all arguments to the test function.
            args = self.stuff.args
            kwargs = dict(self.stuff.kwargs)
            if example_kwargs is None:
                a, kw, argslices = context.prep_args_kwargs_from_strategies(
                    (), self.stuff.given_kwargs
                )
                assert not a, "strategies all moved to kwargs by now"
            else:
                kw = example_kwargs
                argslices = {}
            kwargs.update(kw)
            if expected_failure is not None:
                nonlocal text_repr
                text_repr = repr_call(test, args, kwargs)
                if text_repr in self.xfail_example_reprs:
                    warnings.warn(
                        f"We generated {text_repr}, which seems identical "
                        "to one of your `@example(...).xfail()` cases.  "
                        "Revise the strategy to avoid this overlap?",
                        HypothesisWarning,
                        # Checked in test_generating_xfailed_examples_warns!
                        stacklevel=6,
                    )

            if print_example or current_verbosity() >= Verbosity.verbose:
                printer = RepresentationPrinter(context=context)
                if print_example:
                    printer.text("Falsifying example:")
                else:
                    printer.text("Trying example:")

                if self.print_given_args:
                    printer.text(" ")
                    printer.repr_call(
                        test.__name__,
                        args,
                        kwargs,
                        force_split=True,
                        arg_slices=argslices,
                        leading_comment=(
                            "# " + context.data.slice_comments[(0, 0)]
                            if (0, 0) in context.data.slice_comments
                            else None
                        ),
                    )
                report(printer.getvalue())
            return test(*args, **kwargs)

        # self.test_runner can include the execute_example method, or setup/teardown
        # _example, so it's important to get the PRNG and build context in place first.
        with local_settings(self.settings):
            with deterministic_PRNG():
                with BuildContext(data, is_final=is_final) as context:
                    # Run the test function once, via the executor hook.
                    # In most cases this will delegate straight to `run(data)`.
                    result = self.test_runner(data, run)

        # If a failure was expected, it should have been raised already, so
        # instead raise an appropriate diagnostic error.
        if expected_failure is not None:
            exception, traceback = expected_failure
            if (
                isinstance(exception, DeadlineExceeded)
                and self.__test_runtime is not None
            ):
                report(
                    "Unreliable test timings! On an initial run, this "
                    "test took %.2fms, which exceeded the deadline of "
                    "%.2fms, but on a subsequent run it took %.2f ms, "
                    "which did not. If you expect this sort of "
                    "variability in your test timings, consider turning "
                    "deadlines off for this test by setting deadline=None."
                    % (
                        exception.runtime.total_seconds() * 1000,
                        self.settings.deadline.total_seconds() * 1000,
                        self.__test_runtime.total_seconds() * 1000,
                    )
                )
            else:
                report("Failed to reproduce exception. Expected: \n" + traceback)
>           raise Flaky(
                f"Hypothesis {text_repr} produces unreliable results: "
                "Falsified on the first call but did not on a subsequent one"
            ) from exception
E           hypothesis.errors.Flaky: Hypothesis test_datasaver_1d(experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           -------------------------------------------------------------------------------------------------
E           1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E           2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E           3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E           4-results-4-dummy_dac_ch1,dummy_dmm_v1-80
E           5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E           6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           9-results-9-dummy_dac_ch1,dummy_dmm_v1-85
E           10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E           11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E           12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E           13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E           14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E           15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E           16-results-16-dummy_dac_ch1,dummy_dmm_v1-70, DAC=<DummyInstrument: dummy_dac>, DMM=<DummyInstrument: dummy_dmm>, caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>, n_points=70) produces unreliable results: Falsified on the first call but did not on a subsequent one
E           Falsifying example: test_datasaver_1d(
E               experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E               -------------------------------------------------------------------------------------------------
E               1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E               2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E               3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E               4-results-4-dummy_dac_ch1,dummy_dmm_v1-80
E               5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E               6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E               7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E               8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E               9-results-9-dummy_dac_ch1,dummy_dmm_v1-85
E               10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E               11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E               12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E               13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E               14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E               15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E               16-results-16-dummy_dac_ch1,dummy_dmm_v1-70,
E               DAC=<DummyInstrument: dummy_dac>,
E               DMM=<DummyInstrument: dummy_dmm>,
E               caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>,
E               n_points=70,
E           )
E           Failed to reproduce exception. Expected: 
E           experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           --------------------...lts-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
E           caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
E           n_points = 70
E           
E               @given(n_points=hst.integers(min_value=1, max_value=100))
E               @example(n_points=5)
E               @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
E               def test_datasaver_1d(
E                   experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
E               ) -> None:
E                   meas = Measurement()
E                   meas.register_parameter(DAC.ch1)
E                   meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))
E               
E                   n_points_expected = 5
E               
E                   meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})
E               
E                   with meas.run() as datasaver:
E               
E                       for set_v in np.linspace(0, 1, n_points):
E                           DAC.ch1()
E                           datasaver.add_result((DAC.ch1, set_v),
E                                                (DMM.v1, DMM.v1()))
E               
E                   ds = datasaver.dataset
E                   caplog.clear()
E                   data = ds.get_parameter_data()
E               
E                   for dataarray in data[DMM.v1.full_name].values():
E                       assert dataarray.shape == (n_points,)
E               
E                   if n_points == n_points_expected:
E                       assert len(caplog.record_tuples) == 0
E                   elif n_points > n_points_expected:
E           >           assert len(caplog.record_tuples) == 2
E           E           AssertionError: assert 4 == 2
E           E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E           E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples
E           
E           tests/dataset/measurement/test_shapes.py:42: AssertionError
E           
E           
E           You can reproduce this example by temporarily adding @reproduce_failure('6.91.0', b'AEU=') as a decorator on your test case

/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/hypothesis/core.py:892: Flaky
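For local debugging, the reproduction hint at the end of the log can be applied as-is. A sketch of where the decorator goes (version string and payload copied verbatim from the output above; remove the decorator again once the failure is understood):

import hypothesis.strategies as hst
from hypothesis import HealthCheck, given, reproduce_failure, settings


@reproduce_failure("6.91.0", b"AEU=")
@given(n_points=hst.integers(min_value=1, max_value=100))
@settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
def test_datasaver_1d(experiment, DAC, DMM, caplog, n_points):
    ...  # unchanged test body from above

Note that a leak like this one makes the failure order-dependent, so the replayed example may well pass in isolation, which is exactly the "Falsified on the first call but did not on a subsequent one" condition that makes Hypothesis raise Flaky.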
----------------------------- Captured stdout call -----------------------------______________________________ test_datasaver_1d _______________________________
[gw0] linux -- Python 3.12.0 /opt/hostedtoolcache/Python/3.12.0/x64/bin/python

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-[70](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:71)
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
n_points = 70

    @given(n_points=hst.integers(min_value=1, max_value=100))
    @example(n_points=5)
    @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
    def test_datasaver_1d(
        experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
    ) -> None:
        meas = Measurement()
        meas.register_parameter(DAC.ch1)
        meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))

        n_points_expected = 5

        meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})

        with meas.run() as datasaver:

            for set_v in np.linspace(0, 1, n_points):
                DAC.ch1()
                datasaver.add_result((DAC.ch1, set_v),
                                     (DMM.v1, DMM.v1()))

        ds = datasaver.dataset
        caplog.clear()
        data = ds.get_parameter_data()

        for dataarray in data[DMM.v1.full_name].values():
            assert dataarray.shape == (n_points,)

        if n_points == n_points_expected:
            assert len(caplog.record_tuples) == 0
        elif n_points > n_points_expected:
>           assert len(caplog.record_tuples) == 2
E           AssertionError: assert 4 == 2
E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples

tests/dataset/measurement/test_shapes.py:42: AssertionError

The above exception was the direct cause of the following exception:

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-70
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>

    @given(n_points=hst.integers(min_value=1, max_value=100))
>   @example(n_points=5)

tests/dataset/measurement/test_shapes.py:12: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <hypothesis.core.StateForActualGivenExecution object at 0x7fe10b2ef260>
data = ConjectureData(VALID, 1 bytes, frozen)

    def execute_once(
        self,
        data,
        *,
        print_example=False,
        is_final=False,
        expected_failure=None,
        example_kwargs=None,
    ):
        """Run the test function once, using ``data`` as input.

        If the test raises an exception, it will propagate through to the
        caller of this method. Depending on its type, this could represent
        an ordinary test failure, or a fatal error, or a control exception.

        If this method returns normally, the test might have passed, or
        it might have placed ``data`` in an unsuccessful state and then
        swallowed the corresponding control exception.
        """

        self.ever_executed = True
        data.is_find = self.is_find

        text_repr = None
        if self.settings.deadline is None:
            test = self.test
        else:

            @proxies(self.test)
            def test(*args, **kwargs):
                self.__test_runtime = None
                initial_draws = len(data.draw_times)
                start = time.perf_counter()
                result = self.test(*args, **kwargs)
                finish = time.perf_counter()
                internal_draw_time = sum(data.draw_times[initial_draws:])
                runtime = datetime.timedelta(
                    seconds=finish - start - internal_draw_time
                )
                self.__test_runtime = runtime
                current_deadline = self.settings.deadline
                if not is_final:
                    current_deadline = (current_deadline // 4) * 5
                if runtime >= current_deadline:
                    raise DeadlineExceeded(runtime, self.settings.deadline)
                return result

        def run(data):
            # Set up dynamic context needed by a single test run.
            if self.stuff.selfy is not None:
                data.hypothesis_runner = self.stuff.selfy
            # Generate all arguments to the test function.
            args = self.stuff.args
            kwargs = dict(self.stuff.kwargs)
            if example_kwargs is None:
                a, kw, argslices = context.prep_args_kwargs_from_strategies(
                    (), self.stuff.given_kwargs
                )
                assert not a, "strategies all moved to kwargs by now"
            else:
                kw = example_kwargs
                argslices = {}
            kwargs.update(kw)
            if expected_failure is not None:
                nonlocal text_repr
                text_repr = repr_call(test, args, kwargs)
                if text_repr in self.xfail_example_reprs:
                    warnings.warn(
                        f"We generated {text_repr}, which seems identical "
                        "to one of your `@example(...).xfail()` cases.  "
                        "Revise the strategy to avoid this overlap?",
                        HypothesisWarning,
                        # Checked in test_generating_xfailed_examples_warns!
                        stacklevel=6,
                    )

            if print_example or current_verbosity() >= Verbosity.verbose:
                printer = RepresentationPrinter(context=context)
                if print_example:
                    printer.text("Falsifying example:")
                else:
                    printer.text("Trying example:")

                if self.print_given_args:
                    printer.text(" ")
                    printer.repr_call(
                        test.__name__,
                        args,
                        kwargs,
                        force_split=True,
                        arg_slices=argslices,
                        leading_comment=(
                            "# " + context.data.slice_comments[(0, 0)]
                            if (0, 0) in context.data.slice_comments
                            else None
                        ),
                    )
                report(printer.getvalue())
            return test(*args, **kwargs)

        # self.test_runner can include the execute_example method, or setup/teardown
        # _example, so it's important to get the PRNG and build context in place first.
        with local_settings(self.settings):
            with deterministic_PRNG():
                with BuildContext(data, is_final=is_final) as context:
                    # Run the test function once, via the executor hook.
                    # In most cases this will delegate straight to `run(data)`.
                    result = self.test_runner(data, run)

        # If a failure was expected, it should have been raised already, so
        # instead raise an appropriate diagnostic error.
        if expected_failure is not None:
            exception, traceback = expected_failure
            if (
                isinstance(exception, DeadlineExceeded)
                and self.__test_runtime is not None
            ):
                report(
                    "Unreliable test timings! On an initial run, this "
                    "test took %.2fms, which exceeded the deadline of "
                    "%.2fms, but on a subsequent run it took %.2f ms, "
                    "which did not. If you expect this sort of "
                    "variability in your test timings, consider turning "
                    "deadlines off for this test by setting deadline=None."
                    % (
                        exception.runtime.total_seconds() * 1000,
                        self.settings.deadline.total_seconds() * 1000,
                        self.__test_runtime.total_seconds() * 1000,
                    )
                )
            else:
                report("Failed to reproduce exception. Expected: \n" + traceback)
>           raise Flaky(
                f"Hypothesis {text_repr} produces unreliable results: "
                "Falsified on the first call but did not on a subsequent one"
            ) from exception
E           hypothesis.errors.Flaky: Hypothesis test_datasaver_1d(experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           -------------------------------------------------------------------------------------------------
E           1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E           2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E           3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E           4-results-4-dummy_dac_ch1,dummy_dmm_v1-[80](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:81)
E           5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E           6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           9-results-9-dummy_dac_ch1,dummy_dmm_v1-85
E           10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E           11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E           12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E           13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E           14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E           15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E           16-results-16-dummy_dac_ch1,dummy_dmm_v1-70, DAC=<DummyInstrument: dummy_dac>, DMM=<DummyInstrument: dummy_dmm>, caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f[83](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:84)0>, n_points=70) produces unreliable results: Falsified on the first call but did not on a subsequent one
E           Falsifying example: test_datasaver_1d(
E               experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E               -------------------------------------------------------------------------------------------------
E               1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E               2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E               3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E               4-results-4-dummy_dac_ch1,dummy_dmm_v1-80
E               5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E               6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E               7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E               8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E               9-results-9-dummy_dac_ch1,dummy_dmm_v1-[85](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:86)
E               10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E               11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E               12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E               13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E               14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E               15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E               16-results-16-dummy_dac_ch1,dummy_dmm_v1-70,
E               DAC=<DummyInstrument: dummy_dac>,
E               DMM=<DummyInstrument: dummy_dmm>,
E               caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>,
E               n_points=70,
E           )
E           Failed to reproduce exception. Expected: 
E           experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           --------------------...lts-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
E           caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
E           n_points = 70
E           
E               @given(n_points=hst.integers(min_value=1, max_value=100))
E               @example(n_points=5)
E               @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
E               def test_datasaver_1d(
E                   experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
E               ) -> None:
E                   meas = Measurement()
E                   meas.register_parameter(DAC.ch1)
E                   meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))
E               
E                   n_points_expected = 5
E               
E                   meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})
E               
E                   with meas.run() as datasaver:
E               
E                       for set_v in np.linspace(0, 1, n_points):
E                           DAC.ch1()
E                           datasaver.add_result((DAC.ch1, set_v),
E                                                (DMM.v1, DMM.v1()))
E               
E                   ds = datasaver.dataset
E                   caplog.clear()
E                   data = ds.get_parameter_data()
E               
E                   for dataarray in data[DMM.v1.full_name].values():
E                       assert dataarray.shape == (n_points,)
E               
E                   if n_points == n_points_expected:
E                       assert len(caplog.record_tuples) == 0
E                   elif n_points > n_points_expected:
E           >           assert len(caplog.record_tuples) == 2
E           E           AssertionError: assert 4 == 2
E           E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E           E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples
E           
E           tests/dataset/measurement/test_shapes.py:42: AssertionError
E           
E           
E           You can reproduce this example by temporarily adding @reproduce_failure('6.91.0', b'AEU=') as a decorator on your test case

/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/hypothesis/core.py:[89](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:90)2: Flaky
----------------------------- Captured stdout call -----------------------------saver_1d _______________________________
[gw0] linux -- Python 3.12.0 /opt/hostedtoolcache/Python/3.12.0/x64/bin/python

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-[70](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:71)
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
n_points = 70

    @given(n_points=hst.integers(min_value=1, max_value=100))
    @example(n_points=5)
    @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
    def test_datasaver_1d(
        experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
    ) -> None:
        meas = Measurement()
        meas.register_parameter(DAC.ch1)
        meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))

        n_points_expected = 5

        meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})

        with meas.run() as datasaver:

            for set_v in np.linspace(0, 1, n_points):
                DAC.ch1()
                datasaver.add_result((DAC.ch1, set_v),
                                     (DMM.v1, DMM.v1()))

        ds = datasaver.dataset
        caplog.clear()
        data = ds.get_parameter_data()

        for dataarray in data[DMM.v1.full_name].values():
            assert dataarray.shape == (n_points,)

        if n_points == n_points_expected:
            assert len(caplog.record_tuples) == 0
        elif n_points > n_points_expected:
>           assert len(caplog.record_tuples) == 2
E           AssertionError: assert 4 == 2
E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples

tests/dataset/measurement/test_shapes.py:42: AssertionError

The above exception was the direct cause of the following exception:

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-70
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>

    @given(n_points=hst.integers(min_value=1, max_value=100))
>   @example(n_points=5)

tests/dataset/measurement/test_shapes.py:12: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <hypothesis.core.StateForActualGivenExecution object at 0x7fe10b2ef260>
data = ConjectureData(VALID, 1 bytes, frozen)

    def execute_once(
        self,
        data,
        *,
        print_example=False,
        is_final=False,
        expected_failure=None,
        example_kwargs=None,
    ):
        """Run the test function once, using ``data`` as input.

        If the test raises an exception, it will propagate through to the
        caller of this method. Depending on its type, this could represent
        an ordinary test failure, or a fatal error, or a control exception.

        If this method returns normally, the test might have passed, or
        it might have placed ``data`` in an unsuccessful state and then
        swallowed the corresponding control exception.
        """

        self.ever_executed = True
        data.is_find = self.is_find

        text_repr = None
        if self.settings.deadline is None:
            test = self.test
        else:

            @proxies(self.test)
            def test(*args, **kwargs):
                self.__test_runtime = None
                initial_draws = len(data.draw_times)
                start = time.perf_counter()
                result = self.test(*args, **kwargs)
                finish = time.perf_counter()
                internal_draw_time = sum(data.draw_times[initial_draws:])
                runtime = datetime.timedelta(
                    seconds=finish - start - internal_draw_time
                )
                self.__test_runtime = runtime
                current_deadline = self.settings.deadline
                if not is_final:
                    current_deadline = (current_deadline // 4) * 5
                if runtime >= current_deadline:
                    raise DeadlineExceeded(runtime, self.settings.deadline)
                return result

        def run(data):
            # Set up dynamic context needed by a single test run.
            if self.stuff.selfy is not None:
                data.hypothesis_runner = self.stuff.selfy
            # Generate all arguments to the test function.
            args = self.stuff.args
            kwargs = dict(self.stuff.kwargs)
            if example_kwargs is None:
                a, kw, argslices = context.prep_args_kwargs_from_strategies(
                    (), self.stuff.given_kwargs
                )
                assert not a, "strategies all moved to kwargs by now"
            else:
                kw = example_kwargs
                argslices = {}
            kwargs.update(kw)
            if expected_failure is not None:
                nonlocal text_repr
                text_repr = repr_call(test, args, kwargs)
                if text_repr in self.xfail_example_reprs:
                    warnings.warn(
                        f"We generated {text_repr}, which seems identical "
                        "to one of your `@example(...).xfail()` cases.  "
                        "Revise the strategy to avoid this overlap?",
                        HypothesisWarning,
                        # Checked in test_generating_xfailed_examples_warns!
                        stacklevel=6,
                    )

            if print_example or current_verbosity() >= Verbosity.verbose:
                printer = RepresentationPrinter(context=context)
                if print_example:
                    printer.text("Falsifying example:")
                else:
                    printer.text("Trying example:")

                if self.print_given_args:
                    printer.text(" ")
                    printer.repr_call(
                        test.__name__,
                        args,
                        kwargs,
                        force_split=True,
                        arg_slices=argslices,
                        leading_comment=(
                            "# " + context.data.slice_comments[(0, 0)]
                            if (0, 0) in context.data.slice_comments
                            else None
                        ),
                    )
                report(printer.getvalue())
            return test(*args, **kwargs)

        # self.test_runner can include the execute_example method, or setup/teardown
        # _example, so it's important to get the PRNG and build context in place first.
        with local_settings(self.settings):
            with deterministic_PRNG():
                with BuildContext(data, is_final=is_final) as context:
                    # Run the test function once, via the executor hook.
                    # In most cases this will delegate straight to `run(data)`.
                    result = self.test_runner(data, run)

        # If a failure was expected, it should have been raised already, so
        # instead raise an appropriate diagnostic error.
        if expected_failure is not None:
            exception, traceback = expected_failure
            if (
                isinstance(exception, DeadlineExceeded)
                and self.__test_runtime is not None
            ):
                report(
                    "Unreliable test timings! On an initial run, this "
                    "test took %.2fms, which exceeded the deadline of "
                    "%.2fms, but on a subsequent run it took %.2f ms, "
                    "which did not. If you expect this sort of "
                    "variability in your test timings, consider turning "
                    "deadlines off for this test by setting deadline=None."
                    % (
                        exception.runtime.total_seconds() * 1000,
                        self.settings.deadline.total_seconds() * 1000,
                        self.__test_runtime.total_seconds() * 1000,
                    )
                )
            else:
                report("Failed to reproduce exception. Expected: \n" + traceback)
>           raise Flaky(
                f"Hypothesis {text_repr} produces unreliable results: "
                "Falsified on the first call but did not on a subsequent one"
            ) from exception
E           hypothesis.errors.Flaky: Hypothesis test_datasaver_1d(experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           -------------------------------------------------------------------------------------------------
E           1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E           2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E           3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E           4-results-4-dummy_dac_ch1,dummy_dmm_v1-[80](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:81)
E           5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E           6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           9-results-9-dummy_dac_ch1,dummy_dmm_v1-85
E           10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E           11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E           12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E           13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E           14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E           15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E           16-results-16-dummy_dac_ch1,dummy_dmm_v1-70, DAC=<DummyInstrument: dummy_dac>, DMM=<DummyInstrument: dummy_dmm>, caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f[83](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:84)0>, n_points=70) produces unreliable results: Falsified on the first call but did not on a subsequent one
E           Falsifying example: test_datasaver_1d(
E               experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E               -------------------------------------------------------------------------------------------------
E               1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E               2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E               3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E               4-results-4-dummy_dac_ch1,dummy_dmm_v1-80
E               5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E               6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E               7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E               8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E               9-results-9-dummy_dac_ch1,dummy_dmm_v1-[85](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:86)
E               10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E               11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E               12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E               13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E               14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E               15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E               16-results-16-dummy_dac_ch1,dummy_dmm_v1-70,
E               DAC=<DummyInstrument: dummy_dac>,
E               DMM=<DummyInstrument: dummy_dmm>,
E               caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>,
E               n_points=70,
E           )
E           Failed to reproduce exception. Expected: 
E           experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           --------------------...lts-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
E           caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
E           n_points = 70
E           
E               @given(n_points=hst.integers(min_value=1, max_value=100))
E               @example(n_points=5)
E               @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
E               def test_datasaver_1d(
E                   experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
E               ) -> None:
E                   meas = Measurement()
E                   meas.register_parameter(DAC.ch1)
E                   meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))
E               
E                   n_points_expected = 5
E               
E                   meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})
E               
E                   with meas.run() as datasaver:
E               
E                       for set_v in np.linspace(0, 1, n_points):
E                           DAC.ch1()
E                           datasaver.add_result((DAC.ch1, set_v),
E                                                (DMM.v1, DMM.v1()))
E               
E                   ds = datasaver.dataset
E                   caplog.clear()
E                   data = ds.get_parameter_data()
E               
E                   for dataarray in data[DMM.v1.full_name].values():
E                       assert dataarray.shape == (n_points,)
E               
E                   if n_points == n_points_expected:
E                       assert len(caplog.record_tuples) == 0
E                   elif n_points > n_points_expected:
E           >           assert len(caplog.record_tuples) == 2
E           E           AssertionError: assert 4 == 2
E           E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E           E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples
E           
E           tests/dataset/measurement/test_shapes.py:42: AssertionError
E           
E           
E           You can reproduce this example by temporarily adding @reproduce_failure('6.91.0', b'AEU=') as a decorator on your test case

/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/hypothesis/core.py:[89](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:90)2: Flaky
----------------------------- Captured stdout call -----------------------------______________________________ test_datasaver_1d _______________________________
[gw0] linux -- Python 3.12.0 /opt/hostedtoolcache/Python/3.12.0/x64/bin/python

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-[70](https://github.com/QCoDeS/Qcodes/actions/runs/7042049661/job/19165580590?pr=5564#step:9:71)
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
n_points = 70

    @given(n_points=hst.integers(min_value=1, max_value=100))
    @example(n_points=5)
    @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
    def test_datasaver_1d(
        experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
    ) -> None:
        meas = Measurement()
        meas.register_parameter(DAC.ch1)
        meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))

        n_points_expected = 5

        meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})

        with meas.run() as datasaver:

            for set_v in np.linspace(0, 1, n_points):
                DAC.ch1()
                datasaver.add_result((DAC.ch1, set_v),
                                     (DMM.v1, DMM.v1()))

        ds = datasaver.dataset
        caplog.clear()
        data = ds.get_parameter_data()

        for dataarray in data[DMM.v1.full_name].values():
            assert dataarray.shape == (n_points,)

        if n_points == n_points_expected:
            assert len(caplog.record_tuples) == 0
        elif n_points > n_points_expected:
>           assert len(caplog.record_tuples) == 2
E           AssertionError: assert 4 == 2
E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples

tests/dataset/measurement/test_shapes.py:42: AssertionError

The above exception was the direct cause of the following exception:

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-70
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>

    @given(n_points=hst.integers(min_value=1, max_value=100))
>   @example(n_points=5)

tests/dataset/measurement/test_shapes.py:12: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <hypothesis.core.StateForActualGivenExecution object at 0x7fe10b2ef260>
data = ConjectureData(VALID, 1 bytes, frozen)

    def execute_once(
        self,
        data,
        *,
        print_example=False,
        is_final=False,
        expected_failure=None,
        example_kwargs=None,
    ):
        """Run the test function once, using ``data`` as input.

        If the test raises an exception, it will propagate through to the
        caller of this method. Depending on its type, this could represent
        an ordinary test failure, or a fatal error, or a control exception.

        If this method returns normally, the test might have passed, or
        it might have placed ``data`` in an unsuccessful state and then
        swallowed the corresponding control exception.
        """

        self.ever_executed = True
        data.is_find = self.is_find

        text_repr = None
        if self.settings.deadline is None:
            test = self.test
        else:

            @proxies(self.test)
            def test(*args, **kwargs):
                self.__test_runtime = None
                initial_draws = len(data.draw_times)
                start = time.perf_counter()
                result = self.test(*args, **kwargs)
                finish = time.perf_counter()
                internal_draw_time = sum(data.draw_times[initial_draws:])
                runtime = datetime.timedelta(
                    seconds=finish - start - internal_draw_time
                )
                self.__test_runtime = runtime
                current_deadline = self.settings.deadline
                if not is_final:
                    current_deadline = (current_deadline // 4) * 5
                if runtime >= current_deadline:
                    raise DeadlineExceeded(runtime, self.settings.deadline)
                return result

        def run(data):
            # Set up dynamic context needed by a single test run.
            if self.stuff.selfy is not None:
                data.hypothesis_runner = self.stuff.selfy
            # Generate all arguments to the test function.
            args = self.stuff.args
            kwargs = dict(self.stuff.kwargs)
            if example_kwargs is None:
                a, kw, argslices = context.prep_args_kwargs_from_strategies(
                    (), self.stuff.given_kwargs
                )
                assert not a, "strategies all moved to kwargs by now"
            else:
                kw = example_kwargs
                argslices = {}
            kwargs.update(kw)
            if expected_failure is not None:
                nonlocal text_repr
                text_repr = repr_call(test, args, kwargs)
                if text_repr in self.xfail_example_reprs:
                    warnings.warn(
                        f"We generated {text_repr}, which seems identical "
                        "to one of your `@example(...).xfail()` cases.  "
                        "Revise the strategy to avoid this overlap?",
                        HypothesisWarning,
                        # Checked in test_generating_xfailed_examples_warns!
                        stacklevel=6,
                    )

            if print_example or current_verbosity() >= Verbosity.verbose:
                printer = RepresentationPrinter(context=context)
                if print_example:
                    printer.text("Falsifying example:")
                else:
                    printer.text("Trying example:")

                if self.print_given_args:
                    printer.text(" ")
                    printer.repr_call(
                        test.__name__,
                        args,
                        kwargs,
                        force_split=True,
                        arg_slices=argslices,
                        leading_comment=(
                            "# " + context.data.slice_comments[(0, 0)]
                            if (0, 0) in context.data.slice_comments
                            else None
                        ),
                    )
                report(printer.getvalue())
            return test(*args, **kwargs)

        # self.test_runner can include the execute_example method, or setup/teardown
        # _example, so it's important to get the PRNG and build context in place first.
        with local_settings(self.settings):
            with deterministic_PRNG():
                with BuildContext(data, is_final=is_final) as context:
                    # Run the test function once, via the executor hook.
                    # In most cases this will delegate straight to `run(data)`.
                    result = self.test_runner(data, run)

        # If a failure was expected, it should have been raised already, so
        # instead raise an appropriate diagnostic error.
        if expected_failure is not None:
            exception, traceback = expected_failure
            if (
                isinstance(exception, DeadlineExceeded)
                and self.__test_runtime is not None
            ):
                report(
                    "Unreliable test timings! On an initial run, this "
                    "test took %.2fms, which exceeded the deadline of "
                    "%.2fms, but on a subsequent run it took %.2f ms, "
                    "which did not. If you expect this sort of "
                    "variability in your test timings, consider turning "
                    "deadlines off for this test by setting deadline=None."
                    % (
                        exception.runtime.total_seconds() * 1000,
                        self.settings.deadline.total_seconds() * 1000,
                        self.__test_runtime.total_seconds() * 1000,
                    )
                )
            else:
                report("Failed to reproduce exception. Expected: \n" + traceback)
>           raise Flaky(
                f"Hypothesis {text_repr} produces unreliable results: "
                "Falsified on the first call but did not on a subsequent one"
            ) from exception
E           hypothesis.errors.Flaky: Hypothesis test_datasaver_1d(experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           -------------------------------------------------------------------------------------------------
E           1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E           2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E           3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E           4-results-4-dummy_dac_ch1,dummy_dmm_v1-80
E           5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E           6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           9-results-9-dummy_dac_ch1,dummy_dmm_v1-85
E           10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E           11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E           12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E           13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E           14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E           15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E           16-results-16-dummy_dac_ch1,dummy_dmm_v1-70, DAC=<DummyInstrument: dummy_dac>, DMM=<DummyInstrument: dummy_dmm>, caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>, n_points=70) produces unreliable results: Falsified on the first call but did not on a subsequent one
E           Falsifying example: test_datasaver_1d(
E               experiment=test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E               -------------------------------------------------------------------------------------------------
E               1-results-1-dummy_dac_ch1,dummy_dmm_v1-5
E               2-results-2-dummy_dac_ch1,dummy_dmm_v1-1
E               3-results-3-dummy_dac_ch1,dummy_dmm_v1-56
E               4-results-4-dummy_dac_ch1,dummy_dmm_v1-80
E               5-results-5-dummy_dac_ch1,dummy_dmm_v1-95
E               6-results-6-dummy_dac_ch1,dummy_dmm_v1-10
E               7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E               8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E               9-results-9-dummy_dac_ch1,dummy_dmm_v1-85
E               10-results-10-dummy_dac_ch1,dummy_dmm_v1-85
E               11-results-11-dummy_dac_ch1,dummy_dmm_v1-58
E               12-results-12-dummy_dac_ch1,dummy_dmm_v1-58
E               13-results-13-dummy_dac_ch1,dummy_dmm_v1-45
E               14-results-14-dummy_dac_ch1,dummy_dmm_v1-15
E               15-results-15-dummy_dac_ch1,dummy_dmm_v1-33
E               16-results-16-dummy_dac_ch1,dummy_dmm_v1-70,
E               DAC=<DummyInstrument: dummy_dac>,
E               DMM=<DummyInstrument: dummy_dmm>,
E               caplog=<_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>,
E               n_points=70,
E           )
E           Failed to reproduce exception. Expected: 
E           experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
E           --------------------...lts-6-dummy_dac_ch1,dummy_dmm_v1-10
E           7-results-7-dummy_dac_ch1,dummy_dmm_v1-66
E           8-results-8-dummy_dac_ch1,dummy_dmm_v1-70
E           DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
E           caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
E           n_points = 70
E           
E               @given(n_points=hst.integers(min_value=1, max_value=100))
E               @example(n_points=5)
E               @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
E               def test_datasaver_1d(
E                   experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
E               ) -> None:
E                   meas = Measurement()
E                   meas.register_parameter(DAC.ch1)
E                   meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))
E               
E                   n_points_expected = 5
E               
E                   meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})
E               
E                   with meas.run() as datasaver:
E               
E                       for set_v in np.linspace(0, 1, n_points):
E                           DAC.ch1()
E                           datasaver.add_result((DAC.ch1, set_v),
E                                                (DMM.v1, DMM.v1()))
E               
E                   ds = datasaver.dataset
E                   caplog.clear()
E                   data = ds.get_parameter_data()
E               
E                   for dataarray in data[DMM.v1.full_name].values():
E                       assert dataarray.shape == (n_points,)
E               
E                   if n_points == n_points_expected:
E                       assert len(caplog.record_tuples) == 0
E                   elif n_points > n_points_expected:
E           >           assert len(caplog.record_tuples) == 2
E           E           AssertionError: assert 4 == 2
E           E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E           E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples
E           
E           tests/dataset/measurement/test_shapes.py:42: AssertionError
E           
E           
E           You can reproduce this example by temporarily adding @reproduce_failure('6.91.0', b'AEU=') as a decorator on your test case

/opt/hostedtoolcache/Python/3.12.0/x64/lib/python3.12/site-packages/hypothesis/core.py:892: Flaky
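As the log notes, the failing draw can be pinned while debugging. A minimal sketch of applying that hint, assuming the same Hypothesis version (6.91.0) is installed and using a stripped-down stand-in for the real test's fixtures:

```python
# Sketch only: temporarily pin the failing example from the log above.
# reproduce_failure() is version-locked, so this must run under
# Hypothesis 6.91.0 and should be removed again before committing.
import hypothesis.strategies as hst
from hypothesis import given, reproduce_failure


@reproduce_failure("6.91.0", b"AEU=")
@given(n_points=hst.integers(min_value=1, max_value=100))
def test_datasaver_1d(n_points):
    ...  # body as in tests/dataset/measurement/test_shapes.py
```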
----------------------------- Captured stdout call -----------------------------
______________________________ test_datasaver_1d _______________________________
[gw0] linux -- Python 3.12.0 /opt/hostedtoolcache/Python/3.12.0/x64/bin/python

experiment = test-experiment#test-sample#1@/tmp/pytest-of-runner/pytest-0/popen-gw0/test_datasaver_1d0/temp.db
--------------------...5-dummy_dac_ch1,dummy_dmm_v1-33
16-results-16-dummy_dac_ch1,dummy_dmm_v1-70
17-results-17-dummy_dac_ch1,dummy_dmm_v1-70
DAC = <DummyInstrument: dummy_dac>, DMM = <DummyInstrument: dummy_dmm>
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>
n_points = 70

    @given(n_points=hst.integers(min_value=1, max_value=100))
    @example(n_points=5)
    @settings(deadline=None, suppress_health_check=(HealthCheck.function_scoped_fixture,))
    def test_datasaver_1d(
        experiment, DAC, DMM, caplog: LogCaptureFixture, n_points
    ) -> None:
        meas = Measurement()
        meas.register_parameter(DAC.ch1)
        meas.register_parameter(DMM.v1, setpoints=(DAC.ch1,))

        n_points_expected = 5

        meas.set_shapes({DMM.v1.full_name: (n_points_expected,)})

        with meas.run() as datasaver:

            for set_v in np.linspace(0, 1, n_points):
                DAC.ch1()
                datasaver.add_result((DAC.ch1, set_v),
                                     (DMM.v1, DMM.v1()))

        ds = datasaver.dataset
        caplog.clear()
        data = ds.get_parameter_data()

        for dataarray in data[DMM.v1.full_name].values():
            assert dataarray.shape == (n_points,)

        if n_points == n_points_expected:
            assert len(caplog.record_tuples) == 0
        elif n_points > n_points_expected:
>           assert len(caplog.record_tuples) == 2
E           AssertionError: assert 4 == 2
E            +  where 4 = len([('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')])
E            +    where [('qcodes.instrument.visa', 20, 'Closing VISA handle to n9030B_sim as there are no non weak references to the instrume...a shape for dummy_dac_ch1 in dataset dummy_dmm_v1 from metadata when loading but found inconsistent lengths 70 and 5')] = <_pytest.logging.LogCaptureFixture object at 0x7fe13262f830>.record_tuples

tests/dataset/measurement/test_shapes.py:42: AssertionError

The above exception was the direct cause of the following exception:

[... the captured stdout then repeats the same Flaky traceback shown above, verbatim, and is omitted here ...]
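Until the leak itself is fixed, the test's assertion could be made insensitive to records emitted by another test's instruments. A minimal sketch, assuming the shape warnings are the only records of interest and that the leaked-instrument messages all come from the `qcodes.instrument.visa` logger, as in the logs above; this filtering is illustrative, not necessarily the fix adopted in the linked PRs:

```python
# Sketch only: inside test_datasaver_1d, ignore "Closing VISA handle ..."
# records that a leaked instrument (awg_sim, n9030B_sim, ...) emits when
# it is finally garbage-collected during an unrelated test.
# caplog is the pytest LogCaptureFixture already used by the test.
relevant_records = [
    rec
    for rec in caplog.record_tuples
    if rec[0] != "qcodes.instrument.visa"
]
assert len(relevant_records) == 2
```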
ghost commented 9 months ago

We've seen similar failures in the unit tests (full log):

=========================== short test summary info ============================
FAILED tests/drivers/test_keithley_s46.py::test_query_close_once_at_init - assert 2 == 1
FAILED tests/parameter/test_parameter_ramp.py::test_step_ramp - assert 4 == 1

And in another run (full log):

=========================== short test summary info ============================
FAILED tests/parameter/test_parameter_ramp.py::test_step_ramp - assert 2 == 1
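All of these inflated counts point at instruments that an earlier test created and never closed. One way to contain that is an autouse fixture in a shared conftest.py that tears down stragglers after every test, sketched below. `Instrument.close_all()` is an existing QCoDeS classmethod; wiring it into an autouse fixture like this is only an illustration, not necessarily the fix that was merged.

```python
# conftest.py (sketch): ensure no instrument outlives its test, so its
# eventual garbage collection cannot log into another test's caplog.
import pytest

from qcodes.instrument import Instrument


@pytest.fixture(autouse=True)
def no_leaked_instruments():
    yield
    # Runs after each test; closes anything the test forgot, including
    # simulated VISA instruments like awg_sim / n9030B_sim.
    Instrument.close_all()
```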