TileDB-Inc / conda-forge-nightly-controller

Centralized nightly CI builds for TileDB conda feedstocks

Nightly feedstock build failed #35

Closed github-actions[bot] closed 6 months ago

github-actions[bot] commented 6 months ago

Nightly feedstock build failure for TileDB-Py Feedstock Testing at https://dev.azure.com/TileDB-Inc/CI/_build?definitionId=5&_a=summary

jdblischak commented 6 months ago

Looks like a spurious test failure on one of the linux-aarch64 builds: a Hypothesis deadline was exceeded on the first run but not on the rerun. I restarted the build.

=================================== FAILURES ===================================
_________ TestMultiIndexPropertySparse.test_multi_index_two_way_query __________

self = <tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>
order = 'C', ranges = [(24, 52)]
sparse_array_1d = '/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn'

    @given(
>       order=st.sampled_from(["C", "F", "U"]),
        ranges=st.lists(bounded_ntuple(length=2, min_value=-100, max_value=100)),
    )

tiledb/tests/test_multi_index-hp.py:85: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>, 'C', [(24, 52)], '/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn')
kwargs = {}, arg_drawtime = 0.007683432000249013, initial_draws = 2
start = 1222.930327267, result = None, finish = 1223.206991044
internal_draw_time = 0, runtime = datetime.timedelta(microseconds=276664)
current_deadline = datetime.timedelta(microseconds=250000)

    @proxies(self.test)
    def test(*args, **kwargs):
        arg_drawtime = sum(data.draw_times)
        initial_draws = len(data.draw_times)
        start = time.perf_counter()
        try:
            result = self.test(*args, **kwargs)
        finally:
            finish = time.perf_counter()
            internal_draw_time = sum(data.draw_times[initial_draws:])
            runtime = datetime.timedelta(
                seconds=finish - start - internal_draw_time
            )
            self._timing_features = {
                "time_running_test": finish - start - internal_draw_time,
                "time_drawing_args": arg_drawtime,
                "time_interactive_draws": internal_draw_time,
            }

        current_deadline = self.settings.deadline
        if not is_final:
            current_deadline = (current_deadline // 4) * 5
        if runtime >= current_deadline:
>           raise DeadlineExceeded(runtime, self.settings.deadline)
E           hypothesis.errors.DeadlineExceeded: Test took 276.66ms, which exceeds the deadline of 200.00ms

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh/lib/python3.9/site-packages/hypothesis/core.py:845: DeadlineExceeded

The above exception was the direct cause of the following exception:

self = <tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>
sparse_array_1d = '/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn'

    @given(
>       order=st.sampled_from(["C", "F", "U"]),
        ranges=st.lists(bounded_ntuple(length=2, min_value=-100, max_value=100)),
    )

tiledb/tests/test_multi_index-hp.py:85: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <hypothesis.core.StateForActualGivenExecution object at 0x40001d193640>
data = ConjectureData(VALID, 9 bytes, frozen)

    def execute_once(
        self,
        data,
        *,
        print_example=False,
        is_final=False,
        expected_failure=None,
        example_kwargs=None,
    ):
        """Run the test function once, using ``data`` as input.

        If the test raises an exception, it will propagate through to the
        caller of this method. Depending on its type, this could represent
        an ordinary test failure, or a fatal error, or a control exception.

        If this method returns normally, the test might have passed, or
        it might have placed ``data`` in an unsuccessful state and then
        swallowed the corresponding control exception.
        """

        self.ever_executed = True
        data.is_find = self.is_find

        self._string_repr = ""
        text_repr = None
        if self.settings.deadline is None:
            test = self.test
        else:

            @proxies(self.test)
            def test(*args, **kwargs):
                arg_drawtime = sum(data.draw_times)
                initial_draws = len(data.draw_times)
                start = time.perf_counter()
                try:
                    result = self.test(*args, **kwargs)
                finally:
                    finish = time.perf_counter()
                    internal_draw_time = sum(data.draw_times[initial_draws:])
                    runtime = datetime.timedelta(
                        seconds=finish - start - internal_draw_time
                    )
                    self._timing_features = {
                        "time_running_test": finish - start - internal_draw_time,
                        "time_drawing_args": arg_drawtime,
                        "time_interactive_draws": internal_draw_time,
                    }

                current_deadline = self.settings.deadline
                if not is_final:
                    current_deadline = (current_deadline // 4) * 5
                if runtime >= current_deadline:
                    raise DeadlineExceeded(runtime, self.settings.deadline)
                return result

        def run(data):
            # Set up dynamic context needed by a single test run.
            if self.stuff.selfy is not None:
                data.hypothesis_runner = self.stuff.selfy
            # Generate all arguments to the test function.
            args = self.stuff.args
            kwargs = dict(self.stuff.kwargs)
            if example_kwargs is None:
                a, kw, argslices = context.prep_args_kwargs_from_strategies(
                    (), self.stuff.given_kwargs
                )
                assert not a, "strategies all moved to kwargs by now"
            else:
                kw = example_kwargs
                argslices = {}
            kwargs.update(kw)
            if expected_failure is not None:
                nonlocal text_repr
                text_repr = repr_call(test, args, kwargs)
                if text_repr in self.xfail_example_reprs:
                    warnings.warn(
                        f"We generated {text_repr}, which seems identical "
                        "to one of your `@example(...).xfail()` cases.  "
                        "Revise the strategy to avoid this overlap?",
                        HypothesisWarning,
                        # Checked in test_generating_xfailed_examples_warns!
                        stacklevel=6,
                    )

            if print_example or current_verbosity() >= Verbosity.verbose:
                printer = RepresentationPrinter(context=context)
                if print_example:
                    printer.text("Falsifying example:")
                else:
                    printer.text("Trying example:")

                if self.print_given_args:
                    printer.text(" ")
                    printer.repr_call(
                        test.__name__,
                        args,
                        kwargs,
                        force_split=True,
                        arg_slices=argslices,
                        leading_comment=(
                            "# " + context.data.slice_comments[(0, 0)]
                            if (0, 0) in context.data.slice_comments
                            else None
                        ),
                    )
                report(printer.getvalue())

            if TESTCASE_CALLBACKS:
                printer = RepresentationPrinter(context=context)
                printer.repr_call(
                    test.__name__,
                    args,
                    kwargs,
                    force_split=True,
                    arg_slices=argslices,
                    leading_comment=(
                        "# " + context.data.slice_comments[(0, 0)]
                        if (0, 0) in context.data.slice_comments
                        else None
                    ),
                )
                self._string_repr = printer.getvalue()
                self._jsonable_arguments = {
                    **dict(enumerate(map(to_jsonable, args))),
                    **{k: to_jsonable(v) for k, v in kwargs.items()},
                }
            return test(*args, **kwargs)

        # self.test_runner can include the execute_example method, or setup/teardown
        # _example, so it's important to get the PRNG and build context in place first.
        with local_settings(self.settings):
            with deterministic_PRNG():
                with BuildContext(data, is_final=is_final) as context:
                    # Run the test function once, via the executor hook.
                    # In most cases this will delegate straight to `run(data)`.
                    result = self.test_runner(data, run)

        # If a failure was expected, it should have been raised already, so
        # instead raise an appropriate diagnostic error.
        if expected_failure is not None:
            exception, traceback = expected_failure
            if isinstance(exception, DeadlineExceeded) and (
                runtime_secs := self._timing_features.get("time_running_test")
            ):
                report(
                    "Unreliable test timings! On an initial run, this "
                    "test took %.2fms, which exceeded the deadline of "
                    "%.2fms, but on a subsequent run it took %.2f ms, "
                    "which did not. If you expect this sort of "
                    "variability in your test timings, consider turning "
                    "deadlines off for this test by setting deadline=None."
                    % (
                        exception.runtime.total_seconds() * 1000,
                        self.settings.deadline.total_seconds() * 1000,
                        runtime_secs * 1000,
                    )
                )
            else:
                report("Failed to reproduce exception. Expected: \n" + traceback)
>           raise Flaky(
                f"Hypothesis {text_repr} produces unreliable results: "
                "Falsified on the first call but did not on a subsequent one"
            ) from exception
E           hypothesis.errors.Flaky: Hypothesis test_multi_index_two_way_query(self=<tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>, order='C', ranges=[(24, 52)], sparse_array_1d='/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn') produces unreliable results: Falsified on the first call but did not on a subsequent one
E           Falsifying example: test_multi_index_two_way_query(
E               self=<tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>,
E               sparse_array_1d='/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn',
E               order='C',
E               ranges=[(24, 52)],
E           )
E           Unreliable test timings! On an initial run, this test took 276.66ms, which exceeded the deadline of 200.00ms, but on a subsequent run it took 69.81 ms, which did not. If you expect this sort of variability in your test timings, consider turning deadlines off for this test by setting deadline=None.
E           
E           You can reproduce this example by temporarily adding @reproduce_failure('6.92.2', b'AAAEKgAYKgA0AA==') as a decorator on your test case

../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh/lib/python3.9/site-packages/hypothesis/core.py:952: Flaky
=========================== short test summary info ============================
FAILED tiledb/tests/test_multi_index-hp.py::TestMultiIndexPropertySparse::test_multi_index_two_way_query - hypothesis.errors.Flaky: Hypothesis test_multi_index_two_way_query(self=<tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>, order='C', ranges=[(24, 52)], sparse_array_1d='/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn') produces unreliable results: Falsified on the first call but did not on a subsequent one
Falsifying example: test_multi_index_two_way_query(
    self=<tests.test_multi_index-hp.TestMultiIndexPropertySparse object at 0x40001d24ef10>,
    sparse_array_1d='/tmp/tiledb-disktestcase8oyexir2/tmpd6d2swhn',
    order='C',
    ranges=[(24, 52)],
)
Unreliable test timings! On an initial run, this test took 276.66ms, which exceeded the deadline of 200.00ms, but on a subsequent run it took 69.81 ms, which did not. If you expect this sort of variability in your test timings, consider turning deadlines off for this test by setting deadline=None.

You can reproduce this example by temporarily adding @reproduce_failure('6.92.2', b'AAAEKgAYKgA0AA==') as a decorator on your test case
= 1 failed, 505 passed, 1 skipped, 7 xfailed, 1 xpassed in 1011.46s (0:16:51) ==
WARNING: Tests failed for tiledb-py-0.24.1.dev7.2024_01_03-py39h7e1a3fa_0.conda - moving package to /home/conda/feedstock_root/build_artifacts/broken
TESTS FAILED: tiledb-py-0.24.1.dev7.2024_01_03-py39h7e1a3fa_0.conda
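
The error message reports a 200.00ms deadline while the traceback shows `current_deadline = datetime.timedelta(microseconds=250000)`. The sketch below re-does that arithmetic using only the numbers from the log, to show why the first run counted as a failure and the rerun did not. The helper name `first_run_deadline` is hypothetical; it only mirrors the `(current_deadline // 4) * 5` line visible in the traceback and is not part of the Hypothesis API.

```python
from datetime import timedelta


def first_run_deadline(deadline: timedelta) -> timedelta:
    """Hypothetical helper mirroring `(current_deadline // 4) * 5` from the traceback."""
    return (deadline // 4) * 5


# Values copied from the log above.
configured = timedelta(milliseconds=200)       # "deadline of 200.00ms"
first_run = timedelta(microseconds=276664)     # "Test took 276.66ms"
rerun = timedelta(milliseconds=69.81)          # "on a subsequent run it took 69.81 ms"

# Non-final runs are checked against 125% of the configured deadline.
grace = first_run_deadline(configured)

exceeded_first = first_run >= grace            # first run is over even the relaxed limit
exceeded_rerun = rerun >= configured           # rerun is well under the configured limit

print(f"relaxed deadline: {grace.total_seconds() * 1000:.2f}ms")
print(f"first run exceeded: {exceeded_first}, rerun exceeded: {exceeded_rerun}")
```

Because the same example exceeded the limit once and then finished well inside it, Hypothesis reports `Flaky` rather than a plain `DeadlineExceeded`, which is consistent with a slow moment on the linux-aarch64 builder rather than a real regression.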

jdblischak commented 6 months ago

It passed on the second attempt.
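
If this failure recurs on slow builders, the log's own suggestion is to set `deadline=None`. A minimal sketch of what that could look like is below, assuming the `hypothesis` package is installed. The test function `test_order_is_known` is a hypothetical stand-in, not the real `test_multi_index_two_way_query`, and whether TileDB-Py should disable or merely relax the deadline is a judgment call this sketch does not settle.

```python
from datetime import timedelta

from hypothesis import given, settings, strategies as st


# Option 1: turn the per-example deadline off for this test only.
@settings(deadline=None)
@given(order=st.sampled_from(["C", "F", "U"]))
def test_order_is_known(order):
    # Hypothetical body; the real test queries a sparse TileDB array.
    assert order in {"C", "F", "U"}


# Option 2: keep a deadline but make it more generous than the 200ms default.
@settings(deadline=timedelta(milliseconds=1000))
@given(order=st.sampled_from(["C", "F", "U"]))
def test_order_is_known_relaxed(order):
    assert order in {"C", "F", "U"}


# Calling a @given-decorated function with no arguments runs the property.
test_order_is_known()
test_order_is_known_relaxed()
```

Disabling the deadline removes this class of spurious failure entirely, while a relaxed deadline still catches a test that becomes dramatically slower.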