pytest-dev / pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
https://pytest.org
MIT License
12.04k stars 2.67k forks source link

Support for notebook debugging #11108

Closed rhshadrach closed 8 months ago

rhshadrach commented 1 year ago

Alternate title: A Notebook Debugger's Apology

Summary

We don't expect pytest to support our use case, but wanted to share it for awareness and in case anyone else can benefit from our code.

Context

We work in data science involving machine learning and optimization. Our sciences often require several datasets, where even the minimal amount of data to run our code is somewhat complex. While stepping through a test with a debugger is sometimes helpful, often analyzing a failing test requires studying the state of many data structures, each with a lot of data, and this can be difficult to do in a debugger. This investigative work can involve analyzing data and producing visualizations, and it isn't unheard of that the code to analyze a given scenario exceeds 50 lines. In our opinion, this type of exploratory analysis is where notebooks (e.g. Jupyter) shine. As we're working on fixing the code, we're often running the same code we did to analyze the scenario in order to see if our fix is working as expected.

Problem

Taking a pytest test and getting it to run in the notebook can be laborious. When fixtures are not used it typically is not too bad, but fixtures often require tracking down where the fixture is defined and copying the code behind the fixture. At times, fixtures depend on various functions, and all of this needs to be moved to or imported in the notebook.

Our solution

We wrote the following hack to make it easier to debug a failing test. The code is below the usage. The test is from pandas (for demonstration purposes) and can be found here. This test was chosen because it is fairly simple, uses a fixture, and uses a parametrization.

import pandas as pd

test_locals = run_pytest_single(pd, 'tests/groupby/test_groupby.py::test_pass_args_kwargs_duplicate_columns[True]')

print(test_locals.keys())
# dict_keys(['tsframe', 'as_index', 'gb', 'warn', 'msg', 'res', 'ex_data', 'expected'])

The run_pytest_single takes the package in which the test resides as well as the path to the test. The path includes any parametrizations, and is exactly that produced by pytest's summary. We are often copying it from our CI. The code uses pytest internals to setup and run the test function, and extracts any local variables from the test as a dictionary. We also (accidentally) return some locals that pytest has injected. In a notebook, we can then use this dictionary to investigate the state of the objects involved in the test.

So far the code has worked well in our tests, but we only use parametrizations and fixtures. It would not surprise us at all if there are various valid tests that would break our code.

Code ``` import contextlib import os import sys import traceback from typing import Any, Iterator, no_type_check class _persistent_locals: """Decorator to persist local variables of a function. From SO: https://stackoverflow.com/q/9186395/10285434 """ def __init__(self, func): self._locals = {} self.func = func def __call__(self, *args, **kwargs): def tracer(frame, event, arg): if event == 'return': self._locals = frame.f_locals.copy() # tracer is activated on next call, return or exception sys.setprofile(tracer) try: # trace the function call res = self.func(*args, **kwargs) finally: # disable tracer and replace with old one sys.setprofile(None) return res @property def locals(self): return self._locals @contextlib.contextmanager def temp_setattr(obj: Any, attr: str, value: Any) -> Iterator[None]: """Temporarily set attribute on an object. Vendored from pandas. Args: obj: Object whose attribute will be modified. attr: Attribute to modify. value: Value to temporarily set attribute to. Yields: obj with modified attribute. """ old_value = getattr(obj, attr) setattr(obj, attr, value) try: yield obj finally: setattr(obj, attr, old_value) @no_type_check def run_pytest_single(package, test: str) -> dict[str, Any]: """Run a pytest test viastring and get the local variables of the test. Args: path: pytest test relative to the root directory of the project. Returns: Local variables from the test, regardless if the test succeeds or fails. """ from _pytest.config import _prepareconfig from _pytest.main import Session from _pytest.fixtures import FixtureRequest # We don't understand what finalizers do, and they seem to cause issues. So far # disabling them entirely has worked. def disable_finalizers(*args, **kwargs): pass FixtureRequest._schedule_finalizers = disable_finalizers base_path = os.path.dirname(package.__file__) path = os.path.join(base_path, test) # Sometimes repo location is read-only with temp_setattr(sys, "dont_write_bytecode", True): # n0 to disable pytest-xdist; this will raise if xdist is not installed config = _prepareconfig([path, '-n0']) session = Session.from_config(config) config._do_configure() config.hook.pytest_sessionstart(session=session) config.hook.pytest_collection(session=session) # Only one test is supported if len(session.items) == 0: raise ValueError(f'No tests found with {test=}') elif len(session.items) > 1: raise ValueError(f'Multiple tests found with {test=}') session_test = session.items[0] request = session_test._request kwargs = {} for name in session_test.fixturenames: # The request fixture doesn't require any setup if name != 'request': fixturedef = session_test._fixtureinfo.name2fixturedefs[name][0] request._compute_fixture_value(fixturedef) kwargs[name] = request.getfixturevalue(name) test_function = _persistent_locals(session_test.function) try: test_function( **{arg: kwargs[arg] for arg in session_test._fixtureinfo.argnames} ) except Exception: traceback.print_exc() print('Test failed') else: print('Test passed') finally: return test_function.locals ```

Question

As mentioned, this code is quite the hack. I do not expect it to function across minor versions of pytest (although it hasn't broken yet!). Is it possible to better support this use case of pytest? I fully expect the answer to be that it would take quite an effort, and is not worth it. Still, thought it couldn't hurt to ask. Maybe someone can benefit from our code as well.

rhshadrach commented 1 year ago

I've created https://pypi.org/project/pytest-ndb/ (repo: https://github.com/rhshadrach/pytest-ndb). If there is no appetite to support something along these lines directly in pytest (and I think that's a reasonable stance), this issue can be closed.

RonnyPfannschmidt commented 1 year ago

There's a intent to move debuggable asserts to python via a pep

I'm happy to discuss moving thing's around

Zac-HD commented 8 months ago

pytest-ndb looks great! Best wishes with the project, and let us know if an update breaks it - we might be able to work something out or help with the corresponding update on your side.