data-apis / array-api-tests

Test suite for the PyData Array APIs standard
https://data-apis.org/array-api-tests/
MIT License

Specify dependency of each test case #51

Open honno opened 2 years ago

honno commented 2 years ago

The test suite is mostly made up of test methods for each function (or array object method) in the spec. The majority of these tests require other functions to do everything we want, which is problematic when an underlying function does not work: you'll get a lot of spammy errors that actually stem from a single fundamental problem... or even false positives.

So I think it would be a good idea if we declared the dependencies of each test method, i.e. the functions we use and assume have correct behaviour. We could use these declarations to build a dependency graph, which could be hooked into pytest to prioritise zero/low-dependency tests first and, by default, skip tests that use functions we've deemed incorrect (a rough sketch of what this could look like follows the lists below). This would benefit:

  1. Array API adopters, who would much more easily see what they should prioritise developing
  2. Us, who'd be able to see areas where we could cut down on the functions we depend on

Ideas for declaring dependencies per test method:

General challenges would be:

The dependency graph + auto skipping would also allow us to:
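A minimal sketch of what marker-based declarations plus auto-skipping could look like, assuming a hypothetical depends_on marker and a hand-maintained set of known-broken functions (neither exists in the suite today):

    # conftest.py (hypothetical)
    import pytest

    # Functions we've already deemed broken on the target library, e.g.
    # populated from a command-line option or a previous run.
    KNOWN_BROKEN = {"asarray"}

    def pytest_configure(config):
        config.addinivalue_line(
            "markers",
            "depends_on(*funcs): functions this test assumes behave correctly",
        )

    def pytest_collection_modifyitems(config, items):
        # Skip any test whose declared dependencies include a known-broken function.
        for item in items:
            for marker in item.iter_markers(name="depends_on"):
                broken = KNOWN_BROKEN.intersection(marker.args)
                if broken:
                    item.add_marker(
                        pytest.mark.skip(reason=f"depends on broken {sorted(broken)}")
                    )

    # test_elementwise_functions.py (hypothetical usage)
    @pytest.mark.depends_on("asarray", "equal")
    def test_add():
        ...

Sorting items by their number of declared dependencies in the same hook would also get us the zero/low-dependency-first ordering.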

honno commented 1 year ago

I still think this would be nice, but the test suite could do with some more refinement generally for this to be feasible, and even then it doesn't seem to be as big a deal as I made it out to be. Every time I've seen the test suite used, devs usually do end up identifying that buggy fundamental components are preventing other things from being tested... it's just that it probably takes them way longer than ideal heh.

Zac-HD commented 1 year ago

The 80/20 of this would be to group tests into core, standard, and then extension-xxx groups - I think this would be fairly easy with a module-level pytestmark = pytest.mark.core (etc). Then if you run into a problem, try pytest -m core and then pytest -m "core or standard" to check whether it's something foundational.
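For illustration, a sketch of how that grouping might look (the marker names here are assumed and not currently registered in the suite):

    # pytest.ini (hypothetical marker registration)
    # [pytest]
    # markers =
    #     core: foundational functions the rest of the suite leans on
    #     standard: everything else in the main spec
    #     extension_linalg: the linalg extension

    # test_creation_functions.py
    import pytest

    pytestmark = pytest.mark.core  # applies the mark to every test in this module

    def test_asarray_scalars():
        ...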

Hypothesis takes a similar approach, using directories to organize our selftests (core=conjecture, cover + nocover, extensions, etc.), which works pretty well at higher volumes. Exact dependency tracking is a lovely idea in principle but AFAICT very rare in practice because it's not usually worth the trouble 😅