astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License

[red-knot] Test setup utilities #13789

Open sharkdp opened 1 month ago

sharkdp commented 1 month ago

This recently came up in a discussion. A lot of red-knot tests require some form of "setup" in the sense that they create variables of a particular type. This is not always straightforward. For example, to create a variable of type int, you need to make sure that it doesn't end up as a LiteralInt[…]. So some tests use a pattern like this:

def int_instance() -> int: ...

x = int_instance()  # to create x: int

To create a variable with a union type, a lot of tests follow a pattern like

x = a if flag else b  # to create x: A | B
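
Put together, the two patterns above might look like this in a test snippet. This is only a minimal sketch: `str_instance` and `bool_instance` are hypothetical helpers named by analogy with `int_instance`, and the trivial return values exist only so that the snippet also runs.

```python
def int_instance() -> int:
    return 0  # the body is irrelevant to the checker; only the return annotation matters


def str_instance() -> str:
    return ""


def bool_instance() -> bool:
    return True


flag = bool_instance()  # typed as bool, so neither branch below is statically known

# Each operand has a non-literal type, so a type checker is expected
# to infer the union `int | str` for x.
x = int_instance() if flag else str_instance()
```

Even this small case needs three helper definitions before the line under test.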

It's unclear to me if this really requires any action, but I thought it might make sense to discuss this in a bit more detail. Here are some approaches (some less generic than others) that I could think of.

1. def f() -> MyDesiredType; x = f()

Upsides:

Downsides:

2. def f(x: MyDesiredType): …

Upsides:

Downsides:

3. a if flag else b

(only relevant for union types)

Upsides:

Downsides:

4. Helper functions like one_of(a, b)

We could inject new functions, just for testing purposes. For example, we might have a function similar to

import random

def one_of[A, B](a: A, b: B) -> A | B:
    # Pick one argument at random so that neither branch is statically known.
    if random.randint(0, 1):
        return a
    else:
        return b

to easily create union types.

Upsides(?):

Downsides:

5. A magic conjure function

I'm not even sure if this is technically possible, but other languages have ways to create values of type T out of nothing. Not actually, of course. But for the purpose of doing interesting things at "type check time". For example, C++ has std::declval<T>(). Rust has let x: T = todo!(). Functional languages have absurd :: ⊥ -> T.

You can't specify explicit generic parameters in a function call in Python (?), so we couldn't do something like x = conjure[int | None](), but maybe there is some way to create a construct conceptually similar to

def conjure[T]() -> T: ...  # Python type checkers don't like this

I think I would personally prefer the simple def f(x: MyDesiredType): … approach, once we make that work.
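
For comparison, approach 2 might look like the sketch below once function parameters are understood. The function name `check` and the chosen types are placeholders. The explicit import and the `reveal_type` fallback are only there so that the snippet runs on a real interpreter, including ones older than Python 3.11.

```python
from __future__ import annotations  # lets `int | None` appear in annotations on older Pythons

try:
    from typing import reveal_type  # available at runtime since Python 3.11
except ImportError:
    def reveal_type(obj):  # hypothetical runtime stand-in for older interpreters
        return obj


def check(x: int, y: int | None) -> None:
    # The annotations alone fix the types: no helper functions or conditionals.
    reveal_type(x)  # a type checker is expected to report `int`, not a literal type
    reveal_type(y)  # ... and `int | None` here
```

The setup lives entirely in the signature, so the snippet stays close to ordinary Python.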

AlexWaygood commented 1 month ago

You can't specify explicit generic parameters in a function call in Python (?), so we couldn't do something like x = conjure[int | None]()

Correct -- for now, the best you can do is x: int | None = conjure() or x = typing.cast(int | None, conjure()). But see https://peps.python.org/pep-0718/ for a proposal to change this. (Discussed at https://discuss.python.org/t/pep-718-subscriptable-functions/28457.)

AlexWaygood commented 1 month ago

(I edited your post to number your suggestions so it would be easier to discuss them, hope that's okay :-)


Your proposals (4) and (5) both involve injecting some sort of "magic" function into the namespace that we could use without any imports, which would then create instances of the types required for the test. My first instinct was that I didn't much like the idea, because in general I'd like the test snippets to be as close as possible to executable Python code. I think it's useful to keep a close resemblance between our test snippets and user code we'll actually be running on. I also think keeping our test snippets as close as possible to executable Python makes them much easier for us and external contributors to understand.

However, I then realised that this isn't really that different to what we already do with reveal_type. At runtime, reveal_type is not a builtin -- you have to import it from typing or typing_extensions if you don't want your code to crash when you actually run it with a Python interpreter. But we pretend it's a builtin, so that users can easily debug their type-checking results without having to add an import, and so that we can keep our test snippets concise. I argued against this when we were designing the test framework (I said we should have to explicitly import reveal_type in order to use it in test snippets), but @carljm pushed for it, partly on the grounds that it would significantly reduce the boilerplate of our tests. In retrospect, I think he was probably right; it would be a bit of a pain to have to import reveal_type in every test snippet.

The key differences with reveal_type are:


I think I would personally prefer the simple def f(x: MyDesiredType): … approach, once we make that work.

Yes, I think I agree. Mypy has quite an extensive test suite that works in a similar way to our new framework, and they've managed to do without a conjure() function or one_of(). That doesn't mean that the idea is bad, of course! But it does suggest that it should be possible to do without it. And even if our test snippets already don't look exactly the same as executable Python code would (due to all the unimported reveal_type usages), it's nice to limit the differences as much as possible.

One way that mypy test snippets do differ from executable Python is in their use of "fixture stubs". Rather than using their full vendored stdlib typeshed stubs (which is what they use for checking user code), in their tests they use a radically simplified version of typeshed. This speeds up their tests a lot, but it is very frequently a source of confusion for mypy developers and contributors, who often think they've fixed a bug only to realise that the type inference their users are seeing for standard-library functions is very different to the type inference they thought they had asserted in their test snippets.

carljm commented 1 month ago

Just for the sake of discussion, another possibility here is to allow "layering" files, so in a Markdown header section you can provide a file that will be shared by all sub-tests within that section. So this would let you write your own little utilities (or simply type-annotated variables in a stub file) and import/reuse them in a bunch of related tests. Downsides are less locality of tests, and more complexity in understanding the structure and behavior of a test.

I also don't want to do anything here that's specifically motivated by limitations we should lift soon, like not understanding function arguments, or unions in annotations.

I think on the whole my preference is also defining functions with typed arguments, in most cases.