astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
32k stars 1.07k forks source link

[red-knot] Enhancing Diagnostics for Compare Expression Inference #13787

Open cake-monotone opened 2 days ago

cake-monotone commented 2 days ago

This issue was suggested by @carljm and is related to the following threads: https://github.com/astral-sh/ruff/pull/13712#discussion_r1797193581, https://github.com/astral-sh/ruff/pull/13781#discussion_r1803637742, https://github.com/astral-sh/ruff/pull/13712#discussion_r1801739049

Current Problems

Currently, the infer_binary_type_comparison, infer_lexicographic_type_comparison, and perform_rich_comparison functions in infer.rs return an Option<Type>, where None indicates that the comparison is not defined.

However, with the addition of comparison logic for Tuple and Union types, we’ve found that using Option may not provide sufficient diagnostic information.

For example, consider the following code:

(1, 2) < (1, "hello")

The Python runtime provides an appropriate diagnostic message:

Traceback (most recent call last):
  File "/home/cake/works/playground/test.py", line 9, in <module>
    (1, 2) < (1, "hello")
TypeError: '<' not supported between instances of 'int' and 'str'

In contrast, our current Option implementation would result in:

Operator `<` is not supported for types `tuple[Literal[1], Literal[2]]` and `tuple[Literal[1], Literal["hello"]]`

(This is just my expected result; comparison between int and str isn’t implemented yet.)

Implementation

I’m considering a straightforward change to the following structure. Any suggestions are welcome!

struct OperatorUnsupportedError<'db> {
    op: ast::CmpOp,
    lhs: Type<'db>,
    rhs: Type<'db>,
}

// ...

fn infer_binary_type_comparison(
    &mut self,
    left: Type<'db>,
    op: ast::CmpOp,
    right: Type<'db>,
) -> Result<Type<'db>, OperatorUnsupportedError<'db>>
carljm commented 2 days ago

Yes, looks great, thank you for writing this up!

carljm commented 2 days ago

One thing to consider is whether the diagnostic we want for your example is simply:

Operator `<` is not supported for types `Literal[2]` and `Literal["hello"]`

Or a more verbose variant where we give both the inner erroring types and the full types. Pyright gives both. I think we probably should as well, because the full outer type may also be inferred and non-obvious. So maybe something like:

Operator `<` is not supported for types `Literal[2]` and `Literal["hello"]`, in comparing `tuple[Literal[1], Literal[2]]` with `tuple[Literal[1], Literal["hello"]]`
cake-monotone commented 1 day ago

I think there’s a subtle difference in the final diagnostic I expect.

Operator `<` is not supported for types `int` and `str`

As you know, in the current implementation, literals of different types are not compared directly. First, they are converted to Type::Instance, and then an Err will be returned and propagated.

So, the result would be:

Operator `<` is not supported for types `int` and `str`, in comparing `tuple[Literal[1], Literal[2]]` with `tuple[Literal[1], Literal["hello"]]`

It may not be intuitive because the user doesn’t know where the int and str are coming from. However, I still agree that providing both the inner and outer types is the better approach. I’ll start working on it as you suggested soon!

MichaReiser commented 1 day ago

Messages from other tools:

TypeScript Operator '<' cannot be applied to types '[string, number]' and '[string, string]'.(2365)

Flow (Nice! We should consider something with our new diagnostics)

    2:   a < b
         ^ Cannot compare tuple type [1] to tuple type [2]. [invalid-compare]
        References:
        1: function test(a: [string, number], b: [string, string]) {
                            ^ [1]
        1: function test(a: [string, number], b: [string, string]) {
                                                 ^ [2]

Pyright

Operator "<" not supported for types "tuple[Literal[1], Literal[2]]" and "tuple[Literal[1], Literal['hello']]" (reportOperatorIssue)

cake-monotone commented 1 day ago

Thanks for reasearching this!!!!! Would it be possible to share an example of how Flow’s diagnostics could be applied to this case? I’m a little unsure about which parts of Flow’s diagnostics we should apply to.

By the way, pyright doesn't show inner-type either... :cry:

MichaReiser commented 1 day ago

Here's a link to flow's playground

What I like about flow is that it renders the longer types, including those declared outside of the message header. I would have to install their CLI to see how that looks. But I do like that they don't try to squeeze everything into the message.

carljm commented 1 day ago

I think something like Flow's diagnostic would be great, once we have a more complete diagnostics system in place that allows formatting code frames like that; I definitely wouldn't try to implement that now.

I guess the one thing it suggests is that maybe the error return value should include both the erroring types and their nodes (or at least node locations). But I also think probably we don't need to worry about that yet, either. The main thing is to change the return value to Result and propagate it correctly; if we do that, it's not hard to add more information to the Err variant in future once we have a new diagnostics system in place.

carljm commented 1 day ago

By the way, pyright doesn't show inner-type either

Weird, I definitely thought I recently saw a pyright diagnostic for a case like this, where it stacked multiple errors for the innermost vs outer types. But now I can't figure out what code example I was looking at that produced that.