pschanely / CrossHair

An analysis tool for Python that blurs the line between testing and type systems.
Other
1.03k stars 49 forks source link

`None` in `max` and `min` goes undetected #194

Open mristin opened 1 year ago

mristin commented 1 year ago

Consider the following program (a solution to https://www.hackerrank.com/challenges/mini-max-sum/problem):

from typing import List, Tuple, Optional

from icontract import require, ensure

@require(lambda numbers: all(1 <= number <= 10**9 for number in numbers))
@require(lambda numbers: 2 <= len(numbers) < 1000)
@ensure(lambda numbers, result: 1 <= result[0] < sum(numbers))
@ensure(lambda numbers, result: 1 <= result[1] < sum(numbers))
@ensure(lambda result: result[0] <= result[1])
def slow_and_simple_find_min_max(numbers: List[int]) -> Tuple[int, int]:
    """
    >>> slow_and_simple_find_min_max([1, 2, 3, 4, 5])
    (10, 14)
    """
    min_sum = None  # type: Optional[int]
    max_sum = None  # type: Optional[int]
    for i in range(len(numbers)):
        a_sum = 0
        for j in range(len(numbers)):
            if i == j:
                continue

            a_sum += numbers[j]

        min_sum = min(a_sum, min_sum)
        max_sum = max(a_sum, max_sum)

    return min_sum, max_sum

CrossHair 0.0.34 fails to figure out that putting a None in min or max is going to crash the program. The doctest fails with an exception, as expected.

The correct program looks like this (mind the lines min_sum = ... and max_sum = ...):

from typing import List, Tuple, Optional

from icontract import require, ensure

@require(lambda numbers: all(1 <= number <= 10**9 for number in numbers))
@require(lambda numbers: 2 <= len(numbers) < 1000)
@ensure(lambda numbers, result: 1 <= result[0] < sum(numbers))
@ensure(lambda numbers, result: 1 <= result[1] < sum(numbers))
@ensure(lambda result: result[0] <= result[1])
def slow_and_simple_find_min_max(numbers: List[int]) -> Tuple[int, int]:
    """
    >>> slow_and_simple_find_min_max([1, 2, 3, 4, 5])
    (10, 14)
    """
    min_sum = None  # type: Optional[int]
    max_sum = None  # type: Optional[int]
    for i in range(len(numbers)):
        a_sum = 0
        for j in range(len(numbers)):
            if i == j:
                continue

            a_sum += numbers[j]

        min_sum = min(a_sum, min_sum) if min_sum is not None else a_sum
        max_sum = max(a_sum, max_sum) if max_sum is not None else a_sum

    return min_sum, max_sum

I ran CrossHair 0.0.34 with:

crosshair  check --analysis_kind icontract .\playground\exercise_02.py

Letting CrossHair in watch mode run longer did not help either:

crosshair  watch --analysis_kind icontract .\playground\exercise_02.py
pschanely commented 1 year ago

Thank you for the bug report - I hope to get more!

This is a subtly difficult case. It's running afoul of some heuristics CrossHair uses to avoid reporting incorrect counterexamples when symbolics get passed to a module written in C that cannot tolerate them. Now that much of the standard library is patched, these heuristics are less important, but still play a role when using 3rd party libraries.

Looking over the symbolic integer implementation, it feels like I might be able to exclude it from the heuristic, though - catching the case your bring up here. I am going to investigate this soon and report back!

Eh, finally, one "guiding-principle" topic I've never said explicitly or put in the docs: CrossHair doesn't try very hard to catch the same kinds of issues that a static typechecker does. (e.g. mypy already complains about min_sum = min(a_sum, min_sum) in your example)

That said, CrossHair is a great compliment to mypy; note that even after the fix for your example, mypy complains about the return type of the function: it's Tuple[int, int], but the two return variables are typed as Optional[int]. You can "fix" this by adding a line like assert min_sum is not None and max_sum is not None - mypy will understand that the types are now non-None. But now you have to validate that the assert is accurate - and CrossHair is hopefully good at doing that part!

Anyway, as I said earlier - more to come.