nedbat / coveragepy

The code coverage tool for Python
https://coverage.readthedocs.io
Apache License 2.0
2.96k stars 428 forks source link

Uncovered branches are reported if generator iterators are not exhausted #1617

Open godlygeek opened 1 year ago

godlygeek commented 1 year ago

Describe the bug Given this program as test.py:

def func():
    val = next((c for c in "abc"), None)
    print(val)

func()

Branch coverage claims that there is an uncovered arc from line 2 (the val = next(...)) to exit. As far as I can see, there is no branch there.

To Reproduce

$ python -V
Python 3.11.2
$ coverage --version
Coverage.py, version 7.2.4 with C extension
Full documentation is at https://coverage.readthedocs.io/en/7.2.4
$ coverage run --branch test.py
a
$ coverage report --show-missing
Name      Stmts   Miss Branch BrPart  Cover   Missing
-----------------------------------------------------
test.py       4      0      2      1    83%   2->exit
-----------------------------------------------------
TOTAL         4      0      2      1    83%

Interestingly, if you change line 2 to:

    val = next(iter("abc"), None)

Coverage no longer believes there is an uncovered branch, despite the fact that (c for c in "abc") and iter("abc") both return iterators over the same 3 elements.

Expected behavior There should not be a branch detected from line 2 to exit, as the only line that can follow 2 is 3. I suspect the issue may perhaps be that Coverage doesn't understand that 2-argument next will never raise a StopIteration.

Additional context This may be the same issue as https://github.com/nedbat/coveragepy/issues/605#issuecomment-1470759557 - I started to add this as a comment there, but then thought I'd err on the side of a new issue.

godlygeek commented 1 year ago

I suspect the issue may perhaps be that Coverage doesn't understand that 2-argument next will never raise a StopIteration.

Actually, that doesn't make sense, given that it doesn't happen with next(iter("abc"), None). It must not be two-argument next that's provoking it - maybe something related to generator expressions?

godlygeek commented 1 year ago

Further, this reproduces with:

    val = next(iter((c for c in "abc")), None)

but not with:

    val = next(iter([c for c in "abc"]), None)

so I'm fairly convinced it's something about generator expressions...

godlygeek commented 1 year ago

This is perhaps a minimal reproducer:

(c for c in ())
pass

That reports 1->exit as missing.

godlygeek commented 1 year ago

@nedbat helped me understand what's happening here, and it's much simpler than I was imagining. I wasn't correctly interpreting what the ->exit is trying to tell me: Coverage is reporting that the generator expression did not run to completion, because we didn't advance the returned iterator until StopIteration was raised. That's totally true.

It's possible to provoke a similar report with list comprehensions - for instance, this test program:

try:
    [x / 0 for x in (1, 2, 3)]
except ZeroDivisionError:
    pass

reports 2->exit as missing, because the exception stops the list comprehension from running to completion.

Having gathered my thoughts a bit after my conversation with Ned, I think that this isn't a helpful thing for Coverage to report.

Coverage doesn't complain about failures to exhaust other iterators, like iter("abc"). You'll say "but those aren't functions, like generators and list comprehensions are", but the language doesn't guarantee that comprehensions and generator expressions are implemented as functions, only that they have their own scope. And PEP 709 is proposing to stop implementing list comprehensions as nested functions in CPython (with a similar change for generator expressions left as a future improvement).

But moreover, Coverage doesn't complain when you fail to exhaust other iterators that are functions. There's no missing branch reported if you call iter(map(int, "123")) then don't advance the iterator until map.__next__ raises StopIteration.

You might say "but that uncovered branch is inside the implementation of map", but that's the case for generator expressions as well - the uncovered branch is where the implementation conditionally raises StopIteration if the user-provided iterable has been exhausted.

You might say "the difference is that the code running inside a generator expression is user-provided, not part of the interpreter's implementation like map", but that's only partly true. The uncovered branch isn't in the code provided by the user, it's in the interpreter's implementation of the generator expression machinery that's wrapping the user's code, which will return from the nested function and raise a StopIteration after the user-provided iterable has been exhausted. The Coverage user isn't trying to exercise the interpreter's generator expression implementation, so they don't gain any useful information from being told that there's a branch inside it that isn't taken.

I'm pretty convinced that this report isn't useful to or actionable by users - what do you think, @nedbat?

pganssle commented 1 year ago

I don't often require 100% branch coverage, so I don't know that I have a strong opinion on this, but one point in favor of the "this is not useful" side might be the question of what happens when you have infinite iterators?

Do you get the same kind of branch miss in this situation?

a = (x.upper() for x in itertools.repeat("Spam"))

print("\n".join(itertools.islice(a, 0, 100)))

Seems like you shouldn't, since exhausting a is impossible.

On the other hand, if you have a situation like this:

def my_func(n, a=(1, 2, 3.0, 4, "five")):
    it = (int(x) for x in a)
    return next(itertools.islice(it, n, n+1))

@pytest.mark.parameterize("n", [0, 1, 2, 3])
def test_my_func(n):
    assert isinstance(my_func(n), int)

It would be nice to recognize that you have missed the failing cases my_func(4) and my_func(5). Of course, by this logic, you would also want to make sure that coverage is emitting branch misses for next(iter("abc"), None) as well.

godlygeek commented 1 year ago

one point in favor of the "this is not useful" side might be the question of what happens when you have infinite iterators?

I thought about bringing infinite generators up, but Coverage already requires you to add a # pragma: no branch on infinite loops, so if it's ever worth reporting the "generator exhausted" branch being unreached, it would be consistent to expect people to explicitly mark cases where the generator can never be exhausted.