Support for pytest-cov branch coverage

kinow commented 1 year ago

Hi,

Thanks a lot for pycobertura. I used it to automate the coverage reporting of an internal project. Since we are using a private GitHub repository, we used GitHub actions to run the coverage and add a comment with the Markdown produced by pycobertura.

However, today I noticed that we have --cov-branch enabled, and the reported values in the output of pycobertura do not match what's displayed in the cmd-line & HTML reports of pytest-cov.

Would it be possible to match that somehow? Maybe adding a new flag, and maybe calling the pytest-cov (as they seem to have a special way to calculate the coverage when using branches).

Thanks!

aconrad commented 1 year ago

Hi @kinow, can you provide some samples of what the coverage file looks like, what pycobertura reports, and what you'd like to see? I haven't worked much with branch coverage so I'm not surprised we might be lacking.

kinow commented 1 year ago

Hi @aconrad , I will use a sample Python project as my current project is private: https://github.com/kinow/protobuf-uml-diagram

But this should be the same with any Pytest project using pytest-cov. You can turn on branch coverage adding --cov-branch.

Command-line output:


---------- coverage: platform linux, python 3.10.6-final-0 -----------
Name                      Stmts   Miss Branch BrPart    Cover
-------------------------------------------------------------
protobuf_uml_diagram.py      96      0     28      0  100.00%
-------------------------------------------------------------
TOTAL                        96      0     28      0  100.00%

XML:

<?xml version="1.0" ?>
<coverage version="7.2.1" timestamp="1684479773746" lines-valid="96" lines-covered="96" line-rate="1" branches-valid="28" branches-covered="28" branch-rate="1" complexity="0">
    <!-- Generated by coverage.py: https://coverage.readthedocs.io/en/7.2.1 -->
    <!-- Based on https://raw.githubusercontent.com/cobertura/web/master/htdocs/xml/coverage-04.dtd -->
    <sources>
        <source>/home/kinow/Development/python/workspace/protobuf-uml-diagram</source>
    </sources>
    <packages>
        <package name="." line-rate="1" branch-rate="1" complexity="0">
            <classes>
                <class name="protobuf_uml_diagram.py" filename="protobuf_uml_diagram.py" complexity="0" line-rate="1" branch-rate="1">
                    <methods/>
                    <lines>
                        <line number="17" hits="1"/>
                        <line number="19" hits="1"/>
                        <line number="20" hits="1"/>
                        <line number="21" hits="1"/>
                        <line number="22" hits="1"/>
                        <line number="23" hits="1"/>
                        <line number="24" hits="1"/>
                        <line number="25" hits="1"/>
                        <line number="27" hits="1"/>
                        <line number="28" hits="1"/>
                        <line number="29" hits="1"/>
                        <line number="30" hits="1"/>
                        <line number="32" hits="1"/>
                        <line number="33" hits="1"/>
                        <line number="35" hits="1"/>
                        <line number="39" hits="1"/>
                        <line number="42" hits="1"/>
                        <line number="53" hits="1"/>
                        <line number="59" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="65" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="71" hits="1"/>
                        <line number="76" hits="1"/>
                        <line number="77" hits="1"/>
                        <line number="78" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="79" hits="1"/>
                        <line number="80" hits="1"/>
                        <line number="83" hits="1"/>
                        <line number="91" hits="1"/>
                        <line number="92" hits="1"/>
                        <line number="93" hits="1"/>
                        <line number="95" hits="1"/>
                        <line number="96" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="97" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="98" hits="1"/>
                        <line number="101" hits="1"/>
                        <line number="102" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="103" hits="1"/>
                        <line number="105" hits="1"/>
                        <line number="107" hits="1"/>
                        <line number="109" hits="1"/>
                        <line number="111" hits="1"/>
                        <line number="114" hits="1"/>
                        <line number="115" hits="1"/>
                        <line number="116" hits="1"/>
                        <line number="118" hits="1"/>
                        <line number="121" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="122" hits="1"/>
                        <line number="126" hits="1"/>
                        <line number="134" hits="1"/>
                        <line number="135" hits="1"/>
                        <line number="146" hits="1"/>
                        <line number="153" hits="1"/>
                        <line number="164" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="165" hits="1"/>
                        <line number="167" hits="1"/>
                        <line number="168" hits="1"/>
                        <line number="173" hits="1"/>
                        <line number="176" hits="1"/>
                        <line number="177" hits="1"/>
                        <line number="178" hits="1"/>
                        <line number="180" hits="1"/>
                        <line number="181" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="182" hits="1"/>
                        <line number="183" hits="1"/>
                        <line number="184" hits="1"/>
                        <line number="185" hits="1"/>
                        <line number="186" hits="1"/>
                        <line number="187" hits="1"/>
                        <line number="188" hits="1"/>
                        <line number="189" hits="1"/>
                        <line number="191" hits="1"/>
                        <line number="192" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="193" hits="1"/>
                        <line number="194" hits="1"/>
                        <line number="195" hits="1"/>
                        <line number="196" hits="1"/>
                        <line number="198" hits="1"/>
                        <line number="199" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="200" hits="1"/>
                        <line number="201" hits="1"/>
                        <line number="202" hits="1"/>
                        <line number="204" hits="1"/>
                        <line number="205" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="206" hits="1"/>
                        <line number="207" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="208" hits="1"/>
                        <line number="209" hits="1" branch="true" condition-coverage="100% (2/2)"/>
                        <line number="210" hits="1"/>
                        <line number="212" hits="1"/>
                        <line number="218" hits="1"/>
                        <line number="219" hits="1"/>
                        <line number="220" hits="1"/>
                        <line number="222" hits="1"/>
                        <line number="227" hits="1"/>
                        <line number="228" hits="1"/>
                        <line number="230" hits="1"/>
                    </lines>
                </class>
            </classes>
        </package>
    </packages>
</coverage>

And screenshot of the HTML report:

aconrad commented 1 year ago

Interesting, thanks for sharing @kinow. I didn't know about the condition-coverage. What is the status of branch="true" when the line coverage is not 100%? Can you share another example where a branch isn't covered at 100%, and also one that includes 0%?

kinow commented 1 year ago

Sure, here's another project I worked on some time ago that also uses pytest-cov with branch coverage.

https://github.com/cylc/cylc-flow/blob/c5f177024b15e392fa1a0ba0f5c1d908fa6dc4f6/.github/workflows/test_fast.yml#L28

coverage.xml in this gist https://gist.github.com/kinow/328f7313756022694638bca296aff4dd

kinow commented 1 year ago

FWIW I remember reading about how they calculate the coverage when branching is enabled (IIRC, because SonarQube was giving a different line & branch coverage?), and it was not very straightforward. But hopefully it's not too complicated and can be added to pycobertura.

aconrad commented 1 year ago

I see a few tasks here:

Allow the Cobertura parser to understand partial coverage. We currently parse lines and return a tuple (line_number, line_status) but line_status is either True or False for hit or miss, respectively. But things aren't so binary anymore with partial coverage, so we might want to consider returning line status to be partial, maybe "hit", "miss", and "partial". This would be a breaking API change, so we'd have to release v4.
When parsing source code, we create a representation of each line using Line. We might need to augment that to store conditional coverage info. Those lists of Line tuples are then passed around to our reporters.
Determine how to represent partially hit lines in various reports (summary tables, and source code rendering). Should we show the coverage percentage of conditional lines? I think we might just get away with it by using color and not showing anything else. The reader can likely determine what part of the conditional is missing.
Need to colorize lines to yellow/orange to show that it's a partially covered line. This must happen in HTML templates, CLI, and possibly in other reporters.

Would someone want to take a stab at it? Happy to review this work in smaller PRs so it's not too big to review and is more incremental. Anyone?

aconrad commented 1 year ago

Trying to take a stab at this...

aconrad commented 1 year ago

In the summary table, when listing missing lines, does it make sense if we count partial lines as missed? I'm not sure I want to introduce a "partial" column, I don't think it has much relevance to us. Thoughts?

kinow commented 1 year ago

Good question, @aconrad . I replaced pycobertura by pytest-cov's HTML report. To post that to a GitHub comment (using a GH action), I extracted what I wanted from pytest-cov's HTML with htmlq, and then converted into Markdown with Pandoc.

However, as soon as I updated the GH Action to comment pytest-cov output, a developer asked about the "Missing" column from pycobertura :slightly_smiling_face:

I couldn't find a solution as to where/how to add it to the existing table… so at the moment the GH bot creates a comment with the report extracted from pytest-cov, and below it adds a second table with just the missing lines.

I am not sure if that would be practical for projects with many files. It works for us for now as we have about 10-20 files. Maybe just rename or document that column as "Missing lines", and show that column next to the part about line coverage?

Thanks!

aconrad commented 1 year ago

@kinow Would #168 work for you? I haven't colorized the outputs to orange for partially covered statements where relevant, but the data should be in place.

aconrad commented 1 year ago

I've been playing with it a bit and I've been surprised by a few things.

For-loops are measured as conditionals branches. At first, I didn't understand because, to me, for-loops only have one outcome: iterating over the iterable. But after thinking about it more, I supposed there could be a way when the for-loop doesn't iterate if the iterable is empty. Here's an example:

And here is the part of the XML file that reports on the lines in the screenshot:

<line number="334" hits="1" branch="true" condition-coverage="100% (2/2)"/>
<line number="335" hits="1" branch="true" condition-coverage="100% (2/2)"/>
<line number="336" hits="1" branch="true" condition-coverage="50% (1/2)" missing-branches="335"/>
<line number="337" hits="1" branch="true" condition-coverage="100% (2/2)"/>
<line number="338" hits="1"/>
<line number="339" hits="1" branch="true" condition-coverage="100% (2/2)"/>
<line number="340" hits="1"/>
<line number="341" hits="1"/>

This one was also interesting to me:

While everything under the conditional says it's covered, it marks the if-statement as partially covered. Here is the relevant XML:

<line number="354" hits="1" branch="true" condition-coverage="100% (2/2)"/>
<line number="355" hits="1"/>
<line number="356" hits="1"/>
<line number="357" hits="1"/>
<line number="359" hits="1" branch="true" condition-coverage="50% (1/2)" missing-branches="365"/>
<line number="360" hits="1" branch="true" condition-coverage="100% (2/2)"/>
<line number="363" hits="1"/>
<line number="364" hits="1"/>
<line number="365" hits="1"/>

My understanding is that if self.show_source is always True, and while it covered all lines, the code was never executed when it was False.

aconrad commented 1 year ago

Another thing noteworthy about the partially hit for-loop: the missing-branches attribute in the XML file says that the unmet conditional is the line before it (line 335). The code never executed the path of going from line 336 to line 335, which would only happen if hunk was empty.

aconrad / pycobertura

Support for pytest-cov branch coverage #167