Trying to understand the pycobertura diff report

aconrad / pycobertura

A code coverage diff tool for Cobertura reports

MIT License

116 stars 39 forks source link

Trying to understand the pycobertura diff report #142

Open chixiangzhou opened 2 years ago

chixiangzhou commented 2 years ago

Hi, @aconrad

I’ve recently started to use pycobertura for my on-going project at work. This is definitely a great tool and I’m already a big fan of this tool. I just have a few questions regarding the pycobertura diff command and its output diff report.

I’m not sure if I fully understand all the columns in the diff report (html format). Take the dummy files mentioned in README for example. The following is the screenshot:

I didn’t find more details on the definition for each column that I highlighted above, so I was trying to interpret each column in my own understanding. Please correct me if I’m wrong.

Column Stmts: My understanding is this column indicates how many lines were added or deleted in the new coverage file. For instance, if 2 new lines were added, use “+2” to indicate it. Similarly, if 2 lines were deleted, use “-2”. If no lines added or deleted, use “-“. Let me know if my understanding is correct.

Column Miss: This column is somewhat confusing. My understanding is that it is a delta part that shows the number of lines from covered to uncovered (or from uncovered to covered) in the new coverage file.

For the file dummy/dummy, it is straightforward to understand “-2”, which means two lines (Line 5 & 6) changed from uncovered to covered.

However, I’m NOT able to make sense of the number “+1” for the file dummy/dummy2. There are two lines (Line 2 & 4) that changed from uncovered to covered, which is -2. There is one line (Line 5) that changed from covered to uncovered, which is +1. In my opinion, it should be “-1” here (-2 + 1 = -1), instead of “+1”. Did I miss anything here?

For the file dummy/dummy3, I’m able to understand the “+2”, as two new lines (Line 1 & 2) were added and not covered. So it is marked as “+2”.

Column Cover: I also find this column is kinda confusing. I assume this also reflects a delta part that is presented in a percentage form. For the file dummy/dummy, I’m able to understand how +40% is calculated: 2 (lines coverage increase)/ 5 (total lines) = 0.4.

For the file dummy/dummy2, I cannot get how “-25%” was calculated. Can you explain?

For the file dummy/dummy3, it is marked as “-”, and I cannot get it either. What does “-” mean here? Does it mean 0%? And why is it 0%?

Column Missing: I assume this column also only shows the delta part. In other words, only the lines that have coverage increase/decrease will be listed here. If the same line gets covered (or uncovered) in both old and new coverage file, it won’t appear in the column, correct?

Thanks for your time in advance. Looking forward to hearing from you soon!

Best, Chixiang

chixiangzhou commented 2 years ago

Can anyone take a look at this issue?

aconrad commented 2 years ago

@chixiangzhou, thanks for your question.

Column Stmts: My understanding is this column indicates how many lines were added or deleted in the new coverage file. For instance, if 2 new lines were added, use “+2” to indicate it. Similarly, if 2 lines were deleted, use “-2”. If no lines added or deleted, use “-“. Let me know if my understanding is correct.

Yes.

Column Miss: This column is somewhat confusing. My understanding is that it is a delta part that shows the number of lines from covered to uncovered (or from uncovered to covered) in the new coverage file.

"Miss" is the number of missed (uncovered) lines introduced. We don't actually take covered lines into account, however much were covered. So as long as Miss is negative, there's still testing work that can be done on the PR. Does that make sense?

Column Cover: I also find this column is kinda confusing. I assume this also reflects a delta part that is presented in a percentage form. For the file dummy/dummy, I’m able to understand how +40% is calculated: 2 (lines coverage increase)/ 5 (total lines) = 0.4.

"Cover" takes the overall coverage before and after the change and shows the difference. It doesn't really take into account which lines were newly covered or not.

For the file dummy/dummy2, I cannot get how “-25%” was calculated. Can you explain?

So for dummy/dummy2, we have a total of 4 statements (over 5 lines).

In the previous version, the file dummy/dummy2 had 100% coverage and looked like this:

def baz():  # line 1, covered
    pass    # line2, covered

Then we introduced changes and now have a file with 4 statements, as per the screenshot. 3 statements are covered (on lines 1, 2, 4) and then 1 statement uncovered (line 5). So that's 75% coverage. And 75% - 100% is -25%. I hope that makes sense now.

For the file dummy/dummy3, it is marked as “-”, and I cannot get it either. What does “-” mean here? Does it mean 0%? And why is it 0%?

For dummy/dummy3, it's a new file. Since that file didn't exist before, we can't do a delta, and thus we show "-". Whether that's the right thing to display is debatable. Would we rather see "-100%" with the assumption that the inexistence of the file equals 0%? That's totally up for discussion. 😄

Column Missing: I assume this column also only shows the delta part. In other words, only the lines that have coverage increase/decrease will be listed here. If the same line gets covered (or uncovered) in both old and new coverage file, it won’t appear in the column, correct?

Yes. If pycobertura didn't have access to the sources (which is possible since they can be optional), it would be a way to see which statements have been covered/uncovered and let the user refer back to their source code to understand what the report is referring to.

We should definitely dedicate a section in the README to explain what each of those are. Would you be interested in contributing in such way?

aconrad commented 2 years ago

Also...

I’ve recently started to use pycobertura for my on-going project at work. This is definitely a great tool and I’m already a big fan of this tool.

Thanks for your kind words, I'm glad pycobertura is useful to other people out there! 🥰

chixiangzhou commented 2 years ago

Hi @aconrad,

Thank you so much for your detailed reply. I really appreciated it! I absolutely have a much better understanding now. :)

Still have a couple doubts that need your clarification.

"Miss" is the number of missed (uncovered) lines introduced. We don't actually take covered lines into account, however much were covered.

Still a bit confused about the definition of Column "Miss". If "Miss" only cares about the uncovered lines introduced, then take the dummy/dummy file for example.

Miss should show "0" in this column, since there were NO uncovered lines introduced and no red lines are highlighted. However, it shows "-2" (in green) here, which refers to the 2 green lines 5 & 6. I assume green means coverage increased? If so, looks like it is contradictory to the definition that Miss is the number of uncovered lines introduced?

So as long as Miss is negative, there's still testing work that can be done on the PR. Does that make sense?

I might miss something here. Correct me if I'm wrong. Did you mean to say "So as long as Miss is positive" here, instead of "negative"? As in the example dummy/dummy, two lines (Lines 5 & 6) changed from uncovered to covered, and "Miss" is marked as -2. So I assume "negative" means coverage increased? If so, the developer did a good job on testing work to improve the coverage.

In the previous version, the file dummy/dummy2 had 100% coverage and looked like this:

def baz(): # line 1, covered pass # line2, covered

The above definitely is a very useful piece of information to help the readers understand how it resulted in "-25%" in Column "Cover". Without this info as a context, it's really hard to know we should use 100% as the previous coverage in the formula.

For dummy/dummy3, it's a new file. Since that file didn't exist before, we can't do a delta, and thus we show "-". Whether that's the right thing to display is debatable. Would we rather see "-100%" with the assumption that the inexistence of the file equals 0%? That's totally up for discussion. 😄

Adding a new file is a fairly common use case in software development. IMO, we should do a delta here, with the assumption that the inexistence of the file equals 100%. When the developer adds that new file later and didn't cover any lines (exactly the case in the example dummy/dummy3), the coverage is 0% now. So the column "Cover" indicates 0% - 100% = -100%. In other words, those uncovered new lines caused the legacy N (tech debt) uncovered lines to go up. Does this make sense to you?

We should definitely dedicate a section in the README to explain what each of those are. Would you be interested in contributing in such way?

Sure. I'd be happy to contribute to this diff command section in the README. I strongly recommend to show the coverage info (before and after the changes introduced) in the diff dummy files examples, which will definitely clear a lot of confusions.

aconrad commented 2 years ago

Still a bit confused about the definition of Column "Miss". If "Miss" only cares about the uncovered lines introduced, then take the dummy/dummy file for example.

Miss should show "0" in this column, since there were NO uncovered lines introduced and no red lines are highlighted. However, it shows "-2" (in green) here, which refers to the 2 green lines 5 & 6. I assume green means coverage increased? If so, looks like it is contradictory to the definition that Miss is the number of uncovered lines introduced?

In dummy/dummy, we have 1 new statement introduced (line 6 with the + sign) which was covered (1 less miss), and then line 5 was previously uncovered but is now covered (1 less miss, totaling to -2 misses).

Imagine a different scenario for dummy/dummy: if we didn't have a test to call bar() then we'd have "+1" in the Miss column because the new statement on line 6 would have not been covered so we would introduce 1 new miss. Line 5 would have remained uncovered and thus we wouldn't account for any change in coverage status.

So as long as Miss is negative, there's still testing work that can be done on the PR. Does that make sense?

I might miss something here. Correct me if I'm wrong. Did you mean to say "So as long as Miss is positive" here, instead of "negative"?

Oh whoops, yes you are right. 😅

As in the example dummy/dummy, two lines (Lines 5 & 6) changed from uncovered to covered, and "Miss" is marked as -2. So I assume "negative" means coverage increased? If so, the developer did a good job on testing work to improve the coverage.

That's right. If it's green, it's good, and Miss "-2" would be green, which means that the developer wrote tests to help decrease the number of missed (uncovered) lines.

The above definitely is a very useful piece of information to help the readers understand how it resulted in "-25%" in Column "Cover". Without this info as a context, it's really hard to know we should use 100% as the previous coverage in the formula.

I wrote a paragraph in the README where we shouldn't rely on coverage rate, which is why we don't color it green or red. You can read about it here. When I wrote pycobertura, I was on the fence about whether I should even keep the coverage rate column or drop it because it can be misleading.

Adding a new file is a fairly common use case in software development. IMO, we should do a delta here, with the assumption that the inexistence of the file equals 100%. When the developer adds that new file later and didn't cover any lines (exactly the case in the example dummy/dummy3), the coverage is 0% now. So the column "Cover" indicates 0% - 100% = -100%. In other words, those uncovered new lines caused the legacy N (tech debt) uncovered lines to go up. Does this make sense to you?

Yes. So if we didn't show - then I think -100% makes the most sense to me. Wanna submit a PR for that? 😄

Sure. I'd be happy to contribute to this diff command section in the README.

Thanks!

I strongly recommend to show the coverage info (before and after the changes introduced) in the diff dummy files examples, which will definitely clear a lot of confusions.

Yes, that's another improvement we could do as well. Can you file a ticket for that?

chixiangzhou commented 2 years ago

In dummy/dummy, we have 1 new statement introduced (line 6 with the + sign) which was covered (1 less miss), and then line 5 was previously uncovered but is now covered (1 less miss, totaling to -2 misses).

Thanks for your explanation, @aconrad ! Now it makes sense to me. 😄

Okay, on top of this understanding, I found the number for Column Miss in dummy/dummy2 doesn't seem right to me. Please refer to the screenshot below:

We have 2 new statements introduced (Lines 2 & 4 with + sign), which are covered now (2 less misses). And line 5 was previously covered but now is NOT covered (1 more miss), which is totaling to -2 + 1 = -1 miss.

Did I miss anything here?

Yes. So if we didn't show - then I think -100% makes the most sense to me. Wanna submit a PR for that? 😄

I'd be happy to submit a PR. However, I'm a python newbie and it might take me some time to learn and ramp up. Also, given my current workload, I might not be able to promise any timeline about when I can finish this. I'll try my best, but just wanted to let you know there is no commitment for now. Hope you understand. 😅

Yes, that's another improvement we could do as well. Can you file a ticket for that?

Sorry, this is my first time to use github to ask questions. Can you please let me know where to file a ticket? 😅

aconrad commented 2 years ago

Okay, on top of this understanding, I found the number for Column Miss in dummy/dummy2 doesn't seem right to me. Please refer to the screenshot below:

We have 2 new statements introduced (Lines 2 & 4 with + sign), which are covered now (2 less misses). And line 5 was previously covered but now is NOT covered (1 more miss), which is totaling to -2 + 1 = -1 miss.

I understand the confusion here, and that might be more of a UI issue than a data issue (see explanation below). Technically speaking, we don't manipulate the "Miss" number, besides subtracting the total number of missed statements between the new and old coverage reports.

To better explain, I think it's best if we look at how the code was executed for dummy/dummy2:

# old code
def baz():  # line 1 (hit)
    pass    # line 2 (hit)

# old test
def test_baz():
    from dummy.dummy2 import baz
    baz()

Coverage result – Miss: 0 (we hit all statements)

# new code
def baz():   # line 1 (hit)
    c = 'c'  # line 2 (hit)
             # line 3
def bat():   # line 4 (hit)
    pass     # line 5 (miss)

# new test (unchanged)
def test_baz():
    from dummy.dummy2 import baz
    baz()

Coverage result – Miss: 1 (we hit 3/4 statements)

Diff miss: +1

The confusion you are running into is because when we render the source code and overlay the coverage information, we might "double-highlight" lines that were changed. In the example above, line 2 was already covered, only the code changed. Technically speaking, since line 2 was already covered, we could consider "this line was already covered, let's not report/highlight it". But pycobertura diff reports changes in coverage AND code changes at the same time! That's intended for a better user experience.

I think if the diff report showed a before/after for each column, things would make more sense when you look at it like you do.

I'd be happy to submit a PR. However, I'm a python newbie and it might take me some time to learn and ramp up. Also, given my current workload, I might not be able to promise any timeline about when I can finish this. I'll try my best, but just wanted to let you know there is no commitment for now. Hope you understand. 😅

No worries then, at the very least a ticket would be nice, so someone can attempt to work on it when they have time. 😉

Sorry, this is my first time to use github to ask questions. Can you please let me know where to file a ticket? 😅

I meant to create an issue (issue == ticket), just like you did here. The reason to create a new ticket instead of reusing this one is that this one was more a discussion than a bug/feature request. I hope this helps.

aconrad commented 2 years ago

Maybe "Missing" should only display the individual missing lines (no green lines, just red ones). My intention was to show a change of coverage/code status across lines but it might be more confusing than useful.

chixiangzhou commented 2 years ago

I meant to create an issue (issue == ticket), just like you did here. The reason to create a new ticket instead of reusing this one is that this one was more a discussion than a bug/feature request. I hope this helps.

Done. Created two tickets #144 and #145

Maybe "Missing" should only display the individual missing lines (no green lines, just red ones). My intention was to show a change of coverage/code status across lines but it might be more confusing than useful.

Agreed. Developers usually care more about the red ones (lines that become uncovered), and work on the tests to address them.