aconrad / pycobertura

A code coverage diff tool for Cobertura reports
MIT License

Columns "Stmts", "Miss", and "Cover" are all "-" in the pycobertura diff report #147

Closed: chixiangzhou closed this issue 2 years ago

chixiangzhou commented 2 years ago

Hi @aconrad

I recently found a weird issue in the pycobertura diff report. Columns "Stmts", "Miss", and "Cover" are all marked as "-" in the pycobertura diff report.

Please refer to the following screenshot (I crossed out the file names for data privacy purposes): [screenshot]

How do you interpret these "-" signs?

My understanding is:

  1. Stmts: No lines added or deleted
  2. Miss: No more (or fewer) missed lines
  3. Cover: No coverage increase or decrease

In addition, to verify my understanding, I picked one file with all 3 columns marked "-" and checked its before and after coverage files. The developer didn't touch this file, and there were no line changes in it. (In other words, the coverage is exactly the same before and after.)

So I was wondering if it would make more sense to NOT show files that have no code coverage change. In my case, the diff report shows a huge list of such files, which, in turn, prevents the developers from spotting the files that actually have coverage changes. Any thoughts?
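If that behavior were desired, one way to approximate it today would be to post-process the plain-text diff output. The sketch below is a hypothetical helper, not a pycobertura feature: it drops data rows whose Stmts, Miss, and Cover columns are all "-".

```python
def filter_unchanged(report: str) -> str:
    """Drop rows from a pycobertura plain-text diff report whose
    Stmts, Miss, and Cover columns are all "-" (i.e. no change).

    Hypothetical post-processing helper for illustration only.
    """
    kept = []
    for line in report.splitlines():
        cols = line.split()
        # Data rows look like: "<filename> <stmts> <miss> <cover>".
        if len(cols) == 4 and cols[1:] == ["-", "-", "-"]:
            continue  # file had no coverage change; hide it
        kept.append(line)
    return "\n".join(kept)


report = """\
Filename       Stmts    Miss    Cover
----------     -------  ------  -------
1957.cpp       +1       -       +0.21%
unchanged.cpp  -        -       -
TOTAL          +1       -       +0.21%"""

print(filter_unchanged(report))  # "unchanged.cpp" row is removed
```

Header and separator rows survive because their columns are not the literal string "-", so only genuinely unchanged files are filtered out.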

aconrad commented 2 years ago

Hm, very strange. I've never seen this before. Was it working and then it stopped working or was the output always like shown in the screenshot? The only explanation I can think of is an unexpected coverage file format. What language/tool do you use to generate the coverage report in XML? You could take a look at one of the sample XML files in the tests directory of pycobertura and check if your file structure is similar or not.

It's really hard to help without looking at the inputs you've provided to pycobertura. Can you send anything over?
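For anyone comparing their report against the samples, the Cobertura layout pycobertura expects (a `coverage` root containing `packages/package/classes/class/lines/line` elements) can be inspected with the standard library. The XML below is a made-up minimal example, not one of the real files from this issue:

```python
import xml.etree.ElementTree as ET

# Minimal made-up Cobertura-style report for illustration.
sample = """<?xml version="1.0"?>
<coverage line-rate="0.5">
  <packages>
    <package name="example" line-rate="0.5">
      <classes>
        <class name="foo.cpp" filename="foo.cpp" line-rate="0.5">
          <lines>
            <line number="1" hits="1"/>
            <line number="2" hits="0"/>
          </lines>
        </class>
      </classes>
    </package>
  </packages>
</coverage>"""

root = ET.fromstring(sample)
# Check the structural pieces a Cobertura consumer relies on.
assert root.tag == "coverage"
classes = root.findall(".//class")
print([c.get("filename") for c in classes])          # → ['foo.cpp']
print([ln.get("hits") for ln in root.iter("line")])  # → ['1', '0']
```

If a report is missing the `filename` attributes or the per-line `hits` counts, a diff tool has nothing to compare, which could plausibly produce all-"-" rows.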

chixiangzhou commented 2 years ago

Hi @aconrad

Thanks for your reply!

Was it working and then it stopped working or was the output always like shown in the screenshot?

The output was always as shown in the screenshot.

What language/tool do you use to generate the coverage report in XML?

I used the tool GCOVR to generate the coverage report in XML. More details can be found here: https://gcovr.com/en/stable/index.html
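For context, gcovr can emit Cobertura-compatible XML directly. A typical invocation, with hypothetical paths, looks something like this:

```shell
# Run the instrumented test suite first so the .gcda files exist, then:
# -r sets the source root, --xml selects Cobertura-compatible XML output
# (newer gcovr versions also accept --cobertura for the same format).
gcovr -r "$HOME/before/src" --xml -o coverage_before.xml
```

The same command against the "after" tree would produce the second report for `pycobertura diff`.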

You could take a look at one of the sample XML files in the tests directory of pycobertura and check if your file structure is similar or not.

Yes, I checked the sample XML files in the tests directory, and it looks like their file structure is similar to mine.

Can you send anything over?

Yes, I attached a zipped file that contains the coverage files from before and after. (For data privacy purposes, I changed all the file names to numbers.) Note that the coverage files are quite big once you unzip them. Archive.zip

aconrad commented 2 years ago

Thanks!

Oh wow! It takes so long to process those files! 20 minutes and counting with the command pycobertura diff <file1> <file2> on my recent Macbook Pro. I wonder where it spends most of its time...
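Profiling could narrow that down. As a self-contained sketch of the idea (the synthetic report below is made up; only the element names follow the Cobertura layout), the standard-library profiler can wrap whatever work is suspected of being slow, such as parsing a large XML file:

```python
import cProfile
import io
import pstats
import xml.etree.ElementTree as ET

# Synthetic Cobertura-like report with many <line> entries,
# loosely mimicking a large coverage file.
lines = "".join(f'<line number="{i}" hits="{i % 2}"/>' for i in range(50_000))
xml = (
    "<coverage><packages><package><classes>"
    f'<class filename="big.cpp"><lines>{lines}</lines></class>'
    "</classes></package></packages></coverage>"
)

profiler = cProfile.Profile()
profiler.enable()
root = ET.fromstring(xml)  # parse: one candidate hot spot
hit_count = sum(int(ln.get("hits")) for ln in root.iter("line"))
profiler.disable()

# Show the five most expensive calls by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
print(hit_count)  # → 25000
```

Against the real files, the same idea works from the shell: since the pycobertura console script is itself a Python file on a standard pip install, something like `python -m cProfile -s cumtime "$(command -v pycobertura)" diff before.xml after.xml --no-source` should show where the time goes.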

chixiangzhou commented 2 years ago

Oh wow! It takes so long to process those files! 20 minutes and counting with the command pycobertura diff on my recent Macbook Pro. I wonder where it spends most of its time...

Yes, same here. As I mentioned earlier, the two coverage files are really big. 😄

aconrad commented 2 years ago

Since I don't have access to the source code, I ran the following command:

pycobertura diff coverage_before.xml coverage_after.xml --no-source

and got back:

Filename    Stmts    Miss    Cover
----------  -------  ------  -------
1957.cpp    +1       -       +0.21%
2525.cpp    +3       +3      -0.08%
2527.cpp    +50      +50     -1.61%
2532.cpp    +10      +10     -0.06%
3003.cpp    -        -2      +0.27%
TOTAL       +64      +61     +0.00%

... which looks normal to me.

This was done against master. What version of pycobertura are you using? And how did you invoke pycobertura on the command line? The "-" signs just mean that there is nothing to report for that cell.

PS: Out of curiosity, I ran the command with time and got:

pycobertura diff coverage_before.xml coverage_after.xml --no-source  1939.72s user 48.64s system 54% cpu 1:00:17.27 total

I'm putting that info here for later reference in case it comes in handy.

chixiangzhou commented 2 years ago

Hi @aconrad ,

When I used the command you provided, I got the same output as you did.

What version of pycobertura are you using?

I believe it is 2.1.0. I installed pycobertura using the following command: pip install pycobertura

And how did you invoke pycobertura on the command line?

I used the following command to invoke pycobertura:

pycobertura diff --format html \
                 --output diff.html \
                 --source1 $HOME/before/src \
                 --source2 $HOME/after/src \
                 coverage_before.xml \
                 coverage_after.xml

In addition, I noticed the sum of the "Cover" column seems not right. See the screenshot below: [screenshot]

After doing the math by adding up each row of "Cover", I got "-1.27%". However, the diff report shows "+0.00%".

aconrad commented 2 years ago

I used the following command to invoke pycobertura:

That command looks good to me. It's a little tricky to help with the bug you encountered without a full test scenario. Any chance you can put together a minimal viable use case that demonstrates the issue?

In addition, I noticed the sum of Column "Cover" seems not right.

That summary table only shows the files that have changed. The TOTAL row you see is correct but lacks decimal precision. That TOTAL accounts for the overall coverage change for the whole project. Given how large the coverage files are, the 64 changed statements represent less than 0.00% of coverage change (a line-rate change of 0.000001952086655, to be precise). Does that make sense?

$ grep '<coverage' coverage_before.xml | awk '{print $9}'
line-rate="0.027286969467622223"

$ grep '<coverage' coverage_after.xml | awk '{print $9}'
line-rate="0.027288921554277316"
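As a quick sanity check on that arithmetic, subtracting the two line-rate values from the grep output above shows why the TOTAL renders as +0.00% at two decimal places:

```python
# line-rate values taken from the two coverage reports above
before = 0.027286969467622223
after = 0.027288921554277316

delta = after - before       # absolute change in line-rate (~1.95e-06)
delta_pct = delta * 100      # change expressed in percentage points

print(f"{delta_pct:+.2f}%")  # → +0.00%
print(f"{delta:.15f}")       # the raw line-rate delta
```

Any delta smaller than 0.005 percentage points in magnitude will display as +0.00% in a two-decimal report.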
chixiangzhou commented 2 years ago

Any chance you can put together a minimal viable use case that demonstrates the issue?

For data privacy reasons, I cannot directly send over the source code. Perhaps I can send over a modified minimal viable use case that shows the issue. Let me think about it.

That summary table only shows the files that have changed. The TOTAL row you see is correct but lacks decimal precision. That TOTAL accounts for the overall coverage change for the whole project. Given how large the coverage files are, the 64 changed statements represent less than 0.00% of coverage change (a line-rate change of 0.000001952086655, to be precise). Does that make sense?

Thanks for your detailed explanation! Yes, it makes sense to me now.