lemurheavy / coveralls-public

The public issue tracker for coveralls.io
http://coveralls.io
124 stars 7 forks source link

Coverage report in coveralls seems incorrect #1736

Open AbdealiLoKo opened 1 year ago

AbdealiLoKo commented 1 year ago

I am seeing cases where the coverage in coveralls does not match what I expected.

I have a test (which is passing in my CI) that goes something like this:

def test_ordering(self, dialect):
    statement = select().add_columns(Column("text_col", Text())).order_by(Column("text_col", Text()))
    with pytest.raises(AssertionError, match='"Sort By" should not be done on a Text column as it is not supported in Oracle.'):
        statement.compile(dialect=dialect)

This is the coverage report I see in the coveralls UI: image

Something was fishy here cause :

  1. A raise statement is not a branching statement. So, why is coveralls telling me that some branches were not tested ?
  2. My test passed. And it was asserting for this exact error message (and I checked this error message is not present anywhere else) - so I could not understand what test am I expected to write to cover this...

So, I ran the testcase with pytest --cov --cov-report=html in my laptop.

This is the coverage report I see in htmlcov: image

According to this, that line is fully covered. If you'd like to check the issue - you can access it at https://coveralls.io/builds/63824265

AbdealiLoKo commented 1 year ago

One more note is that in my project - I have had other developers raise the same issue. This has been plaguing us for the past month - before that nobody spoke to me about it at least.

This happens in typescript with karma-coverage also, so it is not a pytest specific issue.

Other info that may be helpful:

afinetooth commented 1 year ago

Hi, @AbdealiLoKo. Thanks for the detailed description and the link. Always helpful. 🙏 One more thing that will help though is a link to the source file that's showing this branch coverage on or around Line 516-518. Can you please share that so I can examine it more closely? I'll do my best to find it in the meantime.

In general, though. Your observation seems totally accurate:

My initial thoughts / questions:

  1. Perhaps, since the raise is part of a conditional / control flow statement, the discrepancy relates to the Line number. It will help me to see the exact file and build so I can look at the underlying data and display logic to verify there's no issue there.
  2. Are you reporting branch coverage? in other words, do you have coverage.py / pytest-cov configured to include branch coverage in your reports? If not, I noticed that branch coverage is ON in your settings, so perhaps that's a source of the discrepancy.
  3. Finally, Coveralls doesn't make decisions about the coverage of any lines in any files of your project. It is neutral on such decisions and just takes at face value what your coverage report tells it, and assumes it's accurate. So it will also help me to know the source file (full path) so I can see what your original coverage report says about it.

To answer your follow-up comments / questions:

One more note is that in my project - I have had other developers raise the same issue. This has been plaguing us for the past month - before that nobody spoke to me about it at least.

Yeah, that's no good. Please share any additional instances that your colleagues have shared and I'll check them like this one.

This happens in typescript with karma-coverage also, so it is not a pytest specific issue.

Good to know. I will be interested in examining both Coveralls coverage reports (JSON) and, perhaps, compare them to the original reports in their original formats if it comes to that.

Other info that may be helpful:

  • We upload lcov files to coveralls - is that alright ? or is something else recommended ?

That's perfectly fine, but if you weren't aware, our official integrations have received major upgrades over the past year, and our Universal Coverage Reporter, which is now the underlying integration for v2 of our Coveralls GitHub Action, supports numerous different coverage report formats natively, including pytest-cov (whereas v1 of the action used the node-coveralls language integration under-the-hood, which only supports LCOV and is now somewhat out-of-date as well (we've even been recommending a more up-to-date fork of that project, coveralls-next).

Generally speaking, if you can use an official integration, many benefits to that.

Perfect. Try upgrading to v2. That version is smarter too, so you might even be able to step back from any current configuration settings and let it run in default mode, in which it should be able to automatically find and determine the correct format of your coverage report (as long as it's in some common locations).

Try starting with the standard usage example, which is a two (2) line configuration in default mode, or the parallel build usage example if you're running parallel builds.

  • We break our CI into multiple machines and are sending multiple lcov files from each machine.

Then you probably are running parallel builds. Shouldn't be an issue.

Let me know if you need any help updating your Coveralls integration(s).

The other benefit of using the official integration, BTW, as applies in this case with your use of the Coveralls GitHub Action, is that you can use one integration for multiple languages / report formats and thereby standardize (not to mention simplify) your Coveralls integrations.

AbdealiLoKo commented 1 year ago

Thanks for the reply @afinetooth

Answers to the initial questions:

  1. I have sent this info on the the mail id in your github profile
  2. Yes, my .coveragerc has:
    [run]
    branch = true
  3. I have sent this on email

Some replies to the other comments:

  1. I would say its tougher to reproduce the cases one by one to post them in a public forum as these are private repos. So, lets solve this issue and I can then report more as folks report it to me.
  2. First a note on the official integrations. We currently use coverallsapp/github-action@v2 which is also what you seem to be recommending. So, looks like we're doing the recommended thing.
  3. On LCOV - Looks like the recommendation here is to not specify any particular file but just let the uploader do its thing to "find the best file format". Will try this out - thanks for the note ! Minor note that the Usage section currently says:

    The action's step needs to run after your test suite has outputted an LCOV file.

Might need to be updated for new users

AbdealiLoKo commented 1 year ago

I was apprehensive that I could create a MRE. But I was able to reproduce the issue in a public repo here - https://github.com/AbdealiLoKo/coveralls-issue

Original issue - Minimum Reproducible Example:

I tried removing the files parameter so that coveralls github action can detect the coverage files on its own.

I don't know if I can download the coverage files from coveralls ... So I have kept all the coverage information in the github artifacts in the CI builds if you want to check them

Update: I tried to run my tests in a single machine, and my coveralls coverage increased from 86% -> 89% without any change in source or tests.

Here is the new build with all my tests running in 1 github worker:

This seems to indicate there is some issue in how coveralls is "merging" the multiple coverage files that are being sent to it. (Both .coverage and lcov files merging have this issue)

AbdealiLoKo commented 1 year ago

I seem to be seeing this more and more Another place where a assignment statement is saying there is a branch image

afinetooth commented 1 year ago

@AbdealiLoKo thanks for the updates. Apologies for the delay.

As we discussed in email, since you have a paid subscription, you can get support by directly emailing support@coveralls.io, or by scheduling a live support call. This board is for open-source users and meant for community response, though we do try to sweep new issues here at least once/week.

We have talked about some of these things elsewhere, but let me reply here as well for consistency and to fill any gaps:

  1. I have sent this info on the the mail id in your github profile

Got that and replied. You'll want to use support@coveralls.io going forward. 🙏

  1. Yes, my .coveragerc has:
[run]
branch = true

Great. Good to know.

  1. I have sent this on email

Got it. Have forwarded to support@coveralls.io and you will get a reply from there.

Some replies to the other comments:

  1. I would say its tougher to reproduce the cases one by one to post them in a public forum as these are private repos. So, lets solve this issue and I can then report more as folks report it to me.

Absolutely, and this is another reason it's best to use support@coveralls.io---because paid subscriptions are for private repos and private repos may be sensitive, so while it's fine to leave a question here, we recommend also following up in email to support@coveralls.io with any other details that could be sensitive.

  1. First a note on the official integrations. We currently use coverallsapp/github-action@v2 which is also what you seem to be recommending. So, looks like we're doing the recommended thing.

Excellent. Thanks for clarifying.

  1. On LCOV - Looks like the recommendation here is to not specify any particular file but just let the uploader do its thing to "find the best file format". Will try this out - thanks for the note !

Yes, and in your case, the format will be coverage.py/pytest-cov, and Coverage Reporter (the integration running "underneath" the Coveralls GitHub Action v2 will look for a .coverage file.

Minor note that the Usage section currently says:

The action's step needs to run after your test suite has outputted an LCOV file. Might need to be updated for new users

Yes! Thanks so much for catching that, we'll update that. (v1 only supported lcov).

afinetooth commented 1 year ago

Also wanting to reply in full here to cover any gaps:

 

I was apprehensive that I could create a MRE. But I was able to reproduce the issue in a public repo here - https://github.com/AbdealiLoKo/coveralls-issue

First of all, thanks so much for taking the time to reproduce an example.

I would say that, in the future, you can avoid creating public repos in order to generate MREs that you can share in this forum, just because you can feel free exchanging original examples from, or MREs created in, private repo's in email (support@coveralls.io).

But again, thanks for going to the effort here. It's very helpful to see the example free of context. And your insights are spot on. I think this could be related to coverage report merging.

 

Original issue - Minimum Reproducible Example:

OK, just to clarify:

Screenshot 2023-11-25 at 10 21 26 AM

Now, here's an observation, and a discovery while examining your second example, which I think is a likely factor:

 

I tried removing the files parameter so that coveralls github action can detect the coverage files on its own.

So again, just to clarify:

Screenshot 2023-11-25 at 10 47 09 AM

But we have a difference here per the observation and discovery I mention above:

Three (3) comments on that:

  1. That situation---ingesting more than one report---would most likely result in a merge operation, which you highlighted as the possible context for the root cause error. So we'll consider that.
  2. However, it's also notable that this situation did not change the final results. The raise statement is still being tagged as a partially covered branch.
  3. To be perfectly honest, I'm not entirely sure what the expected result should be upon finding two identical reports in different formats for the same fileset, or even that a merge is guaranteed there, so I will have to update you here about that after examining the coverage reports and comparing them to the previous example's.

(Again, these examples are very helpful, and interesting, so thanks again.)

Update (WIP...)

 

I don't know if I can download the coverage files from coveralls ... So I have kept all the coverage information in the github artifacts in the CI builds if you want to check them

Right now, that's not something we make available to end users (we're are considering it), but we do store them and I can access them as an admin of Coveralls. (As I said, we are considering making those available, but the feature would require careful UX design because of all the different permutations, and, at this point, that work has not begun.)

But thank you for storing your original reports as artifacts! I suspect we will need to compare the Coveralls JSON versions to the original versions to see if the originals were misread by the Coverage Reporter, whether or not that entailed a merge operation.

 

Update: I tried to run my tests in a single machine, and my coveralls coverage increased from 86% -> 89% without any change in source or tests.

Here is the new build with all my tests running in 1 github worker:

This seems to indicate there is some issue in how coveralls is "merging" the multiple coverage files that are being sent to it. (Both .coverage and lcov files merging have this issue)

Fascinating.

First of all, I did not realize that your two jobs were parallelized test runs. That's correct, though, right?

It took me a minute to realize that you are splitting the tests for a single test suite across two manually-defined jobs in your workflow, by using the split conditions:

And that, for this third example, you simply removed the first split condition for myapp1-coverage-data and let it be all the tests, vis-a-vis myapp-coverage-data.

(I am more used to seeing parallelization declared as an input option on a single job, that tells CI to split the jobs tests in the background.)

In any case, this should not be a problem, since Coveralls will receive as many coverage reports as you want to send and will merge them according to your instructions---the most common being to give all reports you want merged the same [flag-name](https://github.com/marketplace/actions/coveralls-github-action#:~:text=flag%2Dname,the%20Coveralls%20UI.).

So, another observation and two (2) recommendations on this:

So, in your case, I would recommend keeping your two "parallelized" jobs the same, and just making sure you give each the same **flag-name** like so:

jobs:
  test-myapp1:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
      - name: Install dependencies
        run: |
          python -m venv venv
          venv/bin/pip install -e .
          venv/bin/pip install -r requirements.txt
      - name: Run tests
        run: |
          venv/bin/pytest \
            -k "not test_text_order_by" \
            --cov --cov-report=lcov --cov-report=html
      - uses: actions/upload-artifact@v3
        with:
          name: myapp1-coverage-data
          path: |
            test_results/**
            htmlcov/**
      - name: Send coverage data to Coveralls
        uses: coverallsapp/github-action@v2
        with:
          file: test_results/myapp/coverage.lcov
          flag-name: myapp
          parallel: true

  test-myapp2:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
      - name: Install dependencies
        run: |
          python -m venv venv
          venv/bin/pip install -e .
          venv/bin/pip install -r requirements.txt
      - name: Run tests
        run: |
          venv/bin/pytest \
            -k "test_text_order_by" \
            --cov --cov-report=lcov --cov-report=html
      - uses: actions/upload-artifact@v3
        with:
          name: myapp2-coverage-data
          path: |
            test_results/**
            htmlcov/**
      - name: Send coverage data to Coveralls
        uses: coverallsapp/github-action@v2
        with:
          file: test_results/myapp/coverage.lcov
          flag-name: myapp
          parallel: true

  generate-coverage-report:
    runs-on: ubuntu-latest
    needs: [test-myapp1, test-myapp2]
    steps:
      - name: Coveralls
        env:
          github-token: ${{ secrets.GITHUB_TOKEN }}
        uses: coverallsapp/github-action@v2
        with:
          parallel-finished: true

Note that I'm simply giving each upload job the same flag-name, flag-name: myapp.

Also note that I'm encouraging the removal of lcov from the flag-name here, per my second recommendation, which is to stop converting your native coverage report to LCOV.

afinetooth commented 1 year ago

Just didn't want to leave this un-replied to:

I seem to be seeing this more and more Another place where a assignment statement is saying there is a branch image

That said, my message above should explain why you are probably seeing this sort of thing and how you can avoid it going forward, which is basically to follow the two (2) best practice recommendations above (Recommendation 1 | Recommendation 2).

afinetooth commented 1 year ago

@AbdealiLoKo I just wanted to summarize the above, since it represents a lot of reading, which I'm sorry for, but I hope addresses all the details you shared in the same spirit in which you shared those super helpful examples:

This also aligns with the analysis and recommendations I gave your colleague @indiVar0508, who also posted in this forum here (and previously here), whom I also met in a live support call about these very same issues (though using different examples).

Summary

I noticed two (2) antipatterns in your Coveralls config / setup and made two (2) best practice recommendations that I think should resolve the issues you're having.

That said, your examples did show some unexpected results that, even in the context of the "antipatterns" still seem like they could be bugs. In other words, Coveralls should probably handle the antipattern cases more gracefully, or else raise an exception, so I have reported those issues internally.

Antipatterns

  1. Antipattern 1: Converting a supported coverage report format in a way that results in duplicate reports: You are using an official Coveralls integration---the Coveralls GitHub Action, which uses the Universal Coverage Reporter under-the-hood---which supports your native coverage report format (coverage.py/pytest-cov), but you are also converting your native reports into lcov format. In addition to being unnecessary, this generates two coverage report files that, in default mode, the Coverage Reporter will find and ingest both of, which may lead to unexpected results when trying to merge two identical reports in different formats.
  2. Antipattern 2: Not merging parallelized coverage reports: You are "parallelizing" the runs of a single test suite in CI, but are not merging those reports, either in CI, or at Coveralls. This persists two (2) coverage reports from the same test suite, each with only a partial view of the coverage of each source file. While the aggregate coverage report that Coveralls generates for your builds should factor out those partial views, source file views for the individual Coveralls Jobs may end up showing less than a full picture of each file's coverage, since Coveralls aims to persist Job coverage to show how it impacts total coverage. In other words, Jobs, or Subprojects, typically represent the coverage report for a single test suite operating on a single fileset, and as a best practice we should aim to keep it that way. The canonical examples for each of these components of a Coveralls Parallel build are: (A) Jobs: where one job represents the front-end test suite, and another represents the back-end test suite for a multi-tier app; and (B) Subprojects: where each subproject represents the test suite for a different submodule, or package, of a monorepo.

Recommendations

  1. Don't convert your supported coverage report format - At least don't do so in a manner that results in multiple coverage reports in different formats. Or else, use the file: input option to specify which of the coverage reports the integration should use.
  2. Merge parallelized test suite runs - Which you can do on the CI side, or on the Coveralls side by making sure you associate the same flag-name with each parallelized coverage report upload.

Conclusions

This support request has been very instructive on our end, and your examples have been very helpful in uncovering some unexpected use cases that are, I'm sure, more likely than we would have expected. Given that, there are a couple of learnings here for us, as well as a couple of action items:

  1. Learning: It's unclear to end users that they don't need to convert coverage reports in supported formats, into other formats, primarily LCOV, which used to be the only format supported by v1 of our official integrations, and this is made worse by this misleading statement in our Coveralls Action README, leftover from v1: "The action's step needs to run after your test suite has outputted an LCOV file." Thanks for catching that. We will fix it.
  2. Learning: It's not clear to users that they should merge parallelized coverage reports, as opposed to coverage reports from parallel jobs. The term "parallel" as we apply it to Coveralls parallel builds, and the "parallel jobs" that comprise them, is too close to the term "parallelization" and a stronger distinction needs to be made between those two terms and their definitions so that end users understand that parallelized test runs (for single test suites) should be merged into a single job at Coveralls (possibly, itself a "parallel job"). We will think about how to best highlight this distinction as we update our docs and READMEs.
  3. Action Item: We need to update the Coveralls GitHub Action README to fix this misleading statement (hopefully gone the next time anyone clicks that link).
  4. Action Item: While an atypical situation, it appears that Coveralls may have a bug when trying to merge two identical coverage reports in different formats. The job of a Coveralls integration is to find, convert and upload a native-format coverage report into a Coveralls JSON-format coverage report, and upload it to the Coveralls API. In this scenario, it's crucial that any merge operation happen after each report has been converted to Coveralls JSON-format, which is the current order of operations; however, it's possible that we don't currently handle the case of two different formats of the same report being present at once. I have submitted the issue to our engineering team for review.

Thanks! 🙏

AbdealiLoKo commented 1 year ago

Antipattern 1: Understood Antipattern 2: This one doesn't seem like a rather common pattern. Because there are many projects that test for py3.7, py3.7, py3.8 and then send the coverage file for each Python version to coveralls-like services. This would most likely cause issues with coveralls

afinetooth commented 11 months ago

Antipattern 2: This one doesn't seem like a rather common pattern.

Understood. I am basically thinking of any scenario where someone in parallelizing test runs for performance (such as by using the parallelism setting in CircleCI (ex.parallelism: 8) and sending those to us with assigning each run the same flag_name (which merges those on our side).

Probably not common for your use case (or CI), but for those using parallelism, do this for uploads that will happen parallelism: n times:

  - coveralls/upload:
      flag_name: job1
      parallel: true

I'll close this for now. Just re-open if it becomes relevant again (or start a new issue). 🙏

afinetooth commented 11 months ago

@AbdealiLoKo I wanted to circle back to this issue, even though it was closed, by me, after making the best practice recommendations above.

I have since discovered more about the root cause here (after working on a similar issue) and would like to give you some additional direction to make sure you aren't still seeing issues like raise statements being treated as branch coverage.

First let me revisit the best practice recommendations, which still stand (I've condensed them here):

  1. Avoid duplicate coverage reports in multiple formats, or merge such reports before uploading.
  2. Merge parallelized coverage reports (same report across different runners) by giving them the same flag-name.
  3. Be more specific with the input options, format and file (or files)

So, I think that, given those recommendations, you were going to stop converting your .coverage file to LCOV (coverage.lcov). Which should have been a solution, however, we have since discovered an issue in our coverage.py parser (which is still in beta) and, while we are in-progress on a fix that should be released in Jan, I would recommend making these changes to your CI config:

    - name: Coveralls Parallel
      uses: coverallsapp/github-action@v2
      with:
        flag-name: whatever
        parallel: true
        format: lcov
        file: test_results/myapp/coverage.lcov
    - name: Coveralls Parallel
      uses: coverallsapp/github-action@v2
      with:
        flag-name: whatever
        parallel: true
        format: cobertura
        file: test_results/myapp/coverage.xml

I'd love to know if you're still seeing the issue with raise statements being considered branch coverage and whether the above changes fix that for you.

Thanks.

AbdealiLoKo commented 11 months ago

I went with Option 1.

Did not solve the issue:

AbdealiLoKo commented 11 months ago

I tried option 2 now. And it seems like with cobertura that issue does not exist anymore.

Note that even trying with different flag-names did not give the same issue. So, the culprit seems to be with lcov files only (sighs)

afinetooth commented 11 months ago

@AbdealiLoKo regarding this attempt:

I went with Option 1.

Did not solve the issue:

Coveralls build: https://coveralls.io/jobs/133885565/source_files/19184053316 [...]

I see that Coverage Reporter processed two coverage reports, each one covering all files in the project:

Screenshot 2023-12-28 at 9 20 34 AM

And each contained branch coverage data; and, in particular, marked Line 18 as being a branch:

CORRIDOR - PUBLIC ISSUES 1736 - CleanShot 2023-12-28 at 09 24 55

CORRIDOR - PUBLIC ISSUES 1736 - CleanShot 2023-12-28 at 09 22 44

So this suggests the original coverage report(s) (either coverage.lcov or .coverage) are treating the raise statement as a branch.

Maybe we can look into the original report to verify.

Are you able to share the original .coverage and coverage.lcov reports?


Update: Interestingly, I noticed that this is your CI config, from that commit: https://github.com/AbdealiLoKo/coveralls-issue/blob/31e3e04e32ea9a3b5c75fd8a12b30655cf8391cc/.github/workflows/test.yml

Screenshot 2023-12-28 at 9 44 47 AM

Which, AFAIK, should have instructed Coverage Reporter to only parse your coverage.lcov file.

Did you happen to parallelize your test suite in CI? (Such that we would have received two uploads of data from your coverage.lcov file?)


Update: Nevermind, I see that you are indeed sending two jobs: one for test-myapp1 and one for test-myapp2.

I don't recall what your break-up of duties is there, but both seem to be sending coverage data for all files in the project.

afinetooth commented 11 months ago

Regarding your second attempt:

I tried option 2 now. And it seems like with cobertura that issue does not exist anymore.

It's clear you're using the cobertura (XML) format report(s):

Screenshot 2023-12-28 at 9 52 55 AM

And the two (2) coverage reports:

Screenshot 2023-12-28 at 9 53 34 AM

Show no branch coverage for Line 18:

CORRIDOR - PUBLIC ISSUE 1736 - CleanShot 2023-12-28 at 09 54 41

CORRIDOR - PUBLIC ISSUE 1736 - CleanShot 2023-12-28 at 09 55 21

So, either there is an issue with the original coverage.lcov file, or an issue with our lcov parser.

I assume the issue is with our lcov parser, but, again, if you can share your original .coverage and coverage.lcov files, that will help me perform a full investigation.

🙏 🙏

AbdealiLoKo commented 11 months ago

Here are the 2 things being generated for myapp1 and myapp2 It contains all 4 coverage.py's formats: .coverage, .lcov, .XML, and .html

myapp1-coverage-data.zip myapp2-coverage-data.zip