NOAA-FIMS / FIMS

The repository for development of FIMS
https://noaa-fims.github.io/FIMS/
GNU General Public License v3.0
16 stars 9 forks source link

[Bug]: fix code coverage reporting #242

Closed k-doering-NOAA closed 1 year ago

k-doering-NOAA commented 2 years ago

Describe the bug

Reports of code coverage are being uploaded by 2 separate job, 1 or the code coverage of g test and another for the codecoverage of test that. This leads to two different reports being pushed up to code cov, and thus not reliable code coverage values.

originally found in https://github.com/NOAA-FIMS/FIMS/issues/182

To Reproduce

See https://app.codecov.io/gh/NOAA-FIMS/FIMS

Expected behavior

A consistent single value for code coverage that includes coverage from c++ and r tests. Alternatively, following only 1 test.

Screenshots

No response

Which OS are you seeing the problem on?

No response

Which browser are you seeing the problem on?

No response

Which version of FIMS are you seeing the problem on?

No response

Additional Context

No response

Code of Conduct

k-doering-NOAA commented 2 years ago

@Bai-Li-NOAA I created this new issue so we can track progress on this.

I see what the problem is, but I'm not sure what the solution would be. A start would be to only track code coverage for 1 type of test, gtests or test that. Do you a priori see one framework as more important to track for FIMS than the other?

Bai-Li-NOAA commented 2 years ago

Thanks @k-doering-NOAA for creating this issue. I'm not sure what the solution would be either. It is hard to say one framework is more important than the other because it would be great if developers know if they need to write more C++ tests and R tests before merging in their code.

I think the C++ test coverage is sent to codecov every time we run run-googletest.yml while R test coverage is sent to codecov 5:30 after we make a pull request to the main branch. That may be why we get two code coverage values. I cannot remember why the R testthat coverage is scheduled at 5:30. Do you think we can modify the two workflows and make them submit the two reports simultaneously when we make a push/pull request to the main branch? It also helps developers figure out whether to write more tests before merging a pull request.

k-doering-NOAA commented 2 years ago

That seems like a good goal, for developers to see what their test that/ c++ code coverage is. I think if we are making and sending 2 reports, the stats will still look weird on codecov.io, but maybe what is most important for now is that developers know the code cov calcs for their PR and how those compare to the main branch.

k-doering-NOAA commented 1 year ago

Currently the r test that coverage is turned off and we push the gtest coverage reports up to codecov.io. @Andrea-Havron-NOAA suggested that maybe we could use another service besides codecov.io to push one of our reports too, since calculating a joint code coverage is tricky.

Andrea-Havron-NOAA commented 1 year ago

For example, coveralls could be used to calculate C++ test coverage and codecov.io for R test coverage

kellijohnson-NOAA commented 1 year ago

I found a few articles on how to combine the results of the code coverage from the R tests and the google tests into a single metric. We just have to combine running of the tests into a single action.

references

k-doering-NOAA commented 1 year ago

@MOshima-PIFSC and I met today and learned alot!

First, we learned that we have codecov running for each of the windows, linux, and mac builds in the run-googletest.yml github action, which seems duplicative! We added a change on a branch to only run the code cov step on Linux.

From @kellijohnson-NOAA's merging reports resource, it sounds like we could be uploading an R report and a C++ report, and code cov can figure out what to do with it! So we should be able to turn the R code coverage back on and be set!

However, there are some deeper issues with our coverage reports. We then looked at the FIMS codecov.io site (https://app.codecov.io/gh/NOAA-FIMS/FIMS?branch=main , which you can navigate from our codecov icon on the readme) and noticed there are a lot of reports uploaded on each commit, many that are not valid.

I looked into this a bit more and realized I don't think code coverage is finding a coverage file when we run gtest - it is just finding the call-calc-codecov.yml file and assuming that is a report, even though it is not. (eg., see this step: https://github.com/NOAA-FIMS/FIMS/actions/runs/5800255746/job/15721968546#step:12:32) However, it does have some 'valid' reports, supposedly from the run-googletest.yml, even though I don't see it in the ghactions log.

@Bai-Li-NOAA do you know if there is one created when we run google test? I tried to look, but I couldn't find anything :(

Bai-Li-NOAA commented 1 year ago

@k-doering-NOAA, you are right. Maybe we should delete the codecov section from the run-googletest workflow? I kind of remember that I was able to run covr::package_coverage() on codespaces to get both C++ and R code coverage for FIMS. But I cannot reproduce it at this moment because my personal codespaces account only lets me install a 4-core machine and I cannot load FIMS there.

k-doering-NOAA commented 1 year ago

@Bai-Li-NOAA that is absolutely brilliant! For some reason it never crossed my mind that the r covr::codecov() function might include c++ coverage!

I do think it is probably not accounting for coverage by googletest, only by testthat tests that run c++ code, but I think this is a far better metric so I opened a pull request.

@MOshima-PIFSC and I have some time scheduled next week to work on this, so maybe we can investigate how to push up a proper report from googletest. In theory, Codecov.io should be able to reconcile the 2 reports and give us total coverage. :)

(an aside: I also could only create a 4 core machine, and somehow was able to run covr::package_coverage() on it. Trying to run on windows did not work for me with FIMS, although I could run the function to calc coverage on other packages when I was on windows).

MOshima-PIFSC commented 1 year ago

@k-doering-NOAA I found this article about using Codecov for C++ tests that might be helpful for us. In the screenshot of their github action workflow, there is a "Generate Report" section where the HelloCov.xml is created. I think this is the coverage report that gets uploaded to Codecov in the next step. Maybe we can try this for the google tests?

k-doering-NOAA commented 1 year ago

Thanks @MOshima-PIFSC, this is really helpful! The approach makes sense, although I don't quite understand all the details. If we run into issues with OpenCppCoverage (looks like it is for windows only), it looks like there are other tools like gcov (for Linux) and a few examples of using it with gtest

MOshima-PIFSC commented 1 year ago

@k-doering-NOAA and I met today and had some success. We discovered that to produce the coverage reports we can install gcovr using pip install gcovr and then run gcovr. This generated html reports of coverage for each file. One thing we realized is that it is creating reports for a lot of files that we don't need, like the tests themselves. It would be best to figure out how to exclude certain directories. The gcovr documentation site was helpful for determining how to generate reports. We still need to figure out how to get the reports to Codecov. I also just found this Github Action that looks like it might produce the gcovr reports and has an option to exclude directories. Now that we understand a little better what is going on, this might be an easier way to generate the reports in the github action.

k-doering-NOAA commented 1 year ago

@MOshima-PIFSC and I met - we started a GitHub action which can generate a gcov coverage report (code coverage from the google tests)!

Codecov still is flagging the report generated as an unusable report, though, and I'm not sure why. Other paths to explore are trying the .xml format from gcov and trying the gcov option in the code cov upload report github action.

MOshima-PIFSC commented 1 year ago

I changed the format to .xml and it seems to work. Codecov is no longer saying that report is unusable. However, it is still reporting coverage for the tests, so we need to fix that. I included the --exclude-directories tests/$ flag to the gcovr, however it's not working. It's probably something small like a syntax error I am missing. Documentation for excluding directories is here.

k-doering-NOAA commented 1 year ago

Code coverage report on main is looking good! Our coverage across R and C++ code from the gtest and testthat is 83% 🏆