CICE-Consortium / CICE

Development repository for the CICE sea-ice model

code coverage by test suites #89

Closed. eclare108213 closed this issue 4 years ago

eclare108213 commented 6 years ago

Implement testing options to maximize how much of the code is exercised. We can use the Icepack tests (though some namelist options are different) and will need additional tests for dynamics and infrastructure. Consider using codecov.io, as suggested by @anders-dc:

Some automated tools can be used to test for "code coverage", a metric for measuring what parts of the code are being covered by the applied test suite. I use the tool codecov.io to visualize the results and you can see an example here:

This is the status page for a repository of mine: https://codecov.io/github/anders-dc/Granular.jl?branch=master

Here is an overview of the source folder: https://codecov.io/gh/anders-dc/Granular.jl/tree/master/src

And this is what a single source file looks like: https://codecov.io/gh/anders-dc/Granular.jl/src/master/src/temporal.jl

Green lines are hit by the test suite and red lines are never invoked. You can see that my tests hit most of the lines of code in this file, while not being perfect. This overview can help me design tests for parts that are untouched, and also get rid of abandoned code that was never used anyway. The algorithm is clever enough to not count comments and other irrelevant lines.

I think it is excessive to strive for 100% code coverage, and high code coverage obviously does not guarantee bug-free code. However, I have found that the metric helps build confidence in the testing, and it keeps me on my toes about writing tests for newly developed code.

ghost commented 6 years ago

Thank you for considering this.

I will begin implementing the code coverage in Icepack. It should not require much beyond adding the --coverage flag to FFLAGS/CFLAGS in the Travis CI build environment. With this flag enabled, a set of coverage files is generated when the test suite runs; these files are then inspected with gcov.
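A minimal sketch of that workflow, assuming gfortran/gcc and the codecov.io bash uploader (the suite driver name here is a placeholder):

```sh
# Compile with coverage instrumentation; each object file gets a
# companion .gcno notes file at compile time (gfortran/gcc assumed).
export FFLAGS="-O0 --coverage"
export CFLAGS="-O0 --coverage"
export LDFLAGS="--coverage"
make

# Running the tests writes .gcda counter files next to the objects.
./run_test_suite.sh   # placeholder for the actual suite driver

# gcov merges .gcno/.gcda into annotated .gcov listings, which the
# codecov.io uploader then collects and posts.
gcov *.F90
bash <(curl -s https://codecov.io/bash)
```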

eclare108213 commented 6 years ago

I've suggested some namelist combinations at https://docs.google.com/spreadsheets/d/1dMzOC2GAsGqXJvs7xHybCQkPRneeUm78Ah1OXp34bAQ/edit?usp=sharing

Column B is the default ice_in values, and columns C through I are my proposed groupings. There is some overlap, but I don’t think that’s a bad thing. The remaining columns are existing options, except M (gx1prod, for production) which I’d like to add to the repo, and S ("individual options") which is exactly that — I didn’t want to make a separate column for each one of them.

The items in red need to be created or changed from what we currently have in cice. (And we can do things differently from what I suggest here.)

I have not included any zbgc options yet. I think @njeffery is working on that. I’m hoping they can be included in columns F through I, but maybe not because zbgc is sensitive to thermo choices. We might be able to do the skeletal-layer BGC tests in F or I.

I have not included history options other than the ones that are already being tested.

There are a few other items that aren’t covered, and I’m not sure they all need to be. Let’s see what @anders-dc's codecov comes back with. Also, I did not include the hundreds of numerical parameters in the list, only a few that are relatively new or need to be set with a particular flag, or that control whole parameterizations.

As with Icepack, we’ll probably find that some of the options do not work together, but this is a start. Also, I noticed in some of the set_nml.* files that there are namelist entries which aren’t available (yet) for CICE. I would expect those to cause the code to crash! Or the code might be doing a default rather than what we think it is…

proteanplanet commented 6 years ago

Thanks Elizabeth,

This looks great. May I suggest that in the initial phase the tests be run with every imaginable compiler check switched on, to catch problems with incompatible namelist options? That's how I have detected some in the past, including through checks on uninitialized variables being used.

apcraig commented 6 years ago

@eclare108213, I will try to put some of these new options and new tests in place. I think a general question is: do we want to create individual options that turn on specific pieces, so a test ends up named something like dyneap_tdedd_kdyn1cbubbly, or do we want to keep some of the individual options (which I think are generally useful) but also create options called testA, testB, testC (or some other name) that group many of the settings? Any opinions?

apcraig commented 6 years ago

Just to add quickly: the testA, testB, testC approach is quite convenient, but you can't really tell from the test name what is being tested, whereas the long, multiple-option way of describing a test provides more insight when results are posted.

eclare108213 commented 6 years ago

I agree, but the names get really long. We should keep small collections of namelist combinations that need to all be changed together, for everyone's convenience. I'm not sure we need every individual change in a separate file, but I'm not opposed to that. I was thinking that our tests might be named something more useful, even if they don't include every item in the name. I'm traveling for the next few days but hope to have time to look at this. In the meantime, please do whatever you think makes most sense...

apcraig commented 6 years ago

I agree that an optimal strategy might be somewhere in between: group multiple namelist settings together to some degree, then set up tests with maybe 3-4 options. Maybe we can group them by dynamics, physics, and a couple of other categories? Another idea might be to set up options files that contain a list of other files, so files.testA might contain a list of set_nml and set_env files or something; a sketch of that idea follows below. It would be pretty easy to get the scripts to do this. Then the options are set in just one place, and other "higher level" settings can pick them up. I will play around with these ideas and see what seems to work. We can continue to evolve the implementation until we find something we like.
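A rough sketch of that grouping idea, purely illustrative (files.testA, its contents, and the loop below are hypothetical, not the actual CICE script logic):

```sh
# Hypothetical grouping file: one existing options file per line.
cat > files.testA << 'EOF'
set_nml.alt01
set_nml.debug
set_env.threads
EOF

# Expand the group: append each referenced set_nml/set_env fragment
# in order, so later entries override earlier ones.
while read -r opt; do
  cat "configuration/scripts/options/${opt}" >> combined_options
done < files.testA
```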

eclare108213 commented 6 years ago

Looking through the alt01, alt02, etc. options in the spreadsheet, I realize that alt01 is not consistent: kdyn=0 (dynamics off) while basalstress and formdrag are on. It's not a bad idea to test that kind of combination, but it's not a useful test of basalstress and formdrag. basalstress=T also for alt03, so maybe formdrag should be moved to that one too.

apcraig commented 6 years ago

I have set up some new options in the scripts and am testing the "new" alt01 now. I can't get it to run; it blows up. I have tried a few different namelist combinations. Even just setting NICELYR=1, kcatbound=1, kitd=0, ktherm=0 with no pond tracers results in an abort. I think it may take some careful testing to come up with cases that actually run. Maybe what I can do is set up the test configurations as described in the document in the CICE scripts and create a PR, even if they fail. Then we can all work through the various configurations to get some reasonable ones working?

eclare108213 commented 6 years ago

Are you starting from the restart file? If so, try starting from an ice-free initial condition (I think ice_ic='none'). If the ITD or thermodynamics is different, it will blow up using the restart. If no ice grows because of the forcing, we'll need to fix that...
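For example, a small options file along these lines would force an ice-free start (set_nml.icefree is a made-up name here; ice_ic and restart are existing CICE namelist variables):

```sh
# Hypothetical options file forcing an ice-free initial condition,
# so a changed ITD/thermo configuration is not handed an
# inconsistent restart state.
cat > configuration/scripts/options/set_nml.icefree << 'EOF'
ice_ic = 'none'
restart = .false.
EOF
```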

phil-blain commented 5 years ago

I just want to bump this; I think having an easy way to visualize coverage is very important. I just noticed that the routines 'global_sum' and 'global_sum_prod' in ice_global_reductions.F90 are never called in the code with the 'lmask' argument present... For cross-reference purposes, see CICE-Consortium/Icepack#202

apcraig commented 5 years ago

Thanks @phil-blain, this is on my task list but I have tended to defer it. I will make some time in the next month to have a look. I agree it's something that should have been set up months ago.

apcraig commented 4 years ago

A code coverage capability has been added to Icepack and CICE with PRs https://github.com/CICE-Consortium/Icepack/pull/313 and #434. The results can be seen here:

https://codecov.io/gh/CICE-Consortium/Icepack
https://codecov.io/gh/apcraig/Test_CICE_Icepack

The CICE results come from a separate repository (apcraig/Test_CICE_Icepack) because codecov.io's analysis does not support submodules, so a special repository is created for coverage testing that incorporates Icepack directly rather than as a submodule. This is covered in the documentation.

For now, I think we will run the code coverage testing as needed (maybe every few months) to assess coverage and update test suites.
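For reference, a plausible invocation for such a run, assuming the PRs above expose coverage through a --coverage option to cice.setup as described in the documentation (the machine name and test id are placeholders, and a gnu compiler environment is assumed since the instrumentation is gcc/gfortran-based):

```sh
# Build and run a full test suite with coverage instrumentation
# enabled; the reporting step then uploads results to codecov.io.
./cice.setup --suite base_suite --mach <machine> --env gnu \
             --testid cov01 --coverage
```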

Both Icepack and CICE contain a badge in their README.md files that points to the coverage results, and that also seems to be working.

I will be making a revision to the documentation, but believe we can close this now.

phil-blain commented 4 years ago

@apcraig, why not create the combined repo in the CICE-Consortium organization? It would appear more "official", no?

apcraig commented 4 years ago

My worry is that the community might use it for other things. Eventually, I hope codecov will support submodules and then we can test directly out of the Consortium CICE repo. Until then, I prefer this "fake" repo be hidden from the community and not appear in the Consortium organization. If others disagree with this strategy, we could easily migrate the test_cice_icepack repo.