Improve test coverage - Githubissues

fredrik-johansson commented 1 year ago

A first test coverage report is available here:

This includes coverage from running both make -j check and make check PYTHON=1 (some gr code is currently only tested via python_ctypes).

The report excludes coverage of the test files themselves.

This is with the default test_multiplier setting; it would be interesting to diff the coverage for different multiplier values to identify rarely-reached cases and add specific test inputs for them.
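
A minimal sketch of what such a diff could look like, assuming the per-line hit counts of one source file have already been extracted from two coverage runs (for example one at the default multiplier and one at a larger value). The function name and the array layout are hypothetical and not part of FLINT or gcov.

```c
#include <stddef.h>

/* Hypothetical helper, not part of FLINT or gcov.
   base[i] and more[i] hold the hit counts of source line i + 1 in the
   baseline run and in the run with a larger multiplier, respectively.
   Writes the 1-based numbers of lines reached only in the second run
   to out (which must have room for n entries) and returns how many
   such lines were found. */
static size_t example_newly_covered(const unsigned long *base,
                                    const unsigned long *more,
                                    size_t n, size_t *out)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (base[i] == 0 && more[i] > 0)
            out[count++] = i + 1;

    return count;
}
```

Lines reported this way are candidates for rarely-reached cases that may deserve dedicated test inputs.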

fredrik-johansson commented 1 year ago

Some standout modules with low coverage:

fmpz_lll (this one is a bit scary)
n_poly
fq_nmod_mpoly_factor
fq_zech_mpoly_factor
nmod_mpoly_factor
fmpz_mod_poly_factor
fq_zech_mpoly
gr* (expected since this is a work in progress)
inlines in several header files
arb_hypgeom
ca
padic_poly, padic_mat, nf_elem
arb_calc
acf
printing functions in several modules

The coverage would probably be higher if we included functions tested via Nemo and Sage. There's also the Calcium test suite in Python that hasn't been merged.

albinahlback commented 1 year ago

All new test coverage can be found here

https://app.codecov.io/gh/flintlib/flint/tree/trunk/

albinahlback commented 11 months ago

We should investigate whether we can exclude certain lines from Codecov. For example, exception-throwing lines like https://app.codecov.io/gh/flintlib/flint/blob/trunk/src%2Ffmpz%2FCRT.c#L77 should not have to count toward the coverage.

albinahlback commented 11 months ago

Looks like you can use __attribute__((no_instrument_function)) to suppress some of these functions.

albinahlback commented 11 months ago

I did not manage to find a use for __attribute__((no_instrument_function)). So, I think we want to do this in several steps.

First, we do not want to count assertions in the coverage, such as flint_printf followed by flint_abort, or flint_throw. Excluding them should highlight what we actually do not test at the moment. I cannot find any resources on how to exclude lines with gcov, so I'm thinking of switching from gcov to lcov or gcovr in order to be able to use exclusion markers within the source code.

Second, we have to agree on which functions we check and which we do not. Those we agree not to check are left out of the coverage. This may only cover functions such as flint_abort, but it may include other functions as well.

Third and last, with everything cleaned up, the number of covered lines in each module should be represented more faithfully. From there, I suggest that we start increasing the coverage from the bottom up, starting with "simpler" modules like ulong_extras and then working our way up to the *_mpoly modules.

fredrik-johansson commented 11 months ago

I'm against adding clutter to the source code for the benefit of code coverage statistics.

The goal is to test or fix reachable code, which exists in abundance, not chasing after the 0.1% or so of obviously unreachable error-handling code.

Of course, if we had a macro similar to FLINT_ASSERT for always-on error handling and the coverage tool could just be made aware of such a macro, that would be acceptable.
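
A minimal sketch of what such a macro might look like, under the assumption that the whole error path sits on the macro's own line. The names EXAMPLE_CHECK, example_fail and example_div are hypothetical and not part of FLINT.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical failure handler, not part of FLINT: print where the
   check failed and abort. */
static void example_fail(const char *file, int line, const char *msg)
{
    fprintf(stderr, "%s:%d: %s\n", file, line, msg);
    abort();
}

/* Hypothetical always-on check. Unlike an assert-style macro that is
   compiled out in release builds, the condition is always evaluated.
   The failure branch expands on the same source line as the check, so
   it adds no separate source line for the error path at the call site. */
#define EXAMPLE_CHECK(cond, msg) \
    do { if (!(cond)) example_fail(__FILE__, __LINE__, (msg)); } while (0)

/* Example caller: integer division that rejects a zero divisor. */
static long example_div(long a, long b)
{
    EXAMPLE_CHECK(b != 0, "division by zero");
    return a / b;
}
```

With this layout a line-based coverage report should count the check as executed whenever the function runs, and a tool that filters by pattern would only need to know the macro name.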

albinahlback commented 11 months ago

> I'm against adding clutter to the source code for the benefit of code coverage statistics.

Sounds reasonable!

> The goal is to test or fix reachable code, which exists in abundance, not chasing after the 0.1% or so of obviously unreachable error-handling code.

It looks like one can use lcov --omit-lines "flint_throw" to omit lines containing flint_throw from the coverage data.

fredrik-johansson commented 11 months ago

OK, that sounds like a good solution.

fredrik-johansson commented 11 months ago

BTW, a bunch of code in the gr modules and elsewhere is currently tested via make check PYTHON=1. I would like to have it covered via C test code, but the Python tests are something of an interim solution. We should add that to CI.

albinahlback commented 11 months ago

> BTW, a bunch of code in the gr modules and elsewhere is currently tested via make check PYTHON=1. I would like to have it covered via C test code, but the Python tests are something of an interim solution. We should add that to CI.

I haven't tried, but can I run make check PYTHON=1 and will it still produce all the intended coverage files?

fredrik-johansson commented 11 months ago

Normally, yes.

(More precisely, you need to run both make check and make check PYTHON=1.)

albinahlback commented 11 months ago

Btw, am I right that we do not have to free anything in case we throw? In some places we have clears before throwing, which I will now remove.

fredrik-johansson commented 11 months ago

We would ideally always clean up temporary allocations before throwing, as that avoids memory leaks when users do intercept the exceptions. But it's a lot of work to do everywhere.
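
A minimal sketch of the pattern, assuming the user intercepts the exception with setjmp/longjmp. All names here (example_throw, example_alloc and so on) are hypothetical stand-ins and not FLINT functions. The allocation counter exists only to make a leak observable.

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf example_handler;      /* hypothetical user recovery point */
static int example_live_allocs = 0;  /* allocations not yet freed */

static void *example_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        example_live_allocs++;
    return p;
}

static void example_free(void *p)
{
    if (p != NULL)
    {
        example_live_allocs--;
        free(p);
    }
}

/* Stand-in for a throw that the user has chosen to intercept. */
static void example_throw(void)
{
    longjmp(example_handler, 1);
}

/* Sum of the first n squares via a temporary buffer; throws if n <= 0. */
static long example_sum_squares(long n)
{
    long i, s = 0;
    long *tmp = example_alloc(sizeof(long) * (size_t) (n > 0 ? n : 1));

    if (tmp == NULL || n <= 0)
    {
        example_free(tmp);   /* clean up before throwing */
        example_throw();
    }

    for (i = 0; i < n; i++)
    {
        tmp[i] = (i + 1) * (i + 1);
        s += tmp[i];
    }

    example_free(tmp);
    return s;
}

/* Intercepts the throw and reports how many allocations are still live. */
static int example_leaks_after_throw(void)
{
    if (setjmp(example_handler) == 0)
        example_sum_squares(0);
    return example_live_allocs;
}
```

Without the example_free call before example_throw, the temporary buffer would still be counted as live after the throw is intercepted.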

edgarcosta commented 11 months ago

And thus, we should keep the existing clears.

albinahlback commented 11 months ago

> And thus, we should keep the existing clears.

Understood ;-)

albinahlback commented 11 months ago

The following functions in arb_hypgeom are not tested even once:

airy_jet.c
airy_series.c
beta_lower_series.c
chi_series.c
ci_series.c
ei_series.c
erf_series.c
erfc_series.c
erfi_series.c
fresnel_series.c
gamma_lower_series.c (author: Arb authors)
gamma_upper_series.c
li_series.c
shi_series.c
si_series.c

@fredrik-johansson could you implement tests for these?

fredrik-johansson commented 11 months ago

These functions are just the acb versions with acb grep-placed by arb. Rather than duplicating the test code as well, I will convert them to generics.

flintlib / flint

Improve test coverage #1250