fauvernierma / survPen

hazard and excess hazard modelling with multidimensional penalized splines
GNU General Public License v3.0

Automated tests #8

Closed seabbs closed 5 years ago

seabbs commented 5 years ago

As part of this review openjournals/joss-reviews#1434

Looking at the testing script in the README, it looks like it is the same code used in the vignette (I may be wrong about that!). I ran it through, but as the expected output is not listed it is difficult to know whether everything is working as expected!

Automated tests can be a great framework for

  1. Writing tests that others can use
  2. Catching errors as they slip in
  3. Establishing whether all of the software is being tested.

I would really like to see a suite of testthat tests (adapted from the current testing script), ideally run on each commit using something like Travis. A test coverage report from Codecov would also be great.

To get all this working, you need to do the following:

  1. Look at the testthat documentation and write unit tests for each function (based on your current testing script). See here (😸) for some examples, and the sketch after this list. These can then be run from the package root using devtools::test().
  2. Set up Travis (and optionally AppVeyor for Windows) to run these tests, using usethis. See here for an example.
  3. Set up Codecov to generate coverage reports, again via usethis. See here for an example.
  4. Add markdown badges with the test status and coverage to your README. See here for an example.
  5. Get all tests passing and code coverage to >80%.
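
For example, a unit test for the main fitting function could look something like the sketch below (the model call is borrowed from the vignette's datCancer example; the object names and expectations are assumptions to be adapted to the real output):

```r
# Sketch only: assumes the datCancer data and the smf() smooth constructor
# shown in the survPen vignette; adapt names to the real objects.
library(testthat)
library(survPen)

test_that("survPen fits a penalized hazard model without error", {
  mod <- survPen(~ smf(fu), data = datCancer, t1 = fu, event = dead)
  # the fit should return a survPen object with finite coefficients
  expect_true(inherits(mod, "survPen"))
  expect_true(all(is.finite(mod$coefficients)))
})
```
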
fauvernierma commented 5 years ago

@seabbs, I have tried the testthat approach and, although I found it interesting, when it comes to survPen I have two main concerns with it:

- I see the value of testthat when the tests are created as the code is being developed, but now that the package is fully operational it feels a bit unnatural to try to develop the tests after the fact.
- In the survPen package, I basically have one function: survPen. It therefore feels quite clumsy to try to test every little detail of each inner function.

Would it not be more appropriate to provide a text file with the expected output and PDFs of the expected graphs? This is what I did here: https://github.com/fauvernierma/survPen/tree/master/tests
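
For instance, such a reference file can be generated along these lines (a sketch of one way to do it; the script and file names are assumptions, not necessarily how the existing files were produced):

```r
# capture the console output of the full test script into a reference file
# that others can compare their own run against
out <- capture.output(source("tests/test_survPen.r", echo = TRUE))
writeLines(out, "tests/expected_output.txt")
```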

Thank you

seabbs commented 5 years ago

The updated tests look improved but I don't think they are sufficient.

These are the JOSS guidelines:

- Good: An automated test suite hooked up to an external service such as Travis-CI or similar
- OK: Documented manual steps that can be followed to objectively check the expected functionality of the software (e.g. a sample input file to assert behaviour)
- Bad (not acceptable): No way for you the reviewer to objectively assess whether the software works

As a minimum, I need to see considerably more documentation in the current testing script. This needs to outline:

The advantages of automated unit tests for both myself and your future users are as follows,

For the developer

I realise that proper testing seems like a lot of work for nothing, but it will definitely pay for itself in terms of user trust and reduced future development time. As the JOSS guidelines state, adequate manual tests are acceptable, but personally I would be very wary of any package without proper tests unless it already had a large user base (e.g. mgcv).

corybrunson commented 5 years ago

I'd like to second @seabbs's previous comment. I'm able to manually run the test code and compare it to the provided output and PDFs, but it's a time-consuming and error-prone process.

While using a testing package and a continuous integration service is optional, they make adding and editing tests much more efficient.

As an aside, whether to include tests for the helper functions may depend on whether they are intended to be user-facing (see #14).

fauvernierma commented 5 years ago

Ok, thank you both

I implemented some tests using testthat (for now it is pretty basic stuff, but it covers the majority of the code) and used Travis and Codecov to get badges in the README.

Do you think it looks OK?

Also, to use testthat with Travis I had to add testthat as an Import in the DESCRIPTION file. Is there any other way? For instance, when I look at your example code in getTBinR, @seabbs, you seem to have done it differently.

Thanks again

corybrunson commented 5 years ago

@fauvernierma thanks very much for the update.

The new tests confirm that the package is producing the expected kinds of objects (or throws errors when appropriate), and in some key places (excess versus total hazard, concordant penalized likelihoods) confirm that the results are meaningful. They look reasonable to me.

The advantage of producing full output tests like those in test_survPen.r is that the complete output can be inspected and compared to expected results. It seems appropriate to keep this file as well. Alternatively, as @seabbs suggested, these calculations could be made into testthat scripts that compare freshly-calculated values to "known" values. (For reference, the visual tests could be automated using vdiffr, for example, though that'd take some additional getting used to.)
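
For instance, one of those manual calculations could become a regression test along these lines (a sketch under assumptions: the fixture file name, the tolerance, and the element returned by predict() are illustrative and would need to match the package's actual output):

```r
test_that("predicted hazard matches stored reference values", {
  mod  <- survPen(~ smf(fu), data = datCancer, t1 = fu, event = dead)
  pred <- predict(mod, data.frame(fu = c(1, 3, 5)))
  # "reference_haz.rds" is a hypothetical fixture saved from a trusted run
  ref  <- readRDS(test_path("reference_haz.rds"))
  expect_equal(pred$haz, ref, tolerance = 1e-6)
})
```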

I'm OK either way, since the package now contains both some basic behavioral tests and some manual calculative tests. I'd recommend eventually automating the calculative tests. Though i'd defer to @seabbs's more extensive experience!

Regarding dependency, i'd put testthat in Suggests rather than Imports, which should be enough for Travis-CI to attach it when running the tests.

seabbs commented 5 years ago

These look much better @fauvernierma - good job. 👍

I am also okay with just the base functionality having unit tests and the overall functionality having manual tests. It would be great if you could completely automate testing, but this isn't a blocker for acceptance for me (as @corybrunson says, doing it eventually would be good).

As @corybrunson suggests, adding testthat (and any other dev package) to Suggests is the way to go.
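
If it helps, usethis can record that dependency for you without editing DESCRIPTION by hand (a one-liner, assuming usethis is installed):

```r
# adds "testthat" to the Suggests field of DESCRIPTION
usethis::use_package("testthat", type = "Suggests")
```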

As a final part of testing, could you expand your Travis grid (and potentially use AppVeyor as well) to include more versions of R and other platforms (i.e. Mac/Windows)? See here for an example.

fauvernierma commented 5 years ago

I added testthat to Suggests (416a58f5f8b604e0e1cf624324c5da274f631ebb) and updated the .travis.yml file (49a7ec2170b31cc54dba1a7c6dfe981a10be49c1).

I think it's OK, but I am not sure that the package has been tested on Mac OS. Despite the .travis.yml file, it looks like only Linux has been taken into account.

Anyway, I guess this test answers another question about the R version that survPen actually needs (R version 3.2 seems to be a good minimum prerequisite).

Thanks to both of you and have a good day

seabbs commented 5 years ago

Thanks for adding that.

The default OS on Travis is Linux, so you don't need to specify it in the os section. I'm not an expert in configuring Travis, but I think if you add os: osx that should work. You should then see an apple icon in the Travis build list.

It would be great to see the package tested against Windows as well. I think Travis now supports this, but I generally use AppVeyor.

Yes, that is one of the handy benefits! Generally, I follow the tidyverse approach and drop the oldest R version from the testing grid as newer versions come out (e.g. I recently went from 3.2 to 3.3 when 3.6 came out).

fauvernierma commented 5 years ago

Hi, concerning osx it is all OK now (4331167ec22ee1c961410c04b510b3038cbc7d97).

However, when it comes to Windows things are a bit trickier, as R is not yet available on Travis for that platform. I tried to use AppVeyor, but I am struggling with it a bit; I am not even sure how it works or how to copy and paste someone else's yml file. Do you happen to have a minimal example .yml file?

Thanks a lot

seabbs commented 5 years ago

Yes, try using this one: https://github.com/seabbs/getTBinR/blob/master/appveyor.yml
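
If copying that file over is awkward, usethis can also drop a template appveyor.yml into the package root for you (assuming your version of usethis still exports use_appveyor(); the template can then be tweaked as needed):

```r
# writes a starter appveyor.yml at the package root
usethis::use_appveyor()
```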

I'd be very keen on the unit tests being expanded to be more meaningful and less mechanical. Looking at them, it looks like quite a few are just checking the class etc., which isn't really testing anything (I may be wrong here :)).

fauvernierma commented 5 years ago

ok thanks

It seems to be OK now (e9aa9a6bf8d3db6b2a6d907d2bf2a757d8e9f802).

I have also added examples for all functions except three (inv.repam, cor.var and print.summary); these three are therefore no longer exported.

I will work on the new examples to provide better tests

Thanks

fauvernierma commented 5 years ago

I added more tests, mostly concerning errors (b871687e7a4af7c6b27d023b327980430778c243), the internal Newton algorithms (6c04e755dbe531154d1b33ebade959c65d9b0d0a) and some expected prediction values (aabfb8b4b91149c720ea6fc95c0eefc6b769c5431).
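
As an illustration of what I mean by error tests, the checks are of this kind (a sketch only, not the actual test file; the call assumes the datCancer example data and an obviously invalid covariate name):

```r
# an error-path test: referencing a covariate absent from the data should
# make the model fit fail
test_that("survPen errors when a covariate is missing from the data", {
  expect_error(survPen(~ smf(not_a_column), data = datCancer, t1 = fu, event = dead))
})
```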

Hopefully this is closer to what you expected. But don't hesitate to tell me if it is not :)

Thank you

corybrunson commented 5 years ago

The testthat tests now test the structure of the output and the behavior of print() and summary() methods as well as classes, but also—most importantly for me—some of the numerical results (namely, the predictions). It would be good to also test the coefficient estimates, as was done by the original test files, but i think i'm morally satisfied with the current tests. Thank you @fauvernierma for adding these!

@seabbs it may be that statistical model-fitting software really should include specific tests that aren't here yet, and i'd like to defer to you on that—this also being an issue you originally raised, of course. : )

seabbs commented 5 years ago

Thanks for adding those @fauvernierma - much improved.

My only remaining concern is that so many of the tests are just checking the inherited class. I may be wrong, but is that not just checking that any given function runs without an error?

The issue with this is that it inflates the test coverage and won't protect you (or other developers) should something change down the line. Personally, I would like to see these tests dropped or replaced with some more meaningful tests. What do you think, @corybrunson?

Otherwise, I am happy with the current state of the tests - good job. Adding more tests would be great (I'd really encourage you to add as many as possible, either now or as development continues :smile:), but the current set is already more thorough than what many packages have.

Depending on what @corybrunson thinks about class tests I am happy to consider this done. 👍

fauvernierma commented 5 years ago

Hi @seabbs, you are right, the inherited class test is a way to check if an error occurred.

They are clearly not perfect; that's why I added the prediction test file, which is much more meaningful. And of course we agree that this is still an ongoing process :)

seabbs commented 5 years ago

Do you think the current testing suite is sufficient @corybrunson? I noted my concerns above but am happy to close this and leave them to later updates if you think that makes sense?

corybrunson commented 5 years ago

@seabbs thanks for checking. : ) I think my take on this issue is the same as on the performance demo, in that i have some suggestions for improvements but that they rise to a level slightly higher than the acceptance requirements. The real demonstration of a package's utility is that people use it, and i'll bet that use cases will more precisely indicate the sorts of tests that are needed.

seabbs commented 5 years ago

No problem! I agree, nice to have but not essential here.

@fauvernierma I would really recommend taking another look at your tests when and if you have a chance. Still, these are much improved so good job 😄

I am happy to close (and will do so).