Tests - Githubissues

lauken13 commented 3 years ago

Survey Data

[x] error occurs if adding extra survey data that has less rows than the main data
[x] warning occurs if adding a column that has a duplicate name
[x] error occurs if there is an input dimension mismatch when creating a new survey data obj
[x] error occurs if survey weights do not match the number of rows in survey data
[ ] error if survey design is not specified in a survey package consistent way
[x] error if weights with NAs are given
[x] error if not same number of questions and responses
[x] error if provided responses don't correspond to actual factor levels in the data
[x] print method produces expected output

SurveyQuestion

[x] error if 'name' specified incorrectly
[x] error if 'col_names' specified incorrectly
[x] error if 'values_map' specified incorrectly
[x] warning if duplicated values in 'values_map'
[x] error if not a many to one mapping
[x] print method produces expected output

SurveyFit

[x] population_predict returns object with the right structure
[x] a custom population_predict method works properly
[x] error if a custom population_predict method is required but not provided
[x] plot method works properly and that appearance doesn't change (note: done in PR #75)
[x] aggregate method returns objects with correct structure (with and without 'by')
[ ] poststratification error is given if poststrat table hasn't been created (this one is hard to check in aggregate because a postatrat matrix is needed to fit a model)
[x] predictions for a given model (brms, rstanarm, lme4) match a presaved benchmark
[x] warning if fitting with functionality other than lme4, brms, rstanarm (because custom predicition method will be needed
[x] error if sim_posterior_epred() is used with a non-glmerMod
[x] error if poststrat table hasn't been created yet
[ ] print method produces expected output
[x] fit method returns fitted model object
[x] map method returns map object

SurveyMap

[x] error if samp obj is not a SurveyData object
[x] error if popn_obj is not SurveyData object
[x] add method throws error if existing column has the same name
[ ] print method does not fail if there are unlinked/mapped columns
[x] print method produces expected output (note: done in PR #72) Check for a different scenarios that new data has the correct levels:
[x] Svy: 18-25, 26-45, 46; Popn 18 - 45, 46+
[x] Svy: 18-25, 26+; Popn 18-25, 26-34, 35+
[ ] Svy: 18-25, 26-30, 31- 40, 41-55, 56+; 18- 25, 26-35,36 - 45, 46-55, 56+ Check that a subset of levels in the raw sample data matches the new sample data
[x] Svy: 18-25, 26-45, 46; Popn 18 - 45, 46+
[x] Svy: 18-25, 26+; Popn 18-25, 26-34, 35+
[ ] Svy: 18-25, 26-30, 31- 40, 41-55, 56+; 18- 25, 26-35,36 - 45, 46-55, 56+ Check that a subset of levels in the raw population data matches the new population data
[x] Svy: 18-25, 26-45, 46; Popn 18 - 45, 46+
[x] Svy: 18-25, 26+; Popn 18-25, 26-34, 35+
[ ] Svy: 18-25, 26-30, 31- 40, 41-55, 56+; 18- 25, 26-35,36 - 45, 46-55, 56+
[x] error if model fit is other than binomial/bernoulli family
[x] error if variables in formula aren't in data
[x] error if data hasn't been mapped yet
[x] error if data is given as an argument
[x] fit errors if the rhs of the model are not column names of the poststrat matrix (#20)
[x] stan_glmer runs with binomial formulation
[x] stan_glm runs with binomial formulation
[x] brm runs with binomial formulation
[x] lme4 runs with binomial formulation
[x] custom model fitting method works

Other validation

[ ] validate function fails when map name isn't in population
[ ] validate function fails when map name isn't in the sample
[ ] validate function throws error when map levels aren't in the population
[ ] validate function throws error when map levels aren't in the sample
[ ] warning if all of the sample data variables are in the population

End to end:

[ ] no warnings are outputted in an end to end run of the code with lme4
[ ] no warnings are outputted in an end to end run of the code with brms
[ ] no warnings are outputted in an end to end run of the code with rstanarm
[ ] example code MRP estimate is close to truth using lme4
[ ] example code MRP estimate is close to truth using brms
[ ] example code MRP estimate is close to truth using rstanarm

lauken13 commented 3 years ago

Not a comprehensive list - please feel free to add as you think of things

lauken13 commented 3 years ago

@Jonah will add the files for the restructure.

20/04 - Before the next meeting write one test each @lauken13 @mitzimorris @Dewi-Amaliah @RohanAlexander @jgabry

jgabry commented 3 years ago

We now have a branch called "tests" where everyone can put the tests they are adding

jgabry commented 3 years ago

I edited the checklist to remove the ones for checking that stan and lme4 warnings are printed. I think we can safely assume that they will be printed. Also, because the warnings from Stan are not deterministic it can be difficult to test (e.g., with Stan the same seed on mac and windows can result in different numbers of divergences, etc. ).

jgabry commented 3 years ago

@lauken13 For this one

Check bernoulli and binomial model fits in rstanarm model produces fits that are within tolerance

what does "within tolerance" mean in this case?

jgabry commented 3 years ago

Another question: for this one

Check that an error occurs if adding extra survey data that has less rows than the main data

there's no method for adding more data to a SurveyData object once it's created (which seems fine, it can just be recreated with more data if desired), so I'm not sure what this is referring to. Or did I misunderstand the purpose of this one?

jgabry commented 3 years ago

Another thing that occurs to me is that if we use "suggested" packages like rstanarm in our tests we need to use them conditionally (i.e., make sure they're installed) or CRAN will complain.

Inside of an individual test we can use skip_if_not_installed(...) but outside of the tests (e.g. if running models at the top of the file) we'll need to manually check if the package is installed, e.g. using if (!requireNamespace(pkg, quietly = TRUE)) ...

lauken13 commented 3 years ago

@lauken13 For this one

Check bernoulli and binomial model fits in rstanarm model produces fits that are within tolerance

what does "within tolerance" mean in this case?

Maybe this will be captured from other things, I just thought if we have a benchmark for this (and it can be wide) then we might catch variable matching errors from the steps before this. Not so much testing the rstanarm function but as a tool to test that everything beforehand is working as expected.

Another question: for this one

Check that an error occurs if adding extra survey data that has less rows than the main data

there's no method for adding more data to a SurveyData object once it's created (which seems fine, it can just be recreated with more data if desired), so I'm not sure what this is referring to. Or did I misunderstand the purpose of this one?

Isn't add_survey_data_column a public function? Maybe it's not intended for user use thiough?

Another thing that occurs to me is that if we use "suggested" packages like rstanarm in our tests we need to use them conditionally (i.e., make sure they're installed) or CRAN will complain.

Inside of an individual test we can use skip_if_not_installed(...) but outside of the tests (e.g. if running models at the top of the file) we'll need to manually check if the package is installed, e.g. using if (!requireNamespace(pkg, quietly = TRUE)) ...

Yeah that's easy to fix, I'll do that now.

jgabry commented 3 years ago

there's no method for adding more data to a SurveyData object once it's created (which seems fine, it can just be recreated with more data if desired), so I'm not sure what this is referring to. Or did I misunderstand the purpose of this one?

Isn't add_survey_data_column a public function? Maybe it's not intended for user use thiough?

Oh I forgot about that, you're right! When I added that method it was mostly so that the SurveyMap object could modify the SurveyData object if necessary and not really intended for users, but a user could use it if they want so you're right we should test it.

jgabry commented 3 years ago

using if (!requireNamespace(pkg, quietly = TRUE)) ...

oops I meant requireNamespace(...) not !requireNamespace(...), sorry. I just fixed it in the test files.

lauken13 commented 3 years ago

Oops - I think I changed it on my version anyways.

On Thu, Apr 22, 2021 at 9:02 AM Jonah Gabry @.***> wrote:

using if (!requireNamespace(pkg, quietly = TRUE)) ...

oops I meant requireNamespace(...) not !requireNamespace(...), sorry. I just fixed it in the test files.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mitzimorris/mrp-kit/issues/38#issuecomment-824419196, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA5ZMGHQDLESYO3VX772TT3TJ5KQTANCNFSM422QWYJQ .

jgabry commented 3 years ago

I think we should either avoid checking off items until the tests for them pass (instead of just when there's a test) or put a note in the checklist if you add it but it's still failing. There are a bunch of failing tests but it's not clear which are failing for a known reason and which are failing because there's a mistake somewhere (either in the test or in the package).

jgabry commented 3 years ago

[x] Check for error if not a many to one mapping

This test for SurveyQuestion used to pass and is now failing, I think as a result of commit f0eb209d8fa0f16b3c877b31420571557912af33. I'll fix this for now, but going forward we should remember to update any tests affected by changes to the R code. This will be easier when we set up the tests to run automatically, but in the meantime we should manually run the tests when we make changes to the R code.

jgabry commented 3 years ago

Also when adding tests, if the test you add is supposed to fail (e.g., because the necessary code isn't implemented) please note that in a comment next to the test code itself. Otherwise it's super hard to tell which tests need to be fixed and which tests are failing because underlying package code needs to be fixed.

jgabry commented 3 years ago

One more suggestion about tests that's hopefully helpful going forward. There's no need to add a test that only tests that something doesn't error unless the only thing that's possible to test is that it doesn't error. If it's possible to test anything else (e.g. the structure of the returned object, other aspects of the behavior, etc.) then you just test that and those tests will automatically also serve as tests that it doesn't error. To that end, I replaced a bunch of tests in test-SurveyFit.R that were only testing for no errors with tests that check the structure of the returned object (which also therefore automatically test that it doesn't error).

lauken13 / mrpkit

Tests #38