[Closed] HedvigS closed this 1 year ago
Some remarks on the test code:
- Put your test files into tests/testthat.
- Name them something like tests/testthat/test-binarise.R.
- Structure the test code in these little blocks:
test_that("multistate features are being binarised", {
data_with_multistate_features <- ...
outcome <- binarise(data_with_multistate_features) %>% ...
expected <- ...
expect_equal(outcome, expected)
})
test_that("binarised features are not being overwritten", {
data_with_binarised_features <- ...
outcome <- binarise(data_with_binarised_features) %>% ...
expected <- ...
expect_equal(outcome, expected)
})
(like this, basically)
If you do that you can just hit a key combination in RStudio (Cmd+Shift+T, apparently) and it will automatically find all your tests and run them. You'll get a nice summary in the Build tab:
ℹ Testing rgrambank
[...]
✔ | F W S OK | Context
✔ | 2 | db
══ Results ═════════════════════════════════════════════
[ FAIL 0 | WARN 0 | SKIP 0 | PASS 2 ]
Advantages:
You can get into a really nice workflow like ‘Change code – run tests – fix code – run tests – add test – run tests – fix code – run tests’ etc.
Github can be set up to run tests automatically every time someone pushes to master or submits a PR, which is reeeally nice (we do that for our python programs and I assume it's probably possible with R as well).
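(For R this is indeed possible. As a rough, illustrative sketch using the conventional r-lib/actions setup — the exact file name, branch names, and step versions here are assumptions, not something from this repo:)

```yaml
# .github/workflows/R-CMD-check.yaml (illustrative sketch)
on:
  push:
    branches: [master]
  pull_request:

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
      - uses: r-lib/actions/check-r-package@v2
```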
Also, it's good practice to have your tests be as isolated as possible from the ‘messy real world’, i.e. not relying on a working internet connection or a remote file not being deleted:
It might be better to read example data from a local file in the git repo, or to manually create the data in the test program itself, like what you did with fake_raw_binary_data below (that's usually what I prefer). That way your tests only fail (or stop failing) because of changes to your code, not because of outside factors like working on a train ride.
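(For illustration, a fully self-contained test could look like this — the column names, the binarise() call's behaviour, and the placeholder assertion are all invented here, since they depend on the real data shape:)

```r
library(testthat)

test_that("binarise works on manually created data", {
  # Data created right inside the test: no internet connection
  # or external files required.
  fake_raw_binary_data <- data.frame(
    Language_ID  = c("abcd1234", "efgh5678"),
    Parameter_ID = c("GB020", "GB020"),
    Value        = c("1", "0")
  )

  outcome <- binarise(fake_raw_binary_data)

  # Placeholder assertion -- replace with whatever property
  # you actually expect to hold.
  expect_equal(nrow(outcome), 2)
})
```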
Sure, I can do that if we want that. I just wrote these scripts quickly for myself without testthat to see that the code worked the way I expected.
@SimonGreenhill do you want all functions to have a formalised testthat workflow as Johannes suggests?
@johenglisch does the binarise-function work the way you expect it to, based on our chat the other day?
yes please! it'll make life better in the long run
Alright, I'll try and find time later this week to do this.
I'm working in this branch now. I can't turn this PR into a draft, but if I could I would and turn it back when I'm done :)
Alright @johenglisch and @SimonGreenhill, I've made a stab at a testthat for binarise.
The tests look good. Just a tiny hint: You don't need to put the library(...) calls inside the individual test cases.
Like, as a general rule it's very good that the tests don't share any variables etc. among each other. We like every test to fail/succeed on its own, independent of other test code. But there are things that are considered ‘fair game’ for tests to share (esp. library imports and the occasional constant):
library(...)
library(...)
test_that("...", {
...
})
test_that("...", {
...
})
test_that("...", {
...
})
It just makes writing/changing tests a bit less tedious for you.
Thanks! I had those in because they were necessary when I was trial and erroring at one stage, but of course happy to remove now.
@johenglisch does the binarise-function work the way you expect it to, based on our chat the other day?
@johenglisch besides the tests, does the function work the way you expect?
does the function work the way you expect?
These tests here look exactly like your notes on the whiteboard, so I'd say it's working:
@SimonGreenhill I'm gonna go ahead and re-write binarise now as per your email
i removed the need for stringr
@SimonGreenhill I'm done with tests/testthat/test-binarise.R and R/binarise.R.
Turning to language_level_df now
@SimonGreenhill I suspect something unfortunate has happened, because I made changes to language_level_df to take only long tables this week, but I think what got merged into here doesn't reflect that. i'm gonna do some git detective work and double check things.
Yep, I can confirm the language_level_df script in this branch is older than in the branch in #3. I'll go ahead and do the specific pull-ins.
Don't think I changed anything vital in that file beyond adding comments so you should be ok to just overwrite
@SimonGreenhill I have moved content over from #3 so that the function language_level_df again operates with long tables rather than wide. I think I've addressed all of your feedback on that function now and I made some other smaller improvements.
@SimonGreenhill binarise and language_level_df are ready for review again.
Function that binarises grambank data correctly from a cldf values table