Open zkamvar opened 5 years ago
Sorry, the review process starts here: https://devguide.ropensci.org/reviewerguide.html
I was thinking it might be best to conduct the code review on the tree
branch? It's up to date with master
and largely consists of the new functions vis_ttree, vis_ggplot and a bunch of helper functions. The big changes to the existing code base are in vis_epicontacts (to ensure consistency with the other plotting functions) and other plotting functions - I can look over all of these. All of the remaining functionality (clustering, etc.) aren't changed. Thoughts?
I don't think that would be a good idea because I think ttree needs its own thorough code review in addition to the current core functionalities of the master branch. It's better to nail down the core functionalities first so that we aren't chasing down bugs while considering new features.
OK fair enough let's stick with master for now then!
Here is the results of goodpractice:
> goodpractice::gp()
Registered S3 method overwritten by 'httr':
method from
as.character.form_file crul
Preparing: covr
Preparing: cyclocomp
✔ checking for file ‘/tmp/RtmpHcSv9P/remotes442d27d9deba/epicontacts/DESCRIPTION’ ...
─ preparing ‘epicontacts’:
✔ checking DESCRIPTION meta-information
✔ checking vignette meta-information
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘epicontacts_1.1.1.tar.gz’
* installing *source* package ‘epicontacts’ ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (epicontacts)
Adding ‘epicontacts_1.1.1_R_x86_64-pc-linux-gnu.tar.gz’ to the cache
Preparing: description
Preparing: lintr
Preparing: namespace
Preparing: rcmdcheck
── GP epicontacts ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
It is good practice to
✖ write unit tests for all functions, and all package code in general. 86% of code lines are covered by test cases.
R/colors.R:64:NA
R/colors.R:65:NA
R/colors.R:66:NA
R/colors.R:67:NA
R/internals.R:36:NA
... and 87 more lines
✖ omit "Date" in DESCRIPTION. It is not required and it gets invalid quite often. A build date will be added to the package when you perform `R CMD build` on
it.
✖ use '<-' for assignment instead of '='. '<-' is the standard, and R users and developers are used it and it is easier to read your code for them if you use
'<-'.
R/graph3D.R:111:23
R/subset_clusters_by_size.R:68:16
R/subset_clusters_by_size.R:69:16
R/subset_clusters_by_size.R:75:16
R/subset_clusters_by_size.R:81:16
✖ avoid long code lines, it is bad for readability. Also, many people prefer editor windows that are about 80 characters wide. Try make your lines shorter than
80 characters
R/get_clusters.R:3:1
R/get_clusters.R:14:1
R/get_clusters.R:15:1
R/get_clusters.R:18:1
R/get_clusters.R:24:1
... and 31 more lines
✖ avoid sapply(), it is not type safe. It might return a vector, or a list, depending on the input data. Consider using vapply() instead.
R/graph3D.R:146:17
✖ avoid 1:length(...), 1:nrow(...), 1:ncol(...), 1:NROW(...) and 1:NCOL(...) expressions. They are error prone and result 1:0 if the expression on the right
hand side is zero. Use seq_len() or seq_along() instead.
tests/testthat/test_get_clusters.R:115:30
tests/testthat/test_get_clusters.R:136:30
✖ fix this R CMD check NOTE: Namespace in Imports field not imported from: ‘colorspace’ All declared Imports should be used.
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).get_pairwise()
does not describe what it returns. Thus, it's not clear how the serial interval is derived in the first example unless you look at the code.[.epicontacts
: x[k = FALSE, j = FALSE]
removes everything but the ids in the linelist... I don't think this is expectedthin()
documentation doesn't explicitly say what it returns. Additionally, there are no examples.summary.epicontacts()
has documentation, but no examples and it doesn't describe what is summarizedas.igraph()
doesn't work if igraph is not loaded. get_clusters()
this needs to describe how the clusters are obtained.I see the ebola_sim
linelist used quite a lot. It would be good to define a
small test data set with some known pathologies (contacts without corresponding
linelist components, etc) that we can use for testing known quantities. We also
should get rid of the helper that auto-loads outbreaks. A lot of the tests
additionally seem to repeat a lot of the data construction steps (I believe
because they are set up as part of skip_on_cran()
. There is no need to forbid
all tests on CRAN. Some of them are okay to have.
as.igraph.epicontacts()
: OKcolors
: OKclusters
: the first test of as.igraph.epicontacts is unnecessary. I am unsure of what the second test is doing (what is a net node?). Neither of the get_clusters()
tests actually test that the clusters are cromulent. get_degree()
: first test should read "both equals sum of in and out".get_id()
: I think these first tests need some explanation as to what is going on with the validation.get_pairwise()
: The first test needs to be updated to testthat standards. The "different types of attributes" tests need clarification on what they are testing forgraph3D()
: This takes a long time to run. Use a smaller data sethandling
: This needs more stress tests and more descriptive tests. A custom test handler might be useful here.make_epicontacts
: The first test needs to have more description to confirm that the process actually worksplot.epicontacts()
: First test has the expectations commented out. Testing plots are always a bit of a PITA. This could benefit from a smaller dataset. Additionally, you can test against the internal workings of the resulting plot. The error expectation needs a specific error message to test against.print.epicontacts()
: This is effectively useless. It's not tested on travis or CRAN.subset()
: last bit needs to move away from the stored objects. The last test suite should be split up into smaller chunks because it goes all over the place.subset_clusters_by_x
: These need to be more specific and targeted. The clustering bit is the only thing that may be uncertain, so we need to make sure we have a data set that can actually test these. summary()
: Needs to get rid of the stored object comparison and hard-code the expected values. thin()
: OKvis_epicontacts()
: see plot.epicontacts()
I've implemented some changes on a code_review branch (see ea484306d8d582f2a11b6c65f9c256c10bca9e47)
as.igraph.epicontacts
colors
get_clusters
get_degree
vapply
counting approach is very slow – perhaps switch to table
get_id
get_pairwise
rev
argument that calculates to → from graph3D
thin
handling
internals
make_epicontacts
plot
print.epicontacts
print.summary_epicontacts
subset_clusters_by_id
subset_clusters_by_size
subset.epicontacts
summary.epicontacts
thin
vis_epicontacts
test_as.igraph.epicontacts
test_colors
test_get_clusters
test_get_degree
test_get_ID
test_get_pairwise
test_handling
test_make_epicontacts
test_plot.R
test_subset_clusters_by_ids
test_subset_clusters_by_size
test_subset.epicontacts
test_summary.epicontacts
test_thin
test_vis_epicontacts
This is based on the rOpenSci code reviews: https://devguide.ropensci.org/
@zkamvar and @finlaycampbell are assigned.