Closed hanecakr closed 8 months ago
I think you need to paste this as a reply to this issue: https://github.com/ropensci/software-review/issues/618
oops! Can I remove this issue?
I'll just close it, no worries!
Again a big thank you for your review report and time. I was not able to address all issues raised earlier, but have now found some time to work on the package and to provide an answer to your comments and suggestions. I've copy-pasted the review report and inserted my replies below.
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
I have added you as a reviewer in the DESCRIPTION
Documentation
The package includes all the following forms of documentation:
The opening paragraphs of the README are good, and I think that this R package solves a challenging problem, so firstly, well done! I think it could be made a little bit clearer in terms of the problem it solves, and the input it takes. While I find the photos useful, they initially made me think that this software takes images as input. I would suggest something more like what is in the vignette to start:
Then, describe the problem you want to solve, which I think is estimating when the timber was cut down. Then show the data, explain what the columns mean, and how this might be a typical example of dated tree-ring series data.
Then show a short example of the output, clearly demonstrating the problem the package solves.
The rest of the first paragraph:
Is important, but I think could go into more of a methods/general introduction part of the README, perhaps further down.
I'm not sure what the images show me, and so to communicate this effectively I think they should contain a caption.
I think the target audience could be more clearly stated in the README. Perhaps at the end of the first paragraph.
README has been rewritten according to comments of both reviews.
The 'Get started' vignette provides more detail and examples.
All installed well for me!
It did run successfully locally!
`T` and `F` should be specified as `TRUE` and `FALSE`.
Now TRUE and FALSE are used consistently
The examples ran without error, using:
`URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
There are no community guidelines in the README, I see them in the file `.github/CONTRIBUTING.md`, but these are not linked to in the README. Once these are linked, e.g., by writing something like:
Community guidelines and code of conduct have been added
Functionality
All tests pass - unit tests seem to have quite good coverage, evaluated using `devtools::test_coverage()`.
[ ] Packaging guidelines: The package conforms to the rOpenSci packaging guidelines.
package name passes checks on `available::available("fellingdateR")`
I think if possible the author should consider renaming the package to all lowercase, `fellingdater` or `fellingdatr`.
Not sure what is the best way to do this. Any practical guidelines?
There are other considerations that I think mean it does not currently conform to the rOpenSci packaging guidelines. Rather than discuss them in too much depth here, I will put them in the review section below.
Estimated hours spent reviewing: 5
[x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
You have been added as reviewer in the DESCRIPTION
Review Comments
I wanted to open by saying that while I have a lot of feedback, I think that this is a great piece of software that helps solve a tough problem, so well done on the author for writing this! I hope that the feedback is useful 😄 . Please let me know if something is not clear or if you need help implementing these, or further information. Thank you for submitting this software, I enjoyed reviewing it.
General comments
There are a fair few examples from the rOpenSci packaging guide, which I don't think are followed, I have gone through the guide and written some examples here. After the author makes these changes, I would recommend they double check the guide.
Recommend making sure all functions and objects use snake_case.
All code is in snake_case now
Except for the `read_fh()` function. In the fellingdateR package I build upon the original code of the read.fh function from the dplR package. I would prefer to stay as close as possible to the original code in the dplR package in order to facilitate future cooperation and possible integration of both functions.
argument name uses `x` for most data frame inputs. I would recommend considering naming data things `data` or `.data` or similar to help distinguish them from a vector, `x`. Not required but worth considering, I think.
There is some use of `cat` in the package, recommend using `cli` as described in the tidyverse style guide on writing error messages. I expand on this below.
cat() no longer used (except for read_fh - see comment above)
Code style is not consistent, there is mixed use of the number of indentations: between 0 and 8 spaces. I would recommend applying the tidyverse style guide to the package with `styler::style_pkg()`
Code has been restyled using the styler-package
Indenting code is important but this 8 space indentation is not consistent with other indentation used in your package, and when reading the code gives the impression that the code is happening inside some/several if/else/for control statements. I would recommend applying a style guide such as the tidyverse style guide or similar, to the code, so that indentation is consistent.
Due to indentation, a lot of lines of code go over 80 characters. I think it is worth the time to re-indent, or rewrite some code by using explaining variables, so the code doesn't go over 80 characters
Code has been restyled using the styler-package
`=` is sometimes used over `<-` - I recommend using `<-` consistently.
`<-` is now used consistently as assignment operator
There is no top level documentation for `?fellingdateR` - this could be achieved using `usethis::use_package_doc()`.
This has been added
package should use a website. See the ropensci guide on building a website
see https://hanecakr.github.io/fellingdateR
internal functions, like `d.dens` and `d.count`, should have a `#' @noRd` tag to mark them as internal functions
These functions now reside in `helper-functions.R` with `#' @noRd`
examples in code should use all argument parameters
Most examples now include all arguments
Recommend the author reads through the CRAN gotchas
What does the `sw` stand for in things like `sw_combine` and co?
sw = [s]ap[w]ood, fd = [f]elling [d]ate. I've made this more obvious in the README introduction.
Some of the documentation uses reversed backticks, which I haven't seen before, e.g.:
´n_sapwood´ and ´count´
corrected
There are still a few lines of code that don't pass the `goodpractice::gp()` checks. In particular, I think these comments are important:
All code has been styled with the styler package; `<-`, `TRUE` and `FALSE` are now used consistently, and the length of some functions has been reduced by implementing some helper functions, e.g. for checking input. Use of `sapply` and `1:length()` has been avoided.
You interchange between using `=` and `<-` in your code. I would recommend using `<-` only. See for example in `cor_table.R`:
`=` no longer used as assignment operator
Error messages. I would recommend building input checking functions to assist in how you write up error messages. There are a few key benefits to this:
Using `cli` to build the error messages allows you to use `glue` strings, so you don't have to try and quote or inject other information into the message string, it should be easier to add details you care about.
`check_input()` is now one of the helper functions in `helper-functions.R`.
Error messages are hard to write well, and it's great that you've included some good input checking! I think you could make the error functions a bit better for the user by following the tidyverse style guide on error messages.
explaining variables. I've mentioned this a few times in the other functions in your package. I think it would be worthwhile searching through your cases of using `if`, and if there is a long conditional in there, e.g.,
`any(pdf_matrix[, 2:length(keycodes) + 1] == 1, na.rm = TRUE)`
Then I think it would be worthwhile either writing a small wrapper function to identify this, or wrap that up in an explaining variable.
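A minimal sketch of the explaining-variable idea, applied to the condition quoted above. The objects `keycodes` and `pdf_matrix` are made-up stand-ins, and the names `series_cols` and `any_prob_equals_one` are hypothetical; what the condition means inside the package is an assumption here.

```r
# Hypothetical stand-ins for objects that exist inside the package function.
keycodes <- c("series_a", "series_b", "series_c")
pdf_matrix <- cbind(
  year     = 1001:1004,
  series_a = c(0, 0.2, 0.8, 0),
  series_b = c(0, 0, 0.4, 0.6),
  series_c = c(NA, 0, 0, 1)
)

# Same column selection as in the quoted condition, now with a name.
# `:` binds tighter than `+`, so this evaluates to (2:3) + 1, i.e. 3 and 4.
series_cols <- 2:length(keycodes) + 1

# Explaining variable: the name says what the condition is asking.
any_prob_equals_one <- any(pdf_matrix[, series_cols] == 1, na.rm = TRUE)

if (any_prob_equals_one) {
  message("At least one selected column contains a value of exactly 1.")
}
```

Naming the column selection separately also makes the operator precedence visible, which is easy to miss when the whole expression sits inside `if ()`.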
`plot = TRUE` as a function option.
`ggplot`, and so you can specify an `autoplot` method or a separate `plot_<function>` command.
`sw_interval`, the following information should be given from the function: `n`, `hdi`, and the number of sapwood rings.
All numerical information needed to build the plots can be found in the output of the sw_model(), sw_interval(), sw_combine() and sw_sum() functions. Their plot argument defaults to plot = FALSE. So the output of e.g. sw_combine(trs_example1) can be used as the input for sw_combine_plot()
consistent file names. Some of the files have camelCase names (`movAv.R`), others are snake_case. I would recommend sticking to a consistent naming scheme, snake_case.
all snake_case now
I would try and avoid having `else` statements contain errors/stops/warnings/messages. This is because in order to understand the message at the end, you need to then walk back up through the condition of logic beforehand. The way to avoid this is to clearly state the error condition at the top.
Defensive programming has been implemented now, avoiding the use of `else` statements followed by a stop/error-message.
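To illustrate the point about stating error conditions at the top, here is a small sketch. `sapwood_share()` is a hypothetical function written for this example and is not part of fellingdateR; base `stop()` is used only to keep it dependency-free.

```r
# Hypothetical helper, not taken from fellingdateR.
sapwood_share <- function(n_sapwood, n_rings) {
  # Error conditions come first, each with its own message.
  if (!is.numeric(n_sapwood) || !is.numeric(n_rings)) {
    stop("`n_sapwood` and `n_rings` must both be numeric.", call. = FALSE)
  }
  if (any(n_sapwood > n_rings, na.rm = TRUE)) {
    stop("`n_sapwood` cannot exceed `n_rings`.", call. = FALSE)
  }

  # Past the guards, the main computation needs no `else` branch.
  n_sapwood / n_rings
}

sapwood_share(n_sapwood = c(10, 15), n_rings = c(100, 60))
```

Each message sits directly under the condition that triggers it, so a reader does not have to trace back through nested branches to see why an error was raised.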
Input checking
I would recommend writing small helpers for input checking, and considering using `cli` to help write error messages, as it means you could transform this:
Into:
And that code could look like this:
Similarly,
Could be written as a function:
Admittedly, I do have a strong preference for writing these types of functions, having written about it recently, but I do think that at least using explaining variables, which you've already done in places like:
Are a great idea, and there are a few notable places where that would help make the code a bit easier to read, e.g.,
`check_input()` is now one of the helper functions in `helper-functions.R`
smaller checks for input values are now available as a helper-function.
Most examples you give above are from the read_fh() function. See my previous motivation for why I would like to stay close to the original dplR::read.fh() code.
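As a rough sketch of what small input-checking helpers can look like: the two functions below are hypothetical and are not the package's `check_input()`. They use base `stop()` so the example runs without extra packages; `cli::cli_abort()` could take its place if `cli` is a dependency.

```r
# Hypothetical helpers; the names and messages are illustrative only.
check_numeric <- function(x, arg = deparse(substitute(x))) {
  if (!is.numeric(x)) {
    stop(
      sprintf("`%s` must be numeric, not <%s>.", arg, class(x)[1]),
      call. = FALSE
    )
  }
  invisible(x)
}

check_no_missing <- function(x, arg = deparse(substitute(x))) {
  n_missing <- sum(is.na(x))
  if (n_missing > 0) {
    stop(
      sprintf("`%s` must not contain missing values; found %d.", arg, n_missing),
      call. = FALSE
    )
  }
  invisible(x)
}

# Each condition is checked on its own, so the message names the actual problem.
n_sapwood <- c(12, 8, NA)
result <- tryCatch(
  {
    check_numeric(n_sapwood)
    check_no_missing(n_sapwood)
  },
  error = function(e) conditionMessage(e)
)
result
```

Keeping one condition per helper means a vector that is numeric but contains `NA` is reported as having missing values, instead of a combined "not numeric or missing" message.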
cor_table.R
Refactoring `values` argument of `cor_table`. There is a lot of input checking for the `values` argument. I think that things such as:
And so on indicate to me that these could be written up as separate functions, which could return a list of their inputs, perhaps. These could then be delivered using `switch`, which I often forget how to use, but it would be something like:
Examples should demonstrate all types of the inputs for the function arguments.
Parameter `values` was removed from the function. Looking back, this is not an option that would be used frequently, and it is certainly not required. Removing it from the function shortens the code a bit and avoids a lot of the otherwise necessary checks.
data.R
I would recommend standardising the dataset names to be all lowercase, so that they are easier to remember. E.g., `Sohar_2012_FWE_c` becomes: `sohar_2012_fwe_c`
The datasets include names of authors. The names of the datasets can be easily copied from sw_data_overview()
fd_report.R
I think that `fd_report` could be renamed `felling_report` or `felling_date_report` or similar. While `fd` is concise, I think it doesn't help facilitate discoverability of the functions.
Similar to `cor_table.R`, I think that:
Could be rewritten as `check_if_variable_exists()`. Something like:
The `check_input` function is now part of `helper-functions.R`
get_header.R
This function should move the `cat` message up the top - and should not use `cat`, instead using one of the `cli` functions, like `cli_abort`.
I think you could use `structure` instead of setting attributes to NULL:
Although I think that they are functionally the same, so feel free to ignore!
cat() no longer used
hdi
This function uses `=` and `<-` - suggest sticking to just `<-`
`=` no longer used, in favour of `<-`
movAv
I think this starting chunk would be clearer if only `if` and not `else` is used.
The stop error can move to the top of this, so we clearly capture if `align` is not "center" or "right" or "left". This makes it easier to understand the conditions of error.
I suggest using another explaining variable inside `mean`:
As that `mean` statement is a bit involved to unfurl.
Similarly, the pattern, `if (edges == "fill") {` and `} else if (edges == "nofill") {` should be bundled up into a function and applied with `switch`
Checks for edges and fill are now on top of the script. Else statements have been avoided.
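For illustration, a simplified moving average that validates `align` and `edges` at the top with `match.arg()` and dispatches with `switch()`. This is a sketch and not the package's implementation: `mov_av_sketch()` is a hypothetical name, and the meaning given to "fill" (average over the part of the window that exists) and "nofill" (return `NA` where the window is incomplete) is an assumption.

```r
# Hypothetical, simplified moving average; not the fellingdateR code.
mov_av_sketch <- function(x, w = 3,
                          align = c("center", "left", "right"),
                          edges = c("fill", "nofill")) {
  # match.arg() raises an error straight away for any other value.
  align <- match.arg(align)
  edges <- match.arg(edges)

  n <- length(x)

  # Window positions relative to the current index, chosen by `align`.
  offsets <- switch(align,
    center = seq(-((w - 1) %/% 2), length.out = w),
    left   = seq(0, length.out = w),
    right  = seq(-(w - 1), length.out = w)
  )

  vapply(seq_len(n), function(i) {
    idx <- i + offsets
    in_range <- idx >= 1 & idx <= n
    window_is_complete <- all(in_range) # explaining variable

    switch(edges,
      fill   = mean(x[idx[in_range]]),
      nofill = if (window_is_complete) mean(x[idx]) else NA_real_
    )
  }, numeric(1))
}

mov_av_sketch(1:5, w = 3, align = "center", edges = "nofill")
```

With `match.arg()` the allowed values live in the function signature, so the valid options and the error for an invalid one are both visible before any computation starts.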
read_fh.R
`dplR` directly, and as such there are small style changes. I think it is worthwhile updating the code style to fit within your package.
`header.taken` should be `header_taken` etc.
I have found that moving comments either into documentation or into issues to help track them is helpful, but I appreciate that sometimes it is best to leave them in the code, but just something that might be worth thinking about :)
Tidying up the error messages in this function would make some of these nested if/else clauses easier to understand.
This is a pretty massive function, a bit over 1200 lines of code. I would recommend breaking down the steps inside this into smaller functions, as this will make the code easier to reason with and maintain in the future.
In the fellingdateR package I build upon the original code of the read.fh function from the dplR package. I would prefer to stay as close as possible to the original code in the dplR package in order to facilitate future cooperation and possible integration of both functions.
I removed all unnecessary comments as they were highlighting sections where I've made changes to the original code.
dplR::read.fh() concentrates on extracting the measurement data. The fellingdateR::read_fh() function also extracts the descriptive (meta-)data from the HEADER fields in a .fh file. This is not possible with the dplR::read.fh function.
Furthermore, the fellingdateR::read_fh function can read data in CHRON or HALF-CHRONO format.
read.fh() also throws errors when header fields include capital letters (this depends on the software used to produce the .fh files: TSAP, PAST, ...). read_fh() is case-insensitive.
sw_combin_plot.R
This is the first time I've seen `############` comment blocks - I'm all for stylistic choices but I am not sure this is needed, especially if this isn't used in other functions.
comment blocks with #### removed
I've not seen this pattern to avoid R CMD Check notes before
My tactic has always been to have a separate definition of these, as answered by Carson Sievert on the posit community page. I don't think there's anything inherently wrong with that, but I could imagine that in some cases this could accidentally erase inputs. Something to be aware of, perhaps?
When I run devtools::check() I get
Assigning NULL to these variables avoids the notes, as described in R Packages (2e): https://r-pkgs.org/package-within.html#delta-a-failed-attempt-at-making-a-package
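A small sketch of the NULL-assignment pattern and of one way it can hide a problem. `rings_after()` and the example data are hypothetical, and base `subset()` stands in for whatever non-standard evaluation the package function actually uses.

```r
# Hypothetical function; `subset()` looks up `year` as a column of `data`.
rings_after <- function(data, cutoff) {
  # A local binding so that R CMD check finds a definition for `year`.
  year <- NULL
  subset(data, year > cutoff)
}

series <- data.frame(year = c(1001, 1002, 1003), width = c(1.2, 0.9, 1.1))
kept <- rings_after(series, cutoff = 1001)

# Side effect to be aware of: if the column is absent, `year` silently
# resolves to the local NULL instead of raising "object 'year' not found".
no_year_column <- data.frame(width = c(1.2, 0.9, 1.1))
silently_empty <- rings_after(no_year_column, cutoff = 1001)
nrow(silently_empty)
```

Base R also provides `utils::globalVariables()` for declaring such names in one place; which of the two approaches suits this package better is a judgment call.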
I am all for using the new base R pipe `|>` - however you need to update your Depends in your DESCRIPTION like so in order to use it, since it only came out in R 4.1.0:
This comment should probably live in a github issue or just be removed:
these comments are removed
sw_combine.R
This error should check each of the conditions separately - either it has missing values, or it is not numeric.
A check_input() function (in helper-functions.R) now takes care of the input
sw_data_info.R
I think these error messages would benefit from using `cli`, as discussed above.
sw_data_overview.R
This is a nice function to include to facilitate data discovery
sw_interval_plot.R
This code
Could be rewritten as an error function or the condition in `if` could be expressed as a function.
sw_interval.R
In the final line of documentation for this function there is a hanging sentence:
Well spotted! Corrected.
sw_model.R
Great to see input checking at the top of the function - I do think these should be rewritten as check input functions.
Helper function `d.count` I think should be put into a separate R file called `utils.R` or `helpers.R`
`d.count` should use `switch` pattern and pass functions rather than using `if` controls.
`d.count` should be `d_count`
check_input() and d_dens() (instead of d.count) are now part of helper-functions.R
sw_sum_plot.R
indentation in this code is not consistent - recommend applying a style guide.
Examples should show different variations possible for function arguments. E.g., `bar_col`, `spline_col`, `dot_col`, and `dot_size` should all be specified in the examples so the user can see what the input should/could be.
Examples have been updated with more visibility for the different parameters.
sw_sum.R
See note above on including plots.
tests
testthat::
`vdiffr` for testing ggplot plots. See visdat for examples