Open YueJiangMDSV opened 4 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).For packages co-submitting to JOSS
- [ ] The package has an obvious research application according to JOSS's definition
The package contains a
paper.md
matching JOSS's requirements with:
- [ ] A short summary describing the high-level functionality of the software
- [ ] Authors: A list of authors with their affiliations
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Estimated hours spent reviewing:
2 hours
[x] Should the author(s) deem it appropriate, I agree to be acknowledged as a package reviewer ("rev" role) in the package DESCRIPTION file.
test-stat_summary.R
uses arbitrary values as input instead of objects created by your functions. The package would be more stable if you try to test with the output of your function objects as input.
It would be better if the authors can add default arguments to the functions to make them more convenient. samplingsimulatorr
differentiate from the basic function by shortening the process. Thus providing default arguments would add more values to the package.
The input types are not controlled nor specified in functions. For example, argument n_s
in draw_samples()
can accept double values instead of an array. In stats_summary
the argument parameter
the documentation doesn't tell what type of input it takes.
The error messages were unclear to me in general. It would be better if the functions notify the users about the correct type of inputs. I could not figure out how stats_summary()
works, although I followed your examples.
In generate_virtual_pop
documentation, description about dist
seems a bit unclear. For example, dnorm()
, dpois()
will not work although it is a function of a distribution. Specifying a list of possible functions that you can use for dist
in the error message would be super helpful
In draw_samples()
, the names of the columns were a bit confusing. replicate
, size
, rep_size
were not very intuitive. Instead, names such as sample group
, sample size
or using the same argument name reps
and n_s
would be more informative.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).For packages co-submitting to JOSS
- [ ] The package has an obvious research application according to JOSS's definition
The package contains a
paper.md
matching JOSS's requirements with:
- [ ] A short summary describing the high-level functionality of the software
- [ ] Authors: A list of authors with their affiliations
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Estimated hours spent reviewing:
Congratulations to all team members on writing this package. Very nice. It reminded me of DSCI 552! You did a good job with writing tests, code coverage and generating the documentation site. Nice touch with the additional Example Usage Scenario section. I've completed my review and have some observations and recommendations to share with you that you may find useful.
1.1.0
but the documemtation show version 0.0.0.9000
. Consider updating the version number in your DESCRIPTION
file.dplyr
is currently at version 1.4.2
so you could test version 1.4.0
and enforce (>= 1.4.0)
or maybe even go back a minor version to (>= 1.3.0)
.?generate_virtual_pop
, the author name is appended to the description of the function. This doesn't seem to be the norm with other packages. I'd consider removing the author line in the source file.plot_sample_hist
: It wasn't clear that this function specifically uses replica 1
for the plot until I looked at the source code. I was curious about this and could not find it in the documentation.After drawing samples with draw_samples
, it was clear what the purpose of the rep_size
column would be used for. I understand that size
can take on different values but it seems that rep_size
will only have a single value equal to reps
. If there is a reason for keeping this column around, perhaps it should be documented.
For the draw_samples
function, the parameter name n_s
takes a while to get used to. I'd suggest a more user-friendly name like sample_size
.
I ran devtools::check()
and noticed the following:
create_sampling_hist: no visible global function definition for ‘quantile’
.
.
.
Consider adding
importFrom("stats", "quantile")
to your NAMESPACE file.
This can be avoided by invoking quantile
in create_sampling_hist
function with stats::
, i.e. stats::quantile
. This will avoid namespace ambiguity and get rid of some of the weird messages. I believe other similar messages are also namespace related.
In test-draw_samples.R
, the very last line test_draw_samples()
is unnecessary and causes the above test to run twice.
draw_samples
: I mistakenly called this function with reps = c(...)
and got back an R warning not generated by the package. An additional check to ensure this parameter is numeric would help fool-proof it from users like me! ;)
create_sample_histogram
: Some comments in the code don't seem to match the actual code:
# for each sample size, generate tidy data needed for histogram plots
: This says it's generating the tidy data but it's actually creating the plots.# create list of sample histograms
: This part only generates the True Population histogram.plot_sampling_hist
reps
parameter is not used; consider removing.I was not able to successfully use stat_summary
. I belive there is something wrong with the code around here:
35: sub_pop <- population[[names(population)]]
36: sub_samples <- samples[[names(samples)]]
I hope these suggestions have been helpful. Congratulations again on a job well done!
Thank you @ryanhomer and @agdal1125 for your reviews and feedback! Please see our release v.2.0.0 for the updates we have made based on your recommendations.
The following changes have been made based on your feedback:
test-stat_summary.R
to use the output of our function objects as inputdraw_samples
and stat_summary
functions to make them more user-friendlygenerate_virtual_pop
to include examples of how to pass in distributions as arguments (for example dnorm
will not work but rnorm
will)reps
parameter from the plot_sampling_hist
function and added importFrom("stats", "quantile")
to the NAMESPACE
file to avoid weird warningsplot_sample_hist
function to explain that it uses replica 1
create_sample_histogram
to match the code, as suggestedn_s
to sample_size
throughout to make it less ambiguous.plot_sampling_hist
function to make sure the input df has the exact column names needed.Thank you again for all your work on reviewing our package. We really appreciate your time and effort!
Submitting Author: Holly Williams(@hwilliams10), Lise Braaten(@lisebraaten), Tao Gup (@tguo9), Yue (Alex) Jiang (@YueJiangMDSV ), Repository: samplingsimulatorr Version submitted: 1.1.0 Editor: Varada Kolhatkar (@kvarada) Reviewer 1: Ryan Homer (@ryanhomer) Reviewer 2: Jaekeun Lee (@agdal1125) Archive: TBD
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
This package is intended to assist in teaching and/or learning basic statistical inference by allowing users to generate virtual populations to compare and contrast sampling vs sample distributions and parameters.
Who is the target audience and what are scientific applications of this package?
The target audience is instructors and/or students teaching or learning basic statistical inference.
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
To the best of our knowledge, there is currently no existing R package with the specific functionality to create virtual populations and make the specific sample and sampling distributions described above. We do make use of many existing R packages and expand on them to make very specific functions. These include: built-in r distribution functions such as
rnorm
to sample from distributionsrep_sample_n
to generate random samples, andggplot2
to create plotsN/A
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
JOSS Options
- [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: - (*Do not submit your package separately to JOSS*)MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)Code of conduct