Open johnwslee opened 2 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 1.5 hours
plot_hist_by_cond
. In contrast, you specify the names of the parameters instead in the markdown text instructions for the functions show_summary_stat
and plot_line_by_date
(i.e. plot_line_by_date(startDate, endDate)
). Not a major issue, but an inconsistency that you may want to address. viridis
colour palette, or simply choose basic colours for the plots? Maybe it's possible to allow the user to filter for a specific region they want to observe in the plot_line_by_date
function? I'm sure your group has thought of some creative implementation ideas and understandably you can only do so much for a package within a given time frame. These are merely features that popped up in my head when playing around with your functions.Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Functionality
Estimated hours spent reviewing: 1.5 hours
Review Comments
plot_hist_by_cond
but on the y-axis in plot_line_by_date
, which may be confusing to users. The standalone graphs are perfectly understandable, but it would be ideal to have consistency between the two graphs.region = "all"
is an optional argument in the plot_line_by_date
function. The vignette and function documentation show this, but its not clear just from the READMEplot_hist_by_cond
function is easy to use and interpret. Is there a particular reason why the condition argument only accepts Region and Age? Although there are less categories, it could still be informative to plot histograms for Sex or Classification. plot_line_by_date
and show_summary_stat
functions are well written and executed. A future improvement to the plot_line_by_date
function could be to allow the option to plot the total cases in BC over time, instead of plotting separate lines for each region. show_summary_stat
would be even more useful if it accepted an input argument that allowed the user to calculate the summary statistics of a specific subset of the data, such as only for a certain region or age group.Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 2h
I like all the functions in this package and our group is actually written something similar. And this is something suggested by TA when our group is facing the same issue: when one function is working on downloading data, it is not suggested to use this function inside the other function to download data again. Instead, downloading data and saving it as a data frame, and then passing that data frame to do plotting is recommended. I think this might be because we don't want to download data multiple times, which takes space and time.
Noticed that the plot_line_by_date function is a bit hard to understand with many lines, I am wondering if it is possible to use direct labelling. In that case, it is clearer for a user to understand which line represents which region.
As the previous reviewer suggests, the loading of the dplyr package seems weird. I try to load this package only and not dplyr, and all the functions seem to work fine for me (not a full test though). What is the particular reason to load it separately? Maybe specify the reason in the readme, or remove the line?
For a test of plot_line_by_date(1, 31), I was expecting to get an error message of 'startDate' should be character.
However, what I get is this: Error in charToDate(x) : character string is not in a standard unambiguous format
I think this is because this block in plot_line_by_date
:
if (is.na(as.Date(startDate, "%Y-%m-%d")) || is.na(as.Date(endDate, "%Y-%m-%d"))){
stop("Incorrect date format, should be YYYY-MM-DD")
}
is before this:
if (!is.character(startDate)) {
stop("'startDate' should be character.")
}
It tries to convert the date before checking if it is a character or not. So maybe the order should be reverted.
This also results in another concern: Although all of the expect_error lines pass, it might not be the expected error message. I would recommend using this format to check the error messages, so that the previous problem could be noticed.:
expect_error(plot_line_by_date(1, 31), "'startDate' should be character.")
A tiny point: from a user's perspective, I would recommend always including the function parameter names in the README Usage. In that case, users could get some hints from the parameter names; or if they want to switch the order of the parameters, they can easily figure out how to do it. Although they could also get this information from the documentation, they may be confused when the first time they read the README. Again, this is just a very tiny issue.
One additional thing, the copyright holder in the license should be individual names I think, not 'rbccovideda authors'.
Despite some trivial and nitpicky things and some updates needed for testing, I think this package is well done! The functions are pretty clear and useful.
The package includes all the following forms of documentation:
URL
, BugReports
and Maintainer
(which may be autogenerated via Authors@R
).Estimated hours spent reviewing: 2h
This is a very interesting package on trending topics, which can definitely be useful at certain levels. Most of the issues and comments have been addressed by fellow reviewers, as I would try to not repeat what others have mentioned, my apologies if there is any overlapping. Here are some of my comments regarding this package:
show_summary_stat
, constant declaration and validation tests are all fine and good, but the code for getting each summary value might be redundant. In order to follow the DRY principle, it might be better to have a separate helper function within the script and use that to extract each summary statistic value.plot_line_by_date
, very detailed messages for the parameter checking and great organized code, but I am not sure why the mask
variable is needed as it just gets reassigned with the plot_data
variable later, but this is definitely not a major issue though. Another part that I am unsure about is whether the Reported_Date
variable within the plot_data
variable is sorted since to get the min and max date from it, one seems to be assuming it is and the other used the max
function.plot_hist_by_cond
, no problem with the checking and the checking, although I suspect that line 22 is a quick fix for some testing, it is a bit confusing to read and not sure if it is necessary. Implementation wise there might a bit redundant for plotting and using a helper function could be better, but it is debatable since you only have two conditions. However, for future integration, as more conditionals might be added, it can be a good practice to still provide another function handling the plotting and reuse the function.get_data
, a very straightforward function and many functions from the here
package were used. I would suggest that to integrate the function with better flexibility for the users, for example, instead of only allowing them to have an output folder, a clear full or partial path would be helpful and maybe even print out some current working directory information to help them decide what kind of path to store. Another thing I would probably add on to is by the end of the function before the return (well... it's probably for the best practice in R you don't have a return, but again, not a big deal) is to print out a message regarding the storage of the output file, as you are modifying the local directory outside the program.Overall it is a very awesome package, and great work Group 25!
name: rbccovideda about: simple and scalable package for EDA of Covid19 cases in BC
Submitting Author Name: Lianna Hovhannisyan @liannah, John Lee @johnwslee, Vadim Taskaev @vtaskaev1, Vanessa Yuen @imtvwy Submitting Author Github Handle: @johnwslee Other Package Authors Github handles: @liannah, @vtaskaev1, @imtvwy Repository: https://github.com/UBC-MDS/rbccovideda Version submitted: Submission type: Standard Editor: Lianna Hovhannisyan @liannah, John Lee @johnwslee, Vadim Taskaev @vtaskaev1, Vanessa Yuen @imtvwy Reviewers: Simon Guo @y248guo, Kiran Phaterpekar @kphaterp, Rong Li @lirnish, Jessie Wong @jessie14
Language: en
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
Who is the target audience and what are scientific applications of this package?
Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Explain reasons for any
pkgcheck
items which your package is unable to pass.Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
[ ] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)Code of conduct