Open merveshin opened 4 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
`URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
For packages co-submitting to JOSS
- [ ] The package has an obvious research application according to JOSS's definition
The package contains a `paper.md` matching JOSS's requirements with:
- [ ] A short summary describing the high-level functionality of the software
- [ ] Authors: A list of authors with their affiliations
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Estimated hours spent reviewing: 2 hours
When I call `?eda`, I would like the description to spell out Exploratory Data Analysis (EDA), because someone who is less data science savvy might not know what it stands for. Furthermore, the first Google search result for "eda" is "Electricity Distributors Association".
When I tried:
result <- eda(iris)
result
The plot is blank because the last column in the iris dataset is character data type. A warning message stating that the function does not work with character data would be helpful. Note that if I call `result$stat[[5]]` it works fine.
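One possible shape for such a warning is sketched below. The helper `warn_non_numeric()` is a hypothetical name used only for illustration and is not part of nurser; it assumes the plotting step can only handle numeric columns.

```r
# Hypothetical helper (not part of nurser): report columns that a
# histogram-based summary cannot plot, instead of silently drawing nothing.
warn_non_numeric <- function(df) {
  # Names of every column that is not numeric
  non_numeric <- names(df)[!vapply(df, is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    warning(
      "Skipping histograms for non-numeric column(s): ",
      paste(non_numeric, collapse = ", "),
      call. = FALSE
    )
  }
  # Return the skipped column names invisibly so callers can inspect them
  invisible(non_numeric)
}

# Usage: iris has one non-numeric column, so this emits a warning
skipped <- warn_non_numeric(iris)
```

Returning the skipped column names invisibly keeps the console output limited to the warning while still letting the caller decide how to handle those columns.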
The `nurser` package loads `rlang` 0.4.4, and this prevents me from loading `tidyverse` because it requires `rlang` >= 0.4.5. I had to restart my console, load `tidyverse` first, and then load `nurser`.
`impute_summary()` works very well. It does everything it promises it would do.
`preproc()` needs more in its description. The README states that `preproc` will "preprocess features". However, I was unsure what preprocessing meant. After digging into your function, I realized it was just normalizing the numerical columns. I would like to see a little more clarity added to the description.
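To make concrete what "normalizing the numerical columns" could mean in a description, here is a minimal sketch. It assumes z-score standardization (mean 0, standard deviation 1), which is only one common reading of "normalizing"; `standardize_numeric()` is a hypothetical name and not nurser's implementation.

```r
# Hypothetical sketch (not nurser's implementation): standardize numeric
# columns to mean 0 and standard deviation 1, leaving other columns untouched.
standardize_numeric <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  # scale() returns a one-column matrix; as.numeric() drops it back to a vector
  df[is_num] <- lapply(df[is_num], function(x) as.numeric(scale(x)))
  df
}

# Usage: the four measurement columns are rescaled, Species is unchanged
processed <- standardize_numeric(iris)
```

A description that states the transformation this explicitly (which columns are touched, and how) would answer the question raised above.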
Your package passes all your tests when I call `devtools::test()`, and your coverage is at 100%.
When I call `devtools::check()`, it wants you to declare 'magrittr' in your vignettes.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
`URL`, `BugReports` and `Maintainer` (which may be autogenerated via `Authors@R`).
For packages co-submitting to JOSS
- [ ] The package has an obvious research application according to JOSS's definition
The package contains a `paper.md` matching JOSS's requirements with:
- [ ] A short summary describing the high-level functionality of the software
- [ ] Authors: A list of authors with their affiliations
- [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
- [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).
Estimated hours spent reviewing: 1 hour
The `eda` help docs work well, but when I try to run the example in the help doc I don't get a rendered histogram. It might be good to have the output be a histogram rather than a variable saved as a histogram.
There are also 10 rows of red warnings every time I run the `eda` function from the example in the docstring. I'm not sure what they imply, but since they are unclear it would be good to either have them give a useful warning or remove them. They are shown below.
In the `eda()` example, equal signs are used instead of arrows. Since this package is supposed to work with the tidyverse in the R ecosystem, the example should probably be changed to use arrows. This is shown below:
result <- eda(mtcars)
hist_mpg <- result$histograms[[1]]
stats_mpg <- result$stats$mpg
Overall, `impute_summary` was very well done. The example ran well and all the tables generated were good.
The function example for `preproc` is redundant. Renaming `result` to `processed_X` is not necessary. If you wanted, it could be done in one line: `processed_X <- preproc(mtcars)`.
The description in `preproc` is also not useful, and it is not clear exactly what the function does.
Also, once again, the example in `preproc` should use arrows instead of equal signs, since it fits into the tidyverse ecosystem.
Thank you for your feedback @evelynmoorhouse and in response to your comments:
Addressed
Note | Response |
---|---|
eda warnings | this has been addressed |
eda correct syntax (`<-` instead of `=`) | this has been addressed |
preproc general changes | Function has been modified |
PR incorporating changes:
New Release with changes: v3.0.0
Not Addressed
Note | Response |
---|---|
eda histogram output | will be addressed in future iterations |
better function descriptions | descriptions will be updated on an ongoing basis |
Thank you for your feedback @MrThomasPin and in response to your comments:
Addressed
Note | Response |
---|---|
`?eda` clarification | Taken into consideration, although anyone using the function would have read the README and would understand what an EDA is |
eda blank character data output | Function now works with character data |
preproc function | Function has been modified |
PR incorporating changes:
New Release with changes: v3.0.0
Not Addressed
Note | Response |
---|---|
rlang version | to be addressed in future iterations |
magrittr declaration | this is just a warning |
Submitting Author: Group 24 (@merveshin, @evhend, @elliott-ribner)
Package Name: nurser
One-Line Description of Package: An R package for streamlining the front end of the machine learning workflow.
Repository Link: https://github.com/UBC-MDS/nurser
Version submitted: v2.1.0
Editor: @kvarada
Reviewer 1: @evelynmoorhouse
Reviewer 2: @MrThomasPin
Archive: TBD
Version accepted: TBD
Description
`nurser` aims to streamline the front end of the machine learning pipeline by generating descriptive summary tables and figures, various feature imputation summaries, and automating preprocessing. Automated preprocessing detection has been implemented to minimize time and optimize the processing methods used. The functions in `nurser` were developed to provide useful and informative metrics that are applicable to a wide array of datasets.
Scope
* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.
`nurser` automates the plotting process and the summary statistics while conducting Exploratory Data Analysis tasks. It will handle the NaN values and preprocess the data, including one-hot encoding, scaling, and label encoding.
Any person who is interested in analyzing and preprocessing data before running machine learning models.
There are other individual R packages that have some similar functions (`summary`, `ggplot`), but the functions contained in `nurser` combine those functions in an elegant way to make analysis easier.
@tag the editor you contacted:
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
JOSS Options
- [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements).
- [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:
- (*Do not submit your package separately to JOSS*)
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)
Code of conduct