UBC-MDS / software-review-2022

Submission Group 15 - snapedautilityR(R) #46

Open AraiYuno opened 2 years ago

AraiYuno commented 2 years ago

name: snapedautilityR
about: snapedautilityR is an open-source library that generates useful functions to kickstart EDA (Exploratory Data Analysis) with just a few lines of code.

Submitting Author Name: Kyle Ahn
Submitting Author Github Handle: @AraiYuno
Other Package Authors Github handles: @harryyikhchan, @dol23asuka
Repository: https://github.com/UBC-MDS/snapedautilityR
Version submitted: 0.2.0
Submission type: Standard

Editors:

Reviewers:

Package: snapedautilityR
Title: Provide useful EDA functions with just a few lines of code
Version: 0.0.0.9000
Authors@R: c(
      person(given = "Harry",
           family = "Chan",
           role = c("cre"),
           email = "harryyikhchan@gmail.com"),
      person(given = "Dongxiao",
           family = "Li",
           role = c("aut"),
           email = "dol23asuka@gmail.com"),
      person(given = "Kyle",
           family = "Ahn",
           role = c("aut"),
           email = "pyh2982@gmail.com")
      )
Description: snapedautilityR is an open-source library that generates useful function to kickstart EDA (Exploratory Data Analysis) with just a few lines of code. The system is built around quickly analyzing the whole dataset and providing a detailed report with visualization. Its goal is to help quick analysis of feature characteristics, detecting outliers from the observations, and other such data characterization tasks.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.1
Suggests: 
    testthat (>= 3.0.0)
Config/testthat/edition: 3
Imports: 
    tibble,
    ggplot2,
    vdiffr,
    tidyselect,
    cowplot,
    dplyr,
    tidyr,
    palmerpenguins,
    GGally

Scope

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options

- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

artanzand commented 2 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 1


Review Comments

When I saw the 100% code coverage, I got excited that I would be reviewing a well-put-together package, and I am happy to say that I am still very pleased with it. It is very lightweight and easy to use. Here are some minor comments with room for improvement:

  1. The README says that detect_outliers makes a violin plot, but both your results and code show that you have chosen a boxplot. This should be an easy fix.
  2. Since the other two plots have titles, it would be nice (and necessary as per MDS guidelines) for detect_outliers() to also generate a title.
  3. The installation instructions in the README say that "You can install the released version of snapedautilityR from CRAN with install.packages("snapedautilityR")". This command does not work, and on checking CRAN I found that the package does not exist in their repository. The statement should be taken out. Note: the installation instructions from the GitHub repo worked perfectly.
  4. In the Usage section of the README, you provide an outdated example that is not supported by your plot_histograms function. The extra arguments should be taken out of plot <- plot_histograms(df, c("species", "bill_length_mm", "island"), 2, 100, 100).
  5. The example for your plot_corr function in the Usage section of the README generates a plot without colors (just labels). Maybe replace it with the one from your vignette, as that one is working fine.
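
To illustrate the suggestion about titles, here is a minimal sketch of a ggplot2 boxplot that carries a title. The function name `boxplot_with_title` and its `title` argument are hypothetical stand-ins; the actual signature and return value of detect_outliers() are not shown in this thread, so this is only an assumption about how a title could be attached.

```r
library(ggplot2)

# Hypothetical stand-in for detect_outliers(); not the package's real code.
boxplot_with_title <- function(s, title = "Outlier detection") {
  stopifnot(is.numeric(s))
  df <- data.frame(value = s)
  ggplot(df, aes(x = "", y = value)) +
    geom_boxplot() +
    # labs() attaches the title so the plot matches the other two functions
    labs(title = title, x = NULL, y = "Value")
}

p <- boxplot_with_title(c(1, 2, 3, 4, 50))
```

Printing `p` would draw the boxplot with the title above the panel; keeping `title` as an argument would let users override the default.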

rezam747 commented 2 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 1.5


Review Comments

Congrats, team, on the great development of an EDA package in R! The main functions of the package are easy to understand, install, and use. Like your Python package, this project also performed very well overall, but I do have a few suggestions for improvement:

  1. In your detect_outlier, you mention that it returns a violin plot; however, it plots a boxplot.
  2. In the documentation of detect_outlier, the first argument s is described as a list of doubles, but users should pass a vector as per your examples. This can be easily fixed by changing the type of the s argument to a vector of doubles in the documentation.
  3. In the documentation of plot_histograms, the description says "Detect outliers in the given list", which I think belongs to the detect_outlier function. In addition, the function accepts a vector for the argument feature, but the documentation says "List of string feature names".
  4. As this is an EDA package, I think it would be better to include some sample output in the Usage section of the README.md file to engage users.
  5. In the Usage section of the README.md file, plot_distribution is introduced with 4 arguments, but in the article section of the website (vignette) it is introduced with 2 arguments, so I think just the Usage part of the README.md file needs to be updated.
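
The list-versus-vector distinction in points 2 and 3 can be checked directly in base R. The sketch below uses made-up values, and the roxygen wording in the trailing comment is only a suggested phrasing, not the package's actual documentation.

```r
# An atomic vector of doubles, which is what the examples pass in
s_vector <- c(1.5, 2.0, 100.0)
# A list of doubles, which is what the documentation currently describes
s_list <- list(1.5, 2.0, 100.0)

is.double(s_vector)  # TRUE
is.list(s_vector)    # FALSE
is.list(s_list)      # TRUE
is.double(s_list)    # FALSE

# Suggested (hypothetical) roxygen wording for the argument:
#' @param s A numeric vector of doubles to check for outliers.
```

Because the two types behave differently in most numeric functions, describing the argument as a vector would match how the examples call it.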

iamMoid commented 2 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 1.5


Review Comments

Great work on the package, guys. One can definitely generate plots within a snap! I reviewed the package, and some suggestions are listed below:

  1. The rendered documentation looks good. I noticed that the version next to the title does not match the current release. Should be an easy fix, I suppose.

  2. In the README, the CONTRIBUTING section has a reference to the guidelines ("Interested in contributing? Check out the contributing guidelines."); however, I do not see the CONTRIBUTING.md file in the package root folder. It would be good to include the file and link to it in the README.

  3. The 100% code coverage implies you have done a thorough job of testing. There are multiple test_that calls in each test script, which confused me since there is no overall description of what aspect of the code each one is testing. Should be an easy update to include a comment for each test_that call.

  4. The USAGE section in the README could include sample output plots generated using the examples provided in the code that will help demonstrate the functions.

  5. There are references to violin plots in the detect_outlier function, however, the actual code is for the boxplot. Should be an easy fix.
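
As a sketch of what point 3 could look like, each test_that call below carries a descriptive name and a one-line comment saying what it checks. The helper `count_iqr_outliers` is a hypothetical function written only for this illustration, using the common 1.5 * IQR rule; it is not taken from the package.

```r
library(testthat)

# Hypothetical helper: counts values outside 1.5 * IQR of the quartiles.
count_iqr_outliers <- function(s) {
  q <- stats::quantile(s, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  sum(s < q[1] - 1.5 * iqr | s > q[2] + 1.5 * iqr)
}

# Checks that a single extreme value is flagged as an outlier.
test_that("one extreme value is counted as an outlier", {
  expect_equal(count_iqr_outliers(c(1, 2, 3, 4, 50)), 1)
})

# Checks that evenly spread data produces no outliers.
test_that("evenly spread values produce no outliers", {
  expect_equal(count_iqr_outliers(c(1, 2, 3, 4, 5)), 0)
})
```

With the comment and the test name together, a reader can tell which behaviour each block covers without reading the assertions.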

gfairbro commented 2 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing:


Review Comments

Hey team! Nice work. I am late to the game, but my fellow reviewers and I had to work pretty hard to find issues, which speaks to the good work you all have done. This package is dead easy to use and would save users a nice chunk of time regardless of their project. A few nitpicks below.