UBC-MDS / software-review-2023

DSCI 524

Group 03: nameformeR - R #14

Open eyrexh opened 1 year ago

eyrexh commented 1 year ago

name: nameformeR
about: A helper python package that can be used to generate names. This could be used to come up with baby names, character names, pseudonyms, etc.


Submitting Author Name: Daniel Cairns, Eyre Hong, Bruce Wu, Zilong Yi
Submitting Author Github Handle: @DanielCairns @eyrexh @BruceUBC @ZilongYi

Repository: https://github.com/UBC-MDS/nameformeR
Version submitted: v1.1.0
Submission type: Standard
Editor: Daniel Cairns (@DanielCairns), Eyre Hong (@eyrexh), Bruce Wu (@BruceUBC), Zilong Yi (@ZilongYi)
Reviewers: @ranjitprakash1986, @Althrun-sun, @netsgnut, @kellywujy

Archive: TBD
Version accepted: TBD
Language: en

Package: nameformeR
Title: What the Package Does (One Line, Title Case)
Version: 1.1.0
Authors@R: c(person("Eyre", "Hong", role = c("aut", "cre"),
                     email = "eyrexh@students.cs.ubc.ca"),
              person("Daniel", "Cairns", role = "aut"),
              person("Bruce", "Wu", role = "aut"),
              person("Zilong", "Yi", role = "aut"))
Description: A helper python package that can be used to generate names based on the dateset.
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.1
Suggests: 
    covr,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0)
Config/testthat/edition: 3
Imports:
    comparator (>= 0.1.2),
    stringr (>= 1.5.0),
    dplyr (>= 1.0.10),
    readr (>= 2.1.3)
VignetteBuilder: knitr

Scope

Technical checks

Confirm each of the following by checking the box.

This package:

Publication options

MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

kellywujy commented 1 year ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 1hr


Review Comments

  1. Similar to the comment I provided for the Python version, it would be nice to include an example for the find_old_name() function with filtering by sex, and to make the input case insensitive.
  2. I also found it could be more user-friendly to include parameter names in the function examples, e.g. find_name(sex = "F", init = "A"), which gives users a hint about the required input values.
  3. More explanation of how the bar parameter controls the output names would be nice to have.
  4. It would be nice to also include sections on contribution and code of conduct in the README.md file.
  5. It's a very interesting package. Some ideas for future development:
    • allow finding names that start with a combination of letters
    • is it possible to fetch the data for names beyond 2017?
    • look up names by their origin (e.g. biblical, other language, country of origin)

Great package!

ranjitprakash1986 commented 1 year ago


Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: 1


Review Comments

netsgnut commented 1 year ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing: ~ 1.5 hours


Review Comments

(Review based on the latest commit on main as of writing, commit https://github.com/UBC-MDS/nameformeR/commit/50948086ff64a5e6c0a70364d0dd0f1793bfa701)

First, congratulations on releasing the package. It is a fun, simple (yet cool!) idea to use this dataset to suggest baby names.

What I like most is that the code is clean and neat. I can see that other reviewers have already given excellent comments on the functionality of the package, so I would like to focus solely on the code side of things instead.

(Some comments are similar to what I have made in the Python package, which can be viewed here.)

1. Consider DRYing your code, instead of Writing Every Time.

An example would be the snippet of loading the CSV dataset:

url <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv"
data <- readr::read_csv(url,col_types = readr::cols())

This piece of code appears 4 times. You may consider refactoring by, e.g., wrapping it in a utility function:

load_raw_data <- function() {
    url <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv"
    readr::read_csv(url, col_types = readr::cols())
}

While the line count does not decrease much, the real benefit is that a future change, say switching to an upstream dataset with a different structure, only has to be made in one place.
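To illustrate that benefit, here is a minimal sketch in which the data location becomes an argument with a default, so callers and tests can point the helper at a different file. The constant name `BABYNAMES_URL`, the `source` argument, and the use of base `utils::read.csv` in place of `readr::read_csv` are assumptions made to keep the example dependency-free; none of them come from the package.

```r
# Hypothetical constant: the single place where the upstream location is defined.
BABYNAMES_URL <- "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv"

# Hypothetical helper: every function that needs the data would call this.
# `source` defaults to the upstream URL but can be any path read.csv accepts.
load_raw_data <- function(source = BABYNAMES_URL) {
  utils::read.csv(source, stringsAsFactors = FALSE)
}

# Usage with a small local file standing in for the real dataset.
tmp <- tempfile(fileext = ".csv")
writeLines(c("year,sex,name,n", "2017,F,Ada,10", "2017,M,Alan,12"), tmp)
toy <- load_raw_data(tmp)
```

With this shape, a change of dataset touches only the constant and the body of the helper, and tests can run without network access.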

2. find_unisex_name has a conflicting Roxygen definition.

The function is defined as find_unisex_name <- function(bar, limit = 10), but its parameters are documented as:

#' @param limit A float controls the minimum proportion of the neutral names in a single year in the dataframe.
#' @param bar The length of the output with a default value of 10.

It would be better to reorder the listed parameters to match the signature and to review the descriptions.
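Assuming from the signature that `bar` is the proportion threshold and `limit` is the output length with a default of 10 (that is, the two descriptions were attached to the wrong names), a corrected header might look like the sketch below. The title line and the body are placeholders, not the package's implementation.

```r
#' Find unisex names (placeholder title)
#'
#' @param bar A float that controls the minimum proportion of the neutral
#'   names in a single year in the dataframe.
#' @param limit The length of the output, with a default value of 10.
find_unisex_name <- function(bar, limit = 10) {
  # Placeholder body (assumption): the real logic lives in the package.
  stop("not implemented in this sketch")
}
```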

3. find_old_name should have better guard.

I noticed this when I tried the sex parameter: unlike find_name, it does not work with lower-case values (e.g., 'f' or 'm'). The offending piece of code is:

  df <- data |> dplyr::filter(tp == {{tp}})
  if (sex == "uni"){
    f <- df |> dplyr::filter(sex=="F") |> dplyr::pull(name)
    m <- df |> dplyr::filter(sex=="M") |> dplyr::pull(name)
    uni_df <- intersect(f,m)
    if (length(uni_df) < limit){
      uni_df
    }else{
      sample(uni_df,size=limit,replace = FALSE)
    }

  }else{
    r <- df |> dplyr::filter(sex=={{sex}}) |> dplyr::pull(name)
    sample(r,size=limit,replace = FALSE)
  }

As with the Python counterpart, it can be simplified and given additional guards. Please refer to the Python issue for more details.
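As a rough sketch of what such guards could look like, the example below uses a hypothetical helper `pick_names()` written in base R rather than dplyr; `df` stands in for the rows already filtered by time period. It normalises the case of `sex`, rejects unknown values, and never asks `sample()` for more names than exist. This is an illustration under those assumptions, not the package's code.

```r
# Hypothetical helper illustrating the guards. `df` is assumed to have
# character columns `sex` ("F"/"M") and `name`.
pick_names <- function(df, sex = "uni", limit = 10) {
  if (!is.character(sex) || length(sex) != 1) {
    stop("`sex` must be a single string")
  }
  key <- toupper(sex)  # accept "f", "F", "m", "M", "uni", "UNI"
  if (!key %in% c("F", "M", "UNI")) {
    stop("`sex` must be one of \"F\", \"M\" or \"uni\"")
  }
  if (!is.numeric(limit) || length(limit) != 1 || limit < 1) {
    stop("`limit` must be a single number of at least 1")
  }

  candidates <- if (key == "UNI") {
    # Names that occur under both sexes.
    intersect(df$name[df$sex == "F"], df$name[df$sex == "M"])
  } else {
    df$name[df$sex == key]
  }

  # One size guard covers both branches: cap the sample at what is available.
  sample(candidates, size = min(limit, length(candidates)), replace = FALSE)
}

toy <- data.frame(
  sex = c("F", "F", "M", "M"),
  name = c("Ada", "Robin", "Alan", "Robin"),
  stringsAsFactors = FALSE
)
pick_names(toy, sex = "f", limit = 10)  # lower case accepted; both F names, random order
pick_names(toy, sex = "uni")            # "Robin"
```

Capping the sample size in one place removes the need for a separate length check in each branch, which is where the original code was asymmetric.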

4. Consider enhancing your test cases, especially the conditional guards.

Aiming for 100% line coverage is impractical for every project, but it is always beneficial to cover edge cases. For example, the line coverage report currently shows 10 missed lines, which are related to error handling (e.g., throwing an exception when an argument has an unexpected type or is out of range). It would be beneficial to document this kind of behavior in the form of working test cases.
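For instance, a test of the error paths could look like the sketch below. It uses testthat, which the package already lists under Suggests; `check_limit()` is a hypothetical guard standing in for the package's own input checks, and its error messages are assumptions.

```r
library(testthat)

# Hypothetical guard standing in for the package's input validation.
check_limit <- function(limit) {
  if (!is.numeric(limit) || length(limit) != 1) {
    stop("`limit` must be a single number")
  }
  if (limit < 1) {
    stop("`limit` must be at least 1")
  }
  invisible(limit)
}

test_that("check_limit rejects unexpected types and out-of-range values", {
  expect_error(check_limit("ten"), "single number")  # wrong type
  expect_error(check_limit(0), "at least 1")         # out of range
  expect_silent(check_limit(5))                      # valid input passes quietly
})
```

Matching on part of the error message, as `expect_error()` does here, also documents which guard is expected to fire for which input.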

5. find_name and find_old_name define the sex parameter differently, which can be confusing.

It might be best if the parameter can be unified. For example:

Overall, I enjoy reading the code of your project, and despite hiccups, your documentation is good. Great work!

(Reviewed using rOpenSci review template)

Althrun-sun commented 1 year ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Estimated hours spent reviewing:


Review Comments

  1. Your package is very distinctive. I like your theme very much, and I think it will have broad application prospects.
  2. The code is concise and easy to understand, very structured, with the characteristics of software engineering.
  3. There are a lot of commits each time, and various details reflect the level of cooperation of your team.
  4. The package is fully functional, with very distinctive methods and functions, decoupled and modularized.
  5. You guys have made a great package that I think I will use someday!