ropensci / software-review

rOpenSci Software Peer Review.
291 stars 104 forks source link

baRcodeR Submission #338

Closed yihanwu closed 4 years ago

yihanwu commented 5 years ago

Submitting Author: Yihan Wu (@yihanwu) and Robert Colautti (@ColauttiLab) Repository: yihanwu/baRcodeR Version submitted:0.1.4 Editor: @sckott Reviewer 1: @raynamharris Reviewer 2: @llrs
Archive: TBD
Version accepted: TBD


Package: baRcodeR
Title:  Label Creation for Tracking and Collecting Data from Biological Samples
Version: 0.1.4
Authors@R: 
    c(person(given = "Yihan", 
             family = "Wu", 
             role="aut", 
             email = "yihan.wu@queensu.ca", 
             comment = c(ORCID = "0000-0002-1202-4208")), 
      person(given = "Robert", 
             family = "Colautti", 
             email = "robert.colautti@queensu.ca", 
             role = c("aut", "cre"), 
             comment = c(ORCID = "0000-0003-4213-0711")))
Maintainer: Yihan Wu <yihan.wu@queensu.ca>
Description: Tools to generate unique identifier codes and printable barcoded 
    labels for the management of biological samples. The creation of unique ID 
    codes and printable PDF files can be initiated by standard commands, user 
    prompts, or through a GUI addin for R Studio. Biologically informative codes
    can be included for hierarchically structured sampling designs.
Depends:
    R (>= 3.4.0),
    qrcode
License: file LICENSE
Language: en-CA
Encoding: UTF-8
LazyData: true
Imports:
    DT (>= 0.3),
    grDevices,
    grid,
    miniUI (>= 0.1.1),
    shiny (>= 0.13),
    rstudioapi
Suggests: 
    testthat (>= 2.1.0),
    knitr,
    rmarkdown,
    covr,
    shinytest
RoxygenNote: 6.1.1
VignetteBuilder: knitr
URL: https://github.com/yihanwu/baRcodeR
BugReports: https://github.com/yihanwu/baRcodeR/issues

Scope

Reproducibility: baRcodeR facilitates the creation of unique text identifier (ID) codes and scannable barcodes (linear and 2D) that can be printed onto customized sticker labels with consumer-grade printers to facilitate management, tracking, and data collection involving biological samples. Unique identifier codes and scannable barcode labels are produced via reproducible code which reduce transcription errors that arise with hand-written labels and manual entry of label codes. In contrast to commercial software, no proprietary hardware is required, labels can be generated automatically with biologically informative coding, and labels can be created from the command line.

baRcodeR is designed to minimize bookkeeping errors and to assist researchers who need to track, curate, or measure biological samples. It is particularly well-suited for large collaborative projects with complex sampling design and numerous samples shared among researchers. For example, a large biological research project may have subjects organized as population/treatment-by-family/line-by-time – that is, each subject sampled multiple times and representing a specific family group or genetic line from one or more nested experimental populations or treatments. The custom layout options in baRcodeR allow researchers to rapidly generate meaningful identifier codes that capture the hierarchical sampling structure, and then print customizable labels containing the ID codes as human-readable text with corresponding digital linear or 2D barcodes. The command-line interface also facilitates rapid re-creation of a set of generated identifiers with different sized labels or minor changes to identifier codes – for example, to label multiple tissue collection containers and then later to label test tubes for purified extracts, each with the same basic code with a new code corresponding to sample type.

Presubmission inquiry #336, approved by editor @melvidoni

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options - [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: - (*Do not submit your package separately to JOSS*)
MEE Options - [x] The package is novel and will be of interest to the broad readership of the journal. - [x] The manuscript describing the package is no longer than 3000 words. - [x] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

sckott commented 5 years ago

Editor checks:


Editor comments

Thanks for your submission @yihanwu !

Here's the output from goodpractice. You don't need to address these now, it's info for reviewers to use to get started.

── GP baRcodeR ─────────────

It is good practice to

  ✖ write unit tests for all functions, and all package code in general. 94% of code lines are covered by test cases.

    R/hidden_createPDF.R:272:NA
    R/hidden_createPDF.R:307:NA
    R/hidden_createPDF.R:308:NA
    R/hidden_createPDF.R:309:NA
    R/hidden_createPDF.R:310:NA
    ... and 14 more lines

  ✖ write short and simple functions. These functions have high cyclomatic complexity:custom_create_PDF (58).
  ✖ not use "Depends" in DESCRIPTION, as it can cause name clashes, and poor interaction with other packages. Use "Imports" instead.
  ✖ not import packages as a whole, as this can cause name clashes between the imported packages. Instead, import only the specific functions you need.
  ✖ avoid 'T' and 'F', as they are just variables which are set to the logicals 'TRUE' and 'FALSE' by default, but are not reserved words and hence can
    be overwritten by the user.  Hence, one should always use 'TRUE' and 'FALSE' for the logicals.

    R/hidden_createPDF.R:NA:NA
 ───────────────

Seeking reviewers now 🕐


Reviewers:

sckott commented 5 years ago

reviewers are now assigned: @raynamharris & @llrs

yihanwu commented 5 years ago

Great! Thanks @sckott

raynamharris commented 5 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 2 hours


Review Comments

baRcodeR is an R package that reproducibly creates sample labels. It can be run from the RStudio console or from a GUI Addin. This is the first time I have ever used a GUI or Addin from within RStudio, and it worked very well after restarting R twice. I would recommend this package to my colleagues who do molecular work. Below is a list of minor comments and concerns.

baRcodeR:::make_labels()
Registered S3 method overwritten by 'R.oo':
  method        from       
  throw.default R.methodsS3
Loading required package: shiny

Here is the session info from my workflow.

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] baRcodeR_0.1.4 qrcode_0.1.1   shiny_1.3.2   

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2        rstudioapi_0.10   knitr_1.24       
 [4] magrittr_1.5      xtable_1.8-4      R6_2.4.0         
 [7] rlang_0.4.0       stringr_1.4.0     tools_3.6.0      
[10] grid_3.6.0        DT_0.8            xfun_0.9         
[13] R.oo_1.22.0       miniUI_0.1.1.1    htmltools_0.3.6  
[16] crosstalk_1.0.0   yaml_2.2.0        digest_0.6.20    
[19] crayon_1.3.4      later_0.8.0       htmlwidgets_1.3  
[22] R.utils_2.9.0     promises_1.0.1    rsconnect_0.8.13 
[25] evaluate_0.14     mime_0.7          rmarkdown_1.15   
[28] stringi_1.4.3     compiler_3.6.0    R.methodsS3_1.7.1
[31] jsonlite_1.6      httpuv_1.5.1 
sckott commented 5 years ago

thanks so much for your review @raynamharris !

yihanwu commented 5 years ago

Thank you for the review @raynamharris!

llrs commented 5 years ago

Sorry for the delay. I plan to submit mine in a week or so. I already have most of it written but I want to check more the worflow.

raynamharris commented 5 years ago

I was looking over the rOpenSci peer review guidelines today (because they are so good and I wanted to use this as a guide for a different peer review that I'm working on), and I came across this bullet point "If you have your own relevant data/problem, work through it with the package. You may find rough edges and use-cases the author didn’t think about."

In my current research, I always sample three specific biological tissue that that I code "hyp", "pit' and "gon". Is there a to make barcode with strings in addition to number? So, something like animal-01-hyp, animal-01-pit, animal-01-gon would be awesome. If allowing any random string isn't possible, perhaps something, "A", "B", and "C" could be added so that you could generate sample names like animal-01-A, animal-01-B, animal-01-C instead only only animal-01-tissue-01, animal-01-tissue-02, etc?

Does that make sense?

sckott commented 5 years ago

@yihanwu question when you get a chance ^

yihanwu commented 5 years ago

Hi @raynamharris,

It is possible to add a suffix string with the hierarchical labels.

For example, we can do,

animals <- c("animal", 1, 2)
tissues <- c("tissue", 1,3)
uniqID_hier_maker(hierarchy = list(animals, tissues), end = c("hyp", "pit", "gon"))

and the resulting data frame would look like:

label animal tissue
1 animal01-tissue01hyp animal01 tissue01
2 animal01-tissue02pit animal01 tissue02
3 animal01-tissue03gon animal01 tissue03
4 animal02-tissue01hyp animal02 tissue01
5 animal02-tissue02pit animal02 tissue02
6 animal02-tissue03gon animal02 tissue03

The original thought for the hierarchical labels was that tissue types could be coded as numbers but I can see how it would be overly complicated for cases like yours.

Just to confirm, you would want something that could generate:

animal01-hyp
animal01-pit
animal01-gon
animal02-hyp
animal02-pit
animal02-gon
animal03-hyp
 and so on

Here, each base label is repeated three times for the three ending strings.

raynamharris commented 5 years ago

awesome! that's exactly what I was thinking of. thanks for clarifying @yihanwu!

yihanwu commented 5 years ago

On the todo list it goes!

sckott commented 5 years ago

@llrs reminder that your review is due next Tuesday, thanks!

llrs commented 5 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 7-8h


Review Comments

Test installation

First installation took long time (I don't have much of the dependencies of baRcodeR) but the dependencies look right to me.

The second install directly from github worked well and fast now that I already have the dependencies

Check package integrity

There is a note of non standard files found:

 Non-standard files/directories found at top level:
     ‘CODE_OF_CONDUCT.md’ ‘WuEtAl_2019_baRcodeR_MEE.pdf’ ‘docs’

The WuEtAl_2019_baRcodeR_MEE.pdf seems like an article about the program, which might be better on some preprint archive or another repository but it doesn't belong to the package. I would recommend to delete it from the repository.

CODE_OF_CONDUCT.md and the docs folder should be included in .Rbuildignore. Also the docs folder is created to create the webpage of the package. Maybe you can automate this using use_pkgdown_travis() seeing that you already use Travis.

There are few tests, and 2 warnings related to introducing NAs by coercion (test_warnings.R, lines 12 and 13) and one about "Digits specified less than max number. Increasing number of digits." in test-label_generation.R line 13.

To test for warnings remember that there is the expect_warning function in testthat. The other warning should be captured by expect_warning or select a good number to avoid the warning.

uniqID_hier_maker is only tested for warnings and not for expected output.

The shiny app is only tested locally. Maybe this was a recommendation from the CRAN? I don't know in which platform it was tested but when I run it with compareImages = TRUE it failed. If possible I would recommend to test it on CRAN even if it is with compareImages = FALSE.

It reports that the coverage is only 27% I would aim for 90% or more seeing that you want to test also the shiny app and the code.

It also warns about using T and F instead of TRUE and FALSE.

Also it recommends to think about the dependency to qrcodes that is on Depends instead of Imports. I think it is sensible to have it Depends as without it this packages seems useless.

However, goodpractie::gp couldn't continue with the analysis due to some error with linter. Probably this is due to some long string of code.

Check of the metadata

The README doesn't have any badge of status, from repostatus.org. See this section

Most if not all of the results that the spelling check reports are from examples or names. Consider adding them the dictionary so that they are not reported again.

The readme recommends to provide the sessionInfo, you might be interested on the package reprex, which can help your users to create the minimal reproducible example.

The DESCRIPTION lists 6 Imports and 1 Depends but the NAMESPACE only imports two package. The NAMESPACE should import fully the Depends packages and selectively the functions of packages on Imports.

It also uses Authors@R and Maintainer it is better to select just one (less things to keep up to date). Nice addition of the ORCID numbers.

I would recommen to use usethis::use_tidy_description to set the fields of the DESCRIPTION in a "standard" format.

Check of the documentation

The first thing I notice looking at the documentation is a link to another library. I would move this reference below "Usage from the console".

I would also have a short section showing how you can use the library on the README.Rmd via the command line (without prompts). This way the potential user get a grasp of what does the package.

Also probably it would be nice to show the final output of the package, currently it seems like the result is a plain data.frame.

The section in the README about "Using the RStudio addin" I think it worth to be in its own vignette.

There are some man pages (man/cheatsheet.Rd, man/make_labels.Rd) that call library or require on baRcodeR, it is not necessary.

It is not recommended to call the system on a man page like in create_PDF.Rd. If you have to, use system, but system2 would be better because the command is always quoted. So if you want to pass several commands you should call several times to system2.

There is no information about what do the cheatsheet and the make_labels return. It is always good to tell your users what to expect from your functions.

There isn't any example on custom_create_PDF, I would recommend to add one even if they are not run automatically.

Also some examples do not run. From create_PDF:

example_vector <- as.data.frame(c("ao1", "a02", "a03"))
create_PDF(Labels = example_vector, name = file.path(tempdir(), "example"))
# Warning message:
# In custom_create_PDF(user, Labels, name, type, ErrCorr, Fsz, ...) :
#   Cannot find a label column. Using first column as label input.

The details section of create_PDF has also a "#` " at the beginning. The roxygen2 notation is duplicated on the source file, remove it and it will disappear

Check the vignette

The image of the workflow is very informative and useful for the user, but I would move it below the overview (as well as the cheatsheet). As a user I would like to know what does the package first and then how can I do it.

The first code block of the vignette is to install some packages that are (or should be) automatically installed when the user installs baRcodeR. The code block should be removed.

In the vignette you use T and F instead of TRUE and FALSE, T and F are not protected, so it might introduce errors, change them to the full name please.

Next you explain the workflow using prompts. Prompts are quoted, as they are printed in the terminal I would show them as code by using the three backticks formatting.

The next section is again using prompts. I was a bit confused about what is the goal of this section. The columns of this hierarchical labels are the same of the text, which might be confusing. Maybe the title could be one and the string used a different.

In the section "By argument" when I ran the code I got a warning:

# Warning message:
# In uniqID_hier_maker(hierarchy = hier_list, digits = 1) :
#   Digits specified less than max level number. Increasing number of digits for level

It is mentioned again to store it in a csv. I tried to create_PDF and I didn't know where the file was created. Apparently it is created on the working directory. Maybe it should be better explained on the help page. When I used the argument user = TRUE in create_PDF I got a prompt, I introduced the name of the pdf with quotes. Then the resulting file got also quotes. This could be checked and warned to the user as it could create problems.

Check the functionality

The functions are very well commented about what do they do.

Almost all the functions implement a sort of prompt. I am not sure this is actually helpful as then the user need to know which question corresponds with which argument of the function.

The creation of labels seems a bit over-complicated to me. We can obtain all the combinations of a group of variables with expand.grid

labels <- expand.grid(fixed = "a", variable = 1:5, join = "-", 
                      text = letters[2:3], numbers = 7:9)
labels
apply(labels, 1, paste0, collapse = "")

This allows to omit the loop if you have a hierarchy.

The GUI addin is nice, but the "Simple ID Code Generation" and the "Hierarchical ID Code Generation" end up creating a labels.csv instead of creating the bar codes. I expected that they would be directly created. Also there isn't a way to select where the labels.csv is created.

I tried to create a Hierarchical ID Code Generation with the GUI and I end up with an error :

Error: Input list has only one level. Did you forget a level or are you sure you are not looking for uniqIDMaker()?

That's after creating a label and pressing "Add level".

Inspection of the code

Whenever possible avoid sapply(); use vapply(), as sapply will simplify the output which might cause errors. There are 8 cases of this, maybe some of them can be replaced, in some cases you just want to get the first element, maybe you can use unlist instead.

Also you use 1:... like 5 times; use seq_len() or seq_along() or simply seq().

In code_128_make() you used T and F replace them for TRUE and FALSE.

In hidden_createPDF.R you check a class with class(), it is better to use the is() function as it will show all the classes it inherits from.

There are very long functions (make_labels_internals, custom_create_PDF and server that are more than 100 lines) try splitting them in shorter functions. Usually functions of less of 50 lines are recommended.

There are also very long lines, consider shorter lines; 253 lines (15%) are > 80 characters long.

The create_PDF and the custom_create_PDF do the same. I would delete create_PDF. I understand that the idea was to avoid confusing the user with too many arguments. But either do not show them or let them trust you with the defaults, which work reasonable well.

You make use of grDevices:: and other packages but don't declare on the NAMESPACE that you import them. You should do so with #' @importFrom grDevices pdf for instance.

Check the testing

The file uniqID_maker_addin.R is not tested.

I don't know how can you test when you interactively ask the user with prompts but that section should be tested too (hidden_createPDF.R).

Conclusions

Overall the package has its audience it is easy to use and provides a nice functionality for wet-dry researchers.

sckott commented 5 years ago

thanks very much @llrs !

@yihanwu all reviews are now in.

yihanwu commented 5 years ago

Thank you @llrs for the in-depth review!

I'll have a point to point review posted after I make the changes suggested by @llrs and @raynamharris

yihanwu commented 4 years ago

Thank you @raynamharris and @llrs for your detailed reviews and suggestions, which significantly improved the package. Below is a point-by-point response to issues raised, with added commit links where appropriate. Overall, we think we were able to address all of the significant concerns, including new test code to provide additional checks, a re-structured README with content on the add-in GUI moved to a separate vignette. We also added new functionality for making ID codes with ending strings as suggested by @raynamharris. (https://github.com/yihanwu/baRcodeR/commit/f550757a0ea58027d33a862cc980ea1be51da0ea)

llrs review:

Check package integrity

There is a note of non standard files found:

Non-standard files/directories found at top level: ‘CODE_OF_CONDUCT.md’ ‘WuEtAl_2019_baRcodeR_MEE.pdf’ ‘docs’ The WuEtAl_2019_baRcodeR_MEE.pdf seems like an article about the program, which might be better on some preprint archive or another repository but it doesn't belong to the package. I would recommend to delete it from the repository.

  1. The manuscript has been deleted from the repository. (https://github.com/yihanwu/baRcodeR/commit/d0e53b9e1d182dd432e51b8cf4475f6e31f9309e)

CODE_OF_CONDUCT.md and the docs folder should be included in .Rbuildignore. Also the docs folder is created to create the webpage of the package. Maybe you can automate this using use_pkgdown_travis() seeing that you already use Travis.

  1. Updated .Rbuildignore for CODE_Of_CONDUCT.md and docs folder. (https://github.com/yihanwu/baRcodeR/commit/d0e53b9e1d182dd432e51b8cf4475f6e31f9309e)

There are few tests, and 2 warnings related to introducing NAs by coercion (test_warnings.R, lines 12 and 13) and one about "Digits specified less than max number. Increasing number of digits." in test-label_generation.R line 13.

To test for warnings remember that there is the expect_warning function in testthat. The other warning should be captured by expect_warning or select a good number to avoid the warning.

  1. Double warnings in test_warning and test-label_generation.R resolved (https://github.com/yihanwu/baRcodeR/commit/6bef6b993edc2bc55b13182e827296ad1d4831a3)

uniqID_hier_maker is only tested for warnings and not for expected output.

  1. added a test for expected output for uniqID_hier_maker (https://github.com/yihanwu/baRcodeR/commit/6bef6b993edc2bc55b13182e827296ad1d4831a3)

The shiny app is only tested locally. Maybe this was a recommendation from the CRAN? I don't know in which platform it was tested but when I run it with compareImages = TRUE it failed. If possible I would recommend to test it on CRAN even if it is with compareImages = FALSE.

  1. shinytest recommends using compareImages = FALSE because images rendered on different platforms are different. For example, the images generated on Travis (Linux) will be different from mine (Windows). They also advise not to check on CRAN though a reason is not specified.

It reports that the coverage is only 27% I would aim for 90% or more seeing that you want to test also the shiny app and the code.

  1. A lot of the code is dedicated to the interactive command line terminal, and checking for errors in the user responses. Since it is interactive, this code cannot be tested with testthat. As a result, I've added #nocov tags around the command prompt lines to exclude them from testing. There is no integration of shinytest (which I am using to test the shinyapp) and covr at the moment (see issue here) which means the app has 0% coverage. So by excluding the command prompt lines and the shinyapp during coverage testing, I have ~97% coverage of the remaining code on codecov.

It also warns about using T and F instead of TRUE and FALSE.

  1. Changed T and F in vignettes and also in other places (https://github.com/yihanwu/baRcodeR/commit/97fc17f5e71e15e8317f0c86c6523f8b205dcf61#diff-43d20606d79ec280d9646a3925c1cdd1)

Also it recommends to think about the dependency to qrcodes that is on Depends instead of Imports. I think it is sensible to have it Depends as without it this packages seems useless.

  1. Yes, the reason that qrcode is a dependency is that the underlying data matrix for generating the qrcode in the package is unexported. I wanted to avoid copying the data matrix into this package so using Depends is necessary.

However, goodpractie::gp couldn't continue with the analysis due to some error with linter. Probably this is due to some long string of code.

  1. There is a for loop in the function to generate the pdf which goodpractice::gp does not like. The reason this is needed is explained in a comment below . (Response 32)

Check of the metadata

The README doesn't have any badge of status, from repostatus.org. See this section

  1. repostatus badge added (https://github.com/yihanwu/baRcodeR/commit/2979d3382043aa6a1cf913d9e6d737a7818919e9)

Most if not all of the results that the spelling check reports are from examples or names. Consider adding them the dictionary so that they are not reported again.

  1. added WORDLIST in inst/ for spelling. (https://github.com/yihanwu/baRcodeR/commit/2979d3382043aa6a1cf913d9e6d737a7818919e9)

The readme recommends to provide the sessionInfo, you might be interested on the package reprex, which can help your users to create the minimal reproducible example.

  1. I've converted the README.md to be generated from a README.Rmd file. (https://github.com/yihanwu/baRcodeR/commit/2979d3382043aa6a1cf913d9e6d737a7818919e9)

The DESCRIPTION lists 6 Imports and 1 Depends but the NAMESPACE only imports two package. The NAMESPACE should import fully the Depends packages and selectively the functions of packages on Imports.

  1. I'm slightly confused about the NAMESPACE vs DESCRIPTION imports. Right now, I'm listing packages in the DESCRIPTION imports and referring to them with packagename::fun() like the R Packages book suggests here.

It also uses Authors@R and Maintainer it is better to select just one (less things to keep up to date). Nice addition of the ORCID numbers.

  1. If the maintainer field is removed, the maintainer is automatically generated from Authors@R to reference the creator. Right now though, Robert Colautti is the creator, but the person maintaining the project is me (@yihanwu).

I would recommen to use usethis::use_tidy_description to set the fields of the DESCRIPTION in a "standard" format.

  1. The DESCRIPTION file has been regenerated using usethis. (https://github.com/yihanwu/baRcodeR/commit/2979d3382043aa6a1cf913d9e6d737a7818919e9)

Check of the documentation

The first thing I notice looking at the documentation is a link to another library. I would move this reference below "Usage from the console".

  1. This has been done (https://github.com/yihanwu/baRcodeR/commit/d60c7d3d53bb7c44f13ceeabe73f92daa04b7725)

I would also have a short section showing how you can use the library on the README.Rmd via the command line (without prompts). This way the potential user get a grasp of what does the package.

  1. A quick example has been added as the first section in the vignette after the overview. (https://github.com/yihanwu/baRcodeR/commit/d60c7d3d53bb7c44f13ceeabe73f92daa04b7725).

Also probably it would be nice to show the final output of the package, currently it seems like the result is a plain data.frame.

  1. I've included an example page (using default layout) as part of the example from the above line. (https://github.com/yihanwu/baRcodeR/commit/d60c7d3d53bb7c44f13ceeabe73f92daa04b7725)

The section in the README about "Using the RStudio addin" I think it worth to be in its own vignette.

  1. I've made a new vignette and shortened the README length (https://github.com/yihanwu/baRcodeR/commit/92be54710607222419c50b16bfc73bed4bc55f6a)

There are some man pages (man/cheatsheet.Rd, man/make_labels.Rd) that call library or require on baRcodeR, it is not necessary.

  1. I've removed the call to library for cheatsheet but for make_labels, the call is necessary due to the export issue with qrcode explained above. (https://github.com/yihanwu/baRcodeR/commit/8e440710bcd1efb2b0d79d00852a1015f5ac7a60#diff-53e56b637346c7419dd7637bdd7c8c07)

It is not recommended to call the system on a man page like in create_PDF.Rd. If you have to, use system, but system2 would be better because the command is always quoted. So if you want to pass several commands you should call several times to system2.

  1. system has been changed to system2. (https://github.com/yihanwu/baRcodeR/commit/8e440710bcd1efb2b0d79d00852a1015f5ac7a60#diff-53e56b637346c7419dd7637bdd7c8c07)

There is no information about what do the cheatsheet and the make_labels return. It is always good to tell your users what to expect from your functions.

  1. Added return values to documentation (https://github.com/yihanwu/baRcodeR/commit/8e440710bcd1efb2b0d79d00852a1015f5ac7a60#diff-53e56b637346c7419dd7637bdd7c8c07)

There isn't any example on custom_create_PDF, I would recommend to add one even if they are not run automatically.

  1. Copied examples for create_PDF to custom_create_PDF. (https://github.com/yihanwu/baRcodeR/commit/8e440710bcd1efb2b0d79d00852a1015f5ac7a60#diff-53e56b637346c7419dd7637bdd7c8c07)

Also some examples do not run. From create_PDF:

example_vector <- as.data.frame(c("ao1", "a02", "a03")) create_PDF(Labels = example_vector, name = file.path(tempdir(), "example"))

Warning message:

In custom_create_PDF(user, Labels, name, type, ErrCorr, Fsz, ...) :

Cannot find a label column. Using first column as label input.

The details section of create_PDF has also a "#` " at the beginning. The roxygen2 notation is duplicated on the source file, remove it and it will disappear

  1. Fixed non-working example. Fixed extra roxygen. (https://github.com/yihanwu/baRcodeR/commit/8e440710bcd1efb2b0d79d00852a1015f5ac7a60#diff-53e56b637346c7419dd7637bdd7c8c07)

Check the vignette

The image of the workflow is very informative and useful for the user, but I would move it below the overview (as well as the cheatsheet).

  1. Done. (https://github.com/yihanwu/baRcodeR/commit/d60c7d3d53bb7c44f13ceeabe73f92daa04b7725)

As a user I would like to know what does the package first and then how can I do it.

The first code block of the vignette is to install some packages that are (or should be) automatically installed when the user installs baRcodeR. The code block should be removed.

  1. Done. (https://github.com/yihanwu/baRcodeR/commit/d60c7d3d53bb7c44f13ceeabe73f92daa04b7725)

In the vignette you use T and F instead of TRUE and FALSE, T and F are not protected, so it might introduce errors, change them to the full name please.

  1. T's and F's have been replaced in the vignette. See commit on this from above . (Response 7)

Next you explain the workflow using prompts. Prompts are quoted, as they are printed in the terminal I would show them as code by using the three backticks formatting.

  1. I've used backticks to format the prompts as a monospace font to reflect that they are on the command line. In addition, using them as quotes differentiates these lines from commands that should be entered directly on the command lines. (https://github.com/yihanwu/baRcodeR/commit/97fc17f5e71e15e8317f0c86c6523f8b205dcf61#diff-43d20606d79ec280d9646a3925c1cdd1)

The next section is again using prompts. I was a bit confused about what is the goal of this section. The columns of this hierarchical labels are the same of the text, which might be confusing. Maybe the title could be one and the string used a different.

  1. In the vignette, I've added a clarification that the result of following the "user prompt" section and the "by argument" section for each function should end in the same result. (https://github.com/yihanwu/baRcodeR/commit/97fc17f5e71e15e8317f0c86c6523f8b205dcf61#diff-43d20606d79ec280d9646a3925c1cdd1)

In the section "By argument" when I ran the code I got a warning:

Warning message:

In uniqID_hier_maker(hierarchy = hier_list, digits = 1) :

Digits specified less than max level number. Increasing number of digits for level

It is mentioned again to store it in a csv. I tried to create_PDF and I didn't know where the file was created.

Apparently it is created on the working directory.

Maybe it should be better explained on the help page.

  1. I've added a second warning to the return value on the help page and emphasized the file naming with an example as part of the parameter explanation. (https://github.com/yihanwu/baRcodeR/commit/97fc17f5e71e15e8317f0c86c6523f8b205dcf61#diff-43d20606d79ec280d9646a3925c1cdd1)

When I used the argument user = TRUE in create_PDF I got a prompt, I introduced the name of the pdf with quotes. Then the resulting file got also quotes. This could be checked and warned to the user as it could create problems.

  1. I've added a warning on the vignette for users to avoid quotes when typing strings into the interactive prompts. (https://github.com/yihanwu/baRcodeR/commit/97fc17f5e71e15e8317f0c86c6523f8b205dcf61#diff-43d20606d79ec280d9646a3925c1cdd1)

Check the functionality

The functions are very well commented about what do they do.

Almost all the functions implement a sort of prompt.

I am not sure this is actually helpful as then the user need to know which question corresponds with which argument of the function.

The creation of labels seems a bit over-complicated to me.

We can obtain all the combinations of a group of variables with expand.grid

labels <- expand.grid(fixed = "a", variable = 1:5, join = "-", text = letters[2:3], numbers = 7:9) labels apply(labels, 1, paste0, collapse = "") This allows to omit the loop if you have a hierarchy.

  1. The loop is necessary in this case because the level number of the labelling hierarchy is variable in each instance and therefore we cannot define the dimensions of expand.grid beforehand.

The GUI addin is nice, but the "Simple ID Code Generation" and the "Hierarchical ID Code Generation" end up creating a labels.csv instead of creating the bar codes.

I expected that they would be directly created.

  1. There were two reasons that we separate ID Code Generation from PDF Creation. First, creating label.csv is important for reproducibility, especially if the user forgot to save the code snippet that was created or the steps in generating the labels. Saving the csv reinforces a reproducible approach and also provides a file that can archived or exported to other pipelines. Second, if the user has their own custom labels, they can skip the ID Code Generation step and go straight to creating the PDF without having to worry about generating the labels again. This is also important for reproducibility since re-generating the codes from scratch increases opportunities for error.

Also there isn't a way to select where the labels.csv is created.

  1. The labels save to the working directory. See comments below for the commit where I added text notifications to the addin. (Response 49)

I tried to create a Hierarchical ID Code Generation with the GUI and I end up with an error :

Error: Input list has only one level. Did you forget a level or are you sure you are not looking for uniqIDMaker()? That's after creating a label and pressing "Add level".

  1. This was a result of shiny evaluating the input in real-time and converting the warning in uniqID_hier_maker into an error. If there is only one level then the user should use the "Simple ID creation" tab. If another level is added, then the error disappears. However, I can see how this could cause confusion so the prompt "Please add a level" was added and the preview of labels will not show up until the user adds two levels. (https://github.com/yihanwu/baRcodeR/commit/3561ba66267d20fa16a50f9433c9f4a40f9cd8a2)

Inspection of the code

Whenever possible avoid sapply(); use vapply(), as sapply will simplify the output which might cause errors. There are 8 cases of this, maybe some of them can be replaced, in some cases you just want to get the first element, maybe you can use unlist instead.

Also you use 1:... like 5 times; use seq_len() or seq_along() or simply seq().

  1. Those have been replaced in the code with seq(1, ...) (https://github.com/yihanwu/baRcodeR/commit/f6678cf57fdd6963035b3c3b0f4b28795cb7ac19)

In code_128_make() you used T and F replace them for TRUE and FALSE.

  1. Replaced T and F with TRUE and FALSE.

In hidden_createPDF.R you check a class with class(), it is better to use the is() function as it will show all the classes it inherits from.

  1. With the class check, I only want the first "class" value to determine if the input is a vector or data.frame object and having the test fail if it is neither. It's a simple and quick check for incorrect input formats. Having all the classes of an object (as returned by the is.X function) would make it a less efficient test since I would have to check each class of an object against a list.

There are very long functions (make_labels_internals, custom_create_PDF and server that are more than 100 lines) try splitting them in shorter functions. Usually functions of less of 50 lines are recommended.

  1. make_labels_internals is a long function because it contains the user interface. This includes elements for all three tabs as well as the shiny code needed to perform the required processes such as the preview image. custom_create_PDF is also long because more than half of the function is text for interactive user prompts and input checking. While I managed to split the actual generation of the barcodes into their own functions, the layout uses loops to generate a new page in the pdf file.

There are also very long lines, consider shorter lines; 253 lines (15%) are > 80 characters long.

  1. Very long lines have been modified to become shorter lines for the three functions mentioned above (under 120) except for the warning and stop lines where new lines within the messages would break up the message printed for the user (https://github.com/yihanwu/baRcodeR/commit/f6678cf57fdd6963035b3c3b0f4b28795cb7ac19#diff-53e56b637346c7419dd7637bdd7c8c07).

The create_PDF and the custom_create_PDF do the same.

I would delete create_PDF. I understand that the idea was to avoid confusing the user with too many arguments. But either do not show them or let them trust you with the defaults, which work reasonable well.

  1. We understand the logic here but since the package is already published on CRAN (and downloaded > 7,000 times), we think it would cause more confusion to change the function names at this point.

You make use of grDevices:: and other packages but don't declare on the NAMESPACE that you import them. You should do so with #' @importFrom grDevices pdf for instance.

  1. The use of functions not declared in the NAMESPACE has been addressed above . (Response 12)

Check the testing

The file uniqID_maker_addin.R is not tested.

  1. uniqID_maker_addin.R is checked by shinytest, which does not integrate with covr and so it does not show up even though it is tested.

I don't know how can you test when you interactively ask the user with prompts but that section should be tested too (hidden_createPDF.R).

  1. R does not have a framework for testing prompts and at this point, the best option seems to be manual testing of the prompts.

Conclusions

Overall the package has its audience it is easy to use and provides a nice functionality for wet-dry researchers.

raynamharris review

Package Review

baRcodeR is an R package that reproducibly creates sample labels. It can be run from the RStudio console or from a GUI Addin. This is the first time I have ever used a GUI or Addin from within RStudio, and it worked very well after restarting R twice. I would recommend this package to my colleagues who do molecular work. Below is a list of minor comments and concerns.

In the README, there is a "See also:" link to a similar package called zintr, but it is unclear why a user might want to refer to it or why they should chose one over the other or both. A brief elaboration would be helpful.

  1. Added reasoning on why to use baRcodeR rather than zintr. (https://github.com/yihanwu/baRcodeR/commit/74fafac033574fdf5ba142f0580b6caa1c290865)

Also, it would be useful to state that Zint is a barcode generator and library.

  1. Done (https://github.com/yihanwu/baRcodeR/commit/74fafac033574fdf5ba142f0580b6caa1c290865)

I really like the graphical overview and the cheatsheet. These are both very useful. I also like that your cheatsheet is archived in FigShare with a DOI. Does this default layout only work with Uline products or is there an Avery alternative?

  1. A line has been added in the readme to highlight that different layouts can be used. (https://github.com/yihanwu/baRcodeR/commit/74fafac033574fdf5ba142f0580b6caa1c290865).

In order for the baRcodeR GUI to appear in my Addins window, I had to restart RStudio. Perhaps add a note that you made need to restart R after successful installation of the package.

  1. A note has been added to the installation. (https://github.com/yihanwu/baRcodeR/commit/74fafac033574fdf5ba142f0580b6caa1c290865)

The first time I clicked on baRcodeR GUI my R session was aborted. It worked the second time, so maybe add a comment to restart and retry. Also, this is the message that accompanied the first, unsuccessful launch of the GUI and the aborted R session.

baRcodeR:::make_labels() Registered S3 method overwritten by 'R.oo': method from
throw.default R.methodsS3 Loading required package: shiny The Simple ID Code maker works great!

When I click "Create Lable.csv" it appeared like nothing happened, but the file was indeed saved in my working directory. A real bonus would be a pop-up window or a message printed to the console that said something like "file save in your working directory", but a simple fix would be to clarify this in the README.

  1. Added line indicating that file is created in working directory (https://github.com/yihanwu/baRcodeR/commit/113af14284e283a40e5c9d8f83535c2d0236cdfa)

When following the instructions in the "Using the RStudio addin" section, the text explaining the process comes after the example image, much like a figure legend. However, it would be much easier to follow along if the text was above the image instead of below it. Also, there is very little white space between the images and the text. Some level three headers (eg. Simple ID Codes with uniqID_maker and Hierarcical ID codes with uniqID_hier_maker could help distinguish the sections.)

  1. Text descriptions in the vignette are above the image and more headers added (https://github.com/yihanwu/baRcodeR/commit/74fafac033574fdf5ba142f0580b6caa1c290865)

When I click "add level", I keep getting the error message Label Preview Input list has only one level. Did you forget a level or are you sure you are not looking for uniqIDMaker()? I was, however, successfully able to make a hierarchical labels despite the error message. Is there a way to prevent this error message from appearing?

  1. As referred to in the above review, this has been fixed so the error does not appear.

Why is the default font size 2.2 in the GUI? This seems a little small. Also, in the manual, the default size is 5.

  1. This has been changed for both to 12pt (https://github.com/yihanwu/baRcodeR/commit/188d8d2594bc09be7034c72a2a8092cc31ed315a)

I installed the package with devtools::install_github("yihanwu/baRcodeR", build_vignettes = T, dependencies = T) when when I run 1vignette("Using-baRcodeR"), I get the warning message vignette ‘Using-baRcodeR’ not found.

  1. From this issue, apparently, the right command is install_github("yihanwu/baRcodeR", build_opts = c("--no-resave-data", "--no-manual"), build_vignettes = TRUE) for windows. I've added this to the readme. (https://github.com/yihanwu/baRcodeR/commit/74fafac033574fdf5ba142f0580b6caa1c290865)

I was able to past the code from the GUI into the command line and regenerate the same pdf. Success!

  1. Great!

I'm not sure I understand the error correction description. The L,M,Q,H makes me think it goes form low quality to high quality, but this is conflicts with the 7%-30% damage description. What does damage mean? What do LMQ and H stand for? Please clarify.

  1. The correction levels refer to the level of damage a label can sustain and still be able to scan correctly. The levels are Low, Medium, Quantile, and High. So a High label may lose up to 30% of the code but still scan properly while a L label can only tolerate at most losing 7% of the qrcode before it fails to read. As a result, L labels can be printed at smaller sizes compared to H labels. I've added an explanation of the correction levels to the documentation (https://github.com/yihanwu/baRcodeR/commit/e4ab7c94c9ff3af5c6dac4d95b559d2affe87f1d)

I noticed that this repo has been forked from another repo (nine2k/II-I-II-baRcodes-I-II-III)? Do those authors know that this is being submitted as a package? Will they be offered co-authorship?

  1. The original idea and basic code was written by Robert Colautti. nine2k was hired as an intern in his lab and uploaded the code to GitHub before working on the code for the user prompts. The author does know that the code was later modified and turned into an R package. We added her as a contributor. (https://github.com/yihanwu/baRcodeR/commit/5d73169e6c034831aec1884889f40e280ff3dbce)
sckott commented 4 years ago

thanks for your detailed response @yihanwu

@llrs and @raynamharris are you happy with the responses/changes made? any further comments?

llrs commented 4 years ago

Great! Many thanks @yihanwu for answering all the points. The package has really improved with all these commits.

  1. & 2. Great!

  2. I'm not sure that is the solution. Surely this makes that the warnings do not appear but it doesn't solve the issue.

  3. Oh, it makes sense. Thanks for the links.

  4. & 44. Here is an example on how to test use input from the command line.

7 & other points. Great, sorry to be so repetitive.

  1. I'm not sure I understood your response. But given that it works and that it is accepted by CRAN I won't check it further.

9 & 32. I can't say for sure, but it is strange that goodpractice::gp couldn't check a loop. I think this indicates some concerns with the package.

  1. Great!

  2. Ok, the second version of the book, in the Imports section, recommends your method approach too.

  3. The normal approach is to use auth and cre. So if Robert Colautti is the author then auth and you use then cre.

  4. Nice picture, now I get much better an idea of what I get.

  5. In the new vignette there is some configuration code that it is shown.

  6. Thanks

  7. In general it is recommended to avoid loops. An example on what kind of setup would be impossible/hard to do without this loop would convince me.

  8. Yes, I understand your point but I think that if a users needs to use the GUI it can be all done in a single step, both save the labels.csv and create the labels.

  9. In general more concise functions help to maintain the code.

  10. One of the few lines I saw can be further simplified using is.numeric:

any(
    unlist(
      lapply(c(
        numcol, numrow, 
        Fsz, ERows, ECols, 
        trunc, page_width, page_height, 
        height_margin, width_margin, 
        x_space, y_space), class)
      ) != "numeric") == TRUE) 

for

all(vapply(c(
        numcol, numrow, 
        Fsz, ERows, ECols, 
        trunc, page_width, page_height, 
        height_margin, width_margin, 
        x_space, y_space), is.numeric, logical(1L))

Note that you don't need to first check for the class then check if the class is the one you need and then compare if any is true. For this short loops it might not matter but for other things it can help you to speed up the process.

I also think that some warnings can be split in two sentences. One the consequences and another for the advice. For instance, "Linear barcodes created will have bar width smaller than 0.03 inches which may be unreadable by some barcode scanners." can be plitted into "Linear barcodes created will have bar width smaller than 0.03 inches.","\nIncrease the label_width to make them readable."

yihanwu commented 4 years ago

Hi @llrs,

I just wanted to get clarifications on some of the points before I start making the changes. These are numbered based on your comments.

  1. In the example referenced here, the solution is to set a connection and read in a test response for a function with one user prompt. I think it is possible to get this to work with something like this ...
library(testthat)

options(mypkg.connection = stdin())

marryme <- function() {
  cat("will you marry me? (y/n) > ")
  ans <- readLines(con = getOption("mypkg.connection"), n = 1)
  cat("\n")
  cat("will this work? (y/n) >")
  ans1 <- readLines(con = getOption("mypkg.connection"), n = 1)
  cat("\n")
  cat("will this work x2? (y/n) >")
  ans2 <- readLines(con = getOption("mypkg.connection"), n = 1)
  cat("\n")
  cat("will this work x3? (y/n) >")
  ans3 <- readLines(con = getOption("mypkg.connection"), n = 1)
  cat("\n")
  return(ans3=="n")
}

test_that("input", {

  f <- file()
  options(mypkg.connection = f)
 # two sets of four answers
  ans <- paste(c("n", "y", "y", "n", "n", "y", "y", "y"), collapse = "\n") 
  write(ans, f)

  expect_true(marryme()) 
  expect_true(marryme())  

  options(mypkg.connection = stdin())

  close(f)
})

This is enormously helpful if I can get it to work with all the possible test cases.

  1. For the large loop, here it is in basic steps. I've struggled with trying to vectorize this so any suggestions will be greatly appreciated.

I'm not sure how I can open a new page when needed without having the information from the previous loop regarding where the next barcode needs to go.

  1. How about both "Create Labels" and "Create Labels + PDF" buttons so the user can have a choice of either and go to the PDF creation tab if they need customized PDFs.
llrs commented 4 years ago

Hi @yihanwu

  1. Glad it helped. Sorry I can't help you more on this, but I don't have experience on testing input from command line.

  2. The positions on a page are fixed (unless the grid changes in each page, which doesn't seem the case). So, once you know them for a page you can calculate how many pages you need and assign the barcodes to each position and page. If you have each x, y and page position on a data.frame in a single loop you can calculate the image barcode from the labels and use the position to generate the pdf.

This is not needed to be accepted but I think it can be helpful if the users needs to print several hundreds of barcodes (or for future tasks you start).

  1. Yes, that will be helpful. Now that I saw your idea, I realized that my confusion was with the text on the buttons. To me the labels are on the pdf to be printed and used while the buttons considered labels as the matrix with the text.
yihanwu commented 4 years ago

More updates:

  1. I've created tests for user prompts by using several internal functions. (https://github.com/yihanwu/baRcodeR/commit/3abede8d5cfba70c7cf58d24a77a3408ff5257e5). The internal functions allow for several tries at putting in the correct response, and also a choice selection similar to the menu() function (unfortunately menu() did not seem to work for testing)

  2. The loop problem with goodpractice disappeared after I simplified the user prompts.

  3. I've added to the note in the add-in directing the user to head to the barcode creation tab. Unfortunately, when I tried out the extra button to "create labels and PDF", it appeared even more confusing. (https://github.com/yihanwu/baRcodeR/commit/ea73d05d733c227c2d3629f65bccc4f70ad5a6a0)

  4. PDF generation now generates the locations of each label first, and the loop is simpified for plotting with grid. (https://github.com/yihanwu/baRcodeR/commit/fdabc9f95d54abbc9275ac2c8ad1d12347825048)

  5. The description has been changed so the maintainer field is automatically generated. (https://github.com/yihanwu/baRcodeR/commit/fdabc9f95d54abbc9275ac2c8ad1d12347825048)

llrs commented 4 years ago

Great, many thanks

yihanwu commented 4 years ago

Hi @sckott, @llrs , @raynamharris,

Are there any other changes I should review and make?

Cheers, Yihan

llrs commented 4 years ago

Hi @yihanwu On my side I have marked the "Final approval" 19 days ago. I was waiting the editorial decision from @sckott.

sckott commented 4 years ago

Sorry about the delay - doing a final check now

sckott commented 4 years ago

Approved! Thanks @yihanwu for submitting and @llrs and @raynamharris for your reviews!

A few comments:

To-dos:

For MEE:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

We've put together an online book with our best practice and tips, this chapter starts the 3d section that's about guidance for after onboarding. Please tell us what could be improved, the corresponding repo is here.

yihanwu commented 4 years ago

Hi @sckott,

I've made the changes you have suggested but when I go to transfer the repository, Github tells me I "do not have permission to create public repositories". Is there something else I need to do first?

sckott commented 4 years ago

you should have got an email invitation to a github team within ropensci, did you not get that?

yihanwu commented 4 years ago

Ah, I found it. Completely my fault.

The repo transferred successfully. I've added the docs url pointing to docs.ropensci and removed the automatic deployment from Travis to gh-pages.

I tried to access the generated docs at https://docs.ropensci.org/baRcodeR. Is there a delay until they should be up?

sckott commented 4 years ago

there was a problem with our registry builder, it should be done soonish, within next day for sure

sckott commented 4 years ago

its up now https://docs.ropensci.org/baRcodeR/