ropensci / software-review

rOpenSci Software Peer Review.

Presubmission Inquiry: gp - A Grammar of Plates #539

Closed KaiAragaki closed 2 years ago

KaiAragaki commented 2 years ago

Submitting Author Name: Kai Aragaki
Submitting Author Github Handle: @KaiAragaki
Repository: https://github.com/KaiAragaki/gp
Submission type: Pre-submission
Language: en


Package: gp
Title: A Grammar of Plates
Version: 0.1.1
Authors@R: 
    person(given = "Kai",
           family = "Aragaki",
           role = c("aut", "cre"),
           email = "aaragak1@jhmi.edu",
           comment = c(ORCID = "0000-0002-9458-0426"))
Description: `gp` attempts to provide a succinct yet powerful grammar to describe common microwell layouts to aid in both plotting and tidying.
License: GPL (>= 3)
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
Imports: 
    cli,
    dplyr,
    ggplot2,
    glue,
    methods,
    purrr,
    rlang,
    stats,
    tibble,
    tidyr
Suggests: 
    rmarkdown,
    knitr,
    testthat (>= 3.0.0)
VignetteBuilder: knitr
Depends: 
    R (>= 4.1.0)
URL: https://kaiaragaki.github.io/gp/
Config/testthat/edition: 3

Scope

The package mainly converts plate-format data into tidy data (similar, but not identical, to the excellent plater; see below).

NA

Bench scientists who generate plate data, developers who ingest plate data, and people who need to illustrate plate layouts (likely for protocols or apps).

The wonderful plater package accomplishes a similar goal, but I don't believe it has the same scope that this package does. The similarity between the two packages is the main reason why I have submitted this inquiry.

plater is a wonderful interface for tidying microwell data by laying out the experimental design in a spreadsheet-like manner. gp does this too, but by writing code instead of using a spreadsheet, which reduces the number of undocumented steps. It also provides handy tools for plotting plate layouts; indeed, plotting and tidying go hand-in-hand in gp. Two vignettes on the pkgdown website demonstrate its flexibility and how it is used to tidy data.

NA

My primary concern is whether this package is so similar to the plater package that it would be barred from further review. Regardless, thank you very much for your time!

emilyriederer commented 2 years ago

Hi @KaiAragaki - Thank you so much for submitting this inquiry. We especially appreciate your detailed comparison of gp with the existing plater package. I was wondering if you could please elaborate on just one point there:

Could you please comment on any interoperability that you considered between gp and plater? Are there ways these packages can work together to make use of any existing plater functionality?

For example (and only for example! I have not used plater, so these are hypotheticals and not suggestions): could your code-based process for reading data create the same type of object as plater if desired? Is it possible to cast between the types of structures gp and plater create if desired? Could the gp visualization functions be extended such that they could also work on plater objects?

I understand your point that the tidying/plotting are tightly coupled in some ways, but would just appreciate your comment on this. Thank you!

KaiAragaki commented 2 years ago

Hi @emilyriederer! Thank you for taking the time to review my presubmission.

While there currently isn't any built-in interoperability, that would be a fairly simple addition (at least, by my naive prediction).

plater does not create specific objects, but ingests .csv files formatted in a particular way to produce tidy, annotated tibbles. gp does not read in files, but takes data.frame-like objects and tidies and annotates them. I've included a true masterpiece of an ASCII illustration below to show the conversions that are currently done (solid-ish lines) and those that could be introduced to increase interoperability (dashed lines).

                plater
annotated .csv --------> annotated tidy tibble <- - - - - - - - - - - - - - - - - - ,
       ^- - - - - - - - - - - - - - - - - - - - - - - - - - - - ,                   :
                                                                :                   :
                   base, readr, readxl...                gp     v     gp            v
.csv/.tsv/.xls... -----------------------> data.frame --------> gp <------> annotated tidy tibble
                                      tidy data.frame ----------^
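
As a rough illustration of what "tidying" means in the diagram above, here is a minimal base R sketch that reshapes a small plate-shaped matrix into one row per well. The 2 x 3 plate, its values, and the column names are made up for illustration, and the code uses neither plater nor gp.

```r
# Hypothetical 2 x 3 plate-shaped readout: rows A-B, columns 1-3.
plate <- matrix(
  c(0.11, 0.12, 0.13,
    0.21, 0.22, 0.23),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("A", "B"), c("1", "2", "3"))
)

# Tidy form: one row per well. as.vector() walks the matrix column by
# column, so row labels cycle fastest and column labels are repeated in blocks.
tidy <- data.frame(
  row   = rep(rownames(plate), times = ncol(plate)),
  col   = rep(as.integer(colnames(plate)), each = nrow(plate)),
  value = as.vector(plate)
)
tidy$well <- paste0(tidy$row, tidy$col)

print(tidy)
```

Annotation (treatment, replicate, and so on) would then be extra columns on this one-row-per-well table.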

The tidying/plotting coupling comes from the way that gp objects are 'built up', much like a SQL query is built. Plotting gives you an idea of what each additional layer of annotation you add looks like. Once you've built up all the levels of annotation for the plate, you switch the pipe output from gp_plot to gp_serve. Because of this, plotting the data and tidying it happen in the same pipeline. I think this vignette helps explain it in pictures a bit better than I did here (and thankfully they are not my ASCII pictures), but that's the gist of it.
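
To make the "built up in layers" idea concrete, here is a small base R sketch. The helpers `new_plate()`, `add_layer()`, and `show_layer()` are hypothetical names invented for this illustration and are NOT the gp API; they only mimic the pattern of adding annotation layers in a pipeline and then either inspecting a layer or taking the tidy result.

```r
# Hypothetical helpers, NOT the gp API: they only illustrate the layering idea.

# Start with an unannotated grid of wells (row index varies fastest).
new_plate <- function(nrow, ncol) {
  expand.grid(row = seq_len(nrow), col = seq_len(ncol))
}

# Add one layer of annotation: label a rectangular block of wells.
add_layer <- function(plate, name, rows, cols, label) {
  if (!name %in% names(plate)) plate[[name]] <- NA_character_
  inside <- plate$row %in% rows & plate$col %in% cols
  plate[[name]][inside] <- label
  plate
}

# Stand-in for a plotting step: print one layer laid out as a plate grid.
# Relies on the well order produced by new_plate() being unchanged.
show_layer <- function(plate, name) {
  print(matrix(plate[[name]], nrow = max(plate$row)))
  invisible(plate)
}

# Build up the annotation layer by layer with the native pipe (R >= 4.1.0).
plate <- new_plate(2, 3) |>
  add_layer("treatment", rows = 1:2, cols = 1:2, label = "drug") |>
  add_layer("treatment", rows = 1:2, cols = 3, label = "vehicle") |>
  add_layer("replicate", rows = 1, cols = 1:3, label = "rep1") |>
  add_layer("replicate", rows = 2, cols = 1:3, label = "rep2")

# Inspect a layer as a grid, or use `plate` directly as the tidy table.
show_layer(plate, "treatment")
```

In this sketch the object being built is already a one-row-per-well table, so ending the pipeline with an inspection step or with the table itself is just a choice of the final call.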

Did this answer your questions? Sorry if this confused things further.

Thank you again for your time!

emilyriederer commented 2 years ago

Hi @KaiAragaki - thank you for the additional information! I really appreciate the thoughtful reply. I am pulling in some editors from our team with more experience in this domain to discuss and will follow up soon.

emilyriederer commented 2 years ago

Hi @KaiAragaki - thank you for your patience. After discussing with the editorial board, we have judged your package to be in-scope and would like to invite you to make a full submission. I'll close this presubmission issue, and please go ahead and open a new issue with the full submission.

Some notes of interest from our discussion:

Thank you and we look forward to seeing your submission!