Hi @KaiAragaki - Thank you so much for submitting this inquiry. We especially appreciate your detailed comparison to the existing plater package and the comparison to gp. I was wondering if you could please elaborate on just one point there:
Could you please comment on any interoperability that you considered between gp and plater? Are there ways these packages can work together to make use of any existing plater functionality?
For example (and only for example! I have not used plater, so these are hypotheticals and not suggestions), could your code-based process for reading data create the same type of object as plater if desired? Is it possible to cast between the types of structures gp and plater create if desired? Could the gp visualization functions be extended such that they could also work on plater objects?
I understand your point that the tidying/plotting are tightly coupled in some ways, but would just appreciate your comment on this. Thank you!
Hi @emilyriederer! Thank you for taking the time to review my presubmission.
While there currently isn't any built-in interoperability, that would be a fairly simple addition (at least, by my naive prediction).
plater does not create specific objects, but ingests .csv files formatted in a particular way to produce tidy, annotated tibbles. gp does not read in files, but takes data.frame-like objects and tidies and annotates them. I've included a true masterpiece of an ASCII illustration below to show you the conversions that are currently done (solid-ish lines) and those that could be introduced to increase interoperability (dashed lines):
plater
annotated .csv --------> annotated tidy tibble <- - - - - - - - - - - - - - - - - - ,
^- - - - - - - - - - - - - - - - - - - - - - - - - - - - , :
: :
base, readr, readxl... gp v gp v
.csv/.tsv/.xls... -----------------------> data.frame --------> gp <------> annotated tidy tibble
tidy data.frame ----------^
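For readers unfamiliar with the plate-to-tidy step in the diagram, here is a minimal base R sketch of the general idea. It deliberately uses neither the plater nor the gp API; the object names (`plate`, `tidy`), the readings, and the "blank"/"sample" annotation are all hypothetical and chosen only for illustration.

```r
# Hypothetical 2 x 3 plate of readings, laid out as on the bench
# (rows A-B, columns 1-3). The values are made up for illustration.
plate <- matrix(
  c(0.10, 0.20, 0.30,
    0.40, 0.50, 0.60),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("A", "B"), c("1", "2", "3"))
)

# Reshape to one row per well (the "tidy" form).
# as.vector() walks the matrix column by column, so the row labels
# cycle fastest and each column index is repeated once per row.
tidy <- data.frame(
  row   = rep(rownames(plate), times = ncol(plate)),
  col   = rep(as.integer(colnames(plate)), each = nrow(plate)),
  value = as.vector(plate)
)

# Annotate wells by position: here, column 1 is assumed to hold blanks.
tidy$condition <- ifelse(tidy$col == 1, "blank", "sample")

tidy
```

Both packages arrive at a table of this general shape (one row per well, plus annotation columns); as described above, they differ in whether the annotation comes from a formatted .csv or from code.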
The tidying/plotting coupling comes from the way that gp objects are 'built up', much like a SQL query is built. Plotting gives you an idea of what each additional layer of annotation you add looks like. Once you've built up all the levels of annotation for the plate, you switch the pipe output from gp_plot to gp_serve. Because of this, by plotting the data you tidy it at the same time. I think this vignette helps explain it in pictures a bit better than I did here (and thankfully they are not my ASCII pictures), but that's the gist of it.
Did this answer your questions? Sorry if this confused things further.
Thank you again for your time!
Hi @KaiAragaki - thank you for the additional information! I really appreciate the thoughtful reply. I am pulling in some editors from our team with more experience in this domain to discuss and will follow up soon.
Hi @KaiAragaki - thank you for your patience. After discussing with the editorial board, we have judged your package to be in-scope and would like to invite you to make a full submission. I'll close this presubmission issue, and please go ahead and open a new issue with the full submission.
Some notes of interest from our discussion:
field and laboratory reproducibility tools
category (which we realize is missing from the template)
Thank you and we look forward to seeing your submission!
Submitting Author Name: Kai Aragaki
Submitting Author Github Handle: @KaiAragaki
Repository: https://github.com/KaiAragaki/gp
Submission type: Pre-submission
Language: en
Scope
Please indicate which category or categories from our package fit policies or statistical package categories this package falls under. (Please check an appropriate box below):
Data Lifecycle Packages
[ ] data retrieval
[ ] data extraction
[ ] database access
[x] data munging
[ ] data deposition
[ ] data validation and testing
[ ] workflow automation
[ ] version control
[ ] scientific software wrappers
[ ] database software bindings
[ ] geospatial data
[ ] text data
Statistical Packages
[ ] Bayesian and Monte Carlo Routines
[ ] Dimensionality Reduction, Clustering, and Unsupervised Learning
[ ] Machine Learning
[ ] Regression and Supervised Learning
[ ] Exploratory Data Analysis (EDA) and Summary Statistics
[ ] Spatial Analyses
[ ] Time Series Analyses
Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
The package is largely involved in converting plate-format data into tidy data (similar but not identical to the excellent plater - see below).
NA
Bench scientists who generate plate data, developers who ingest plate data, and people who need to illustrate plate layouts (likely for protocols or apps).
The wonderful plater package accomplishes a similar goal, but I don't believe it has the same scope that this package does. The similarity between the two packages is the main reason why I have submitted this inquiry. plater is a wonderful interface for tidying microwell data by laying out experimental design in a spreadsheet-like manner. gp does this, but by writing code instead of using a spreadsheet, reducing the number of undocumented steps. It also provides handy tools for plotting plate layouts - indeed, plotting and tidying go a bit hand-in-hand in gp. Two vignettes on the pkgdown website demonstrate its flexibility and how it is used to tidy data.
NA
My primary concern is whether this package is so similar to the plater package that it would be barred from receiving further review. Regardless, thank you very much for your time!