Closed editorialbot closed 3 months ago
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1210/endocr/bqaa114 is OK
- 10.1371/journal.pone.0012418 is OK
MISSING DOIs
- None
INVALID DOIs
- None
Software report:
github.com/AlDanial/cloc v 1.90 T=0.01 s (1908.7 files/s, 85596.5 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
R 20 65 400 374
Rmd 1 30 79 89
Markdown 3 26 0 52
TeX 1 1 0 27
YAML 1 1 4 18
-------------------------------------------------------------------------------
SUM: 26 123 483 560
-------------------------------------------------------------------------------
Commit count by author:
40 sportiellomike
17 Mike Sportiello
Paper file info:
π Wordcount for paper.md
is 738
β
The paper includes a Statement of need
section
License info:
β
License found: MIT License
(Valid open source OSI approved license)
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
ππΌ @sportiellomike, @yjunechoe, @jsun - this is the review thread for the submission. All of our communications will happen here from now on.
As a reviewer, the first step is to create a checklist for your review by entering
@editorialbot generate my checklist
as the top of a new comment in this thread. These checklists contain the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. The first comment in this thread also contains links to the JOSS reviewer guidelines.
The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues directly in the software repository. If you do so, please mention this thread so that a link is created (and I can keep an eye on what is happening). Please also feel free to comment and ask questions in this thread. It is often easier to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.
We aim for reviews to be completed within about 2-4 weeks. Please let me know if any of you require some more time. We can also use EditorialBot (our bot) to set automatic reminders if you know you'll be away for a known period of time.
Please feel free to ping me (@csoneson) if you have any questions or concerns. Thanks!
ππ» Just wanted to check in to see that things were going well here and that there are no questions about the process (@jsun - I see that you haven't yet created your checklist, let me know if you have any questions!) Thanks all!
Hi @csoneson - I'm returning to this review after my first pass at the checklist (I have been awaiting a response from the author about automated tests before I proceed further)
I took some time to read the contents of this package more closely, and I have some serious concerns about the Substantial scholarly effort criterion.
Inside the package I count 12 functions - a combined LOC of 273 lines and an average of 23 lines for each function (LOC range: 4-45). Looking at internal function calls as a proxy for complexity, each function in the package is composed of an average 13 calls to other functions, mostly c()
, return()
, and print()
. See below for the code involved in this analysis.
I'm afraid there's not much for me to work with here to evaluate the package. As per the review guidelines, I would like to flag the submission as questionable scope before proceeding further.
Hi @sportiellomike - thanks for the quick response, and I apologize for not being clear on the issue I opened on automated tests
To your points - I understand your perspective on the usefulness of the tool (albeit I've not fully evaluated this formally). I only mean to comment on the structure and contents of the package as an open source software developed for general consumption, according to the scope and standards of JOSS specifically. To clarify, I'm following the practice of previous reviewers who have raised issues of LOC and complexity as part of evaluating the substantial scholarly effort criterion (see similar issues).
I'm happy to complete the review and help with the review process up to a successful publication, but substantial scholarly effort is a rather tricky matter to resolve, especially if the package is already considered to be at a feature-complete state that addresses its goals. I only seek clarification from the editorial team - I apologize again for the interruption.
@yjunechoe Thanks for voicing your concern. Let me bring this up with the broader editor team and get back to you.
@editorialbot query scope
Submission flagged for editorial review.
Sorry for the delay. Ill finish my review for next week.
@csoneson: the reviewer @yjunechoe encouraged me to comment here regarding the "substantial scholarly effort" guideline in the other thread.
While I understand it to be the case JOSS wants longer submissions, I wanted to be clear that this project was created for the purpose of solving specific problems: the unnecessary death of experimental animals by requiring fewer crosses and therefore animals used, as well as increasing the speed of which experimentalists can obtain their desired animal of correct genotype. As a result, i believe this submission to be a substantial contribution as it has achieved those goals. It is true it is a smaller package, but it still is a large contribution to the field of experimental animal morbidity/mortality reduction, of which there is minimal resources, and no software resources that are free and open source to my knowledge. I am glad I was able to achieve these goals in a moderate number of lines of code instead of a large number as it is more efficient, though I freely admit this package is smaller than most published in JOSS. I hope you agree.
This package contains simple but important features to assist wet researchers. Therefore, I think, the main target users of this package would be wet researchers who are unfamiliar with coding programmes. Therefore, I recommend the author to create easy-to-use functions and clear documentation (vignette). The following comments are based on a check of this contribution.
package
summarizepotentialpups
to summarize_potential_pups
or summarizePotentialPups
). Alternatively, please consider using the S4 class to develop this package. In this case, instead of using summarizepotentialpups
, a simple summary
function can be defined.plot
function to reduce end-user costs.canwegetalltheallelesfromonecross
and whichpairsshouldibreed
functions. These outputs are not R-like.vignette
plot<-ggplot(subsetmelt)+geom_col(aes(y=percent,x=genotype,fill=genotype))+facet_wrap(~gene)+
kind of code. The plot
is the famous generic function, please consider using other variable name to assign the ggplot object.whichpairstobreed<-whichpairsshouldibreed(x=pointsperpupoutput,desiredvector = desiredvec)
; there is no spaces between =
and other contains the spaces.compilegametesoutput
to compilegametes_output
)@csoneson: the reviewer @yjunechoe encouraged me to comment here regarding the "substantial scholarly effort" guideline in the other thread.
Thanks @sportiellomike - the editorial team is discussing the submission and we should have an answer for you soon! Thanks for your patience.
Thank you for your patience, and thanks again @yjunechoe for raising the point about the scope. The editorial team has carefully reviewed the submission and the discussion, and we felt that although it is close to borderline, we would consider it in scope, and it seems to provide useful functionality for the target audience. Thus, I would encourage the reviewers to continue their evaluation of the submission. Thanks!
@csoneson Thanks for the clarification!
@sportiellomike Apologies again for the interruption. I'll complete the rest of my review swiftly.
I expect some time with back-and-forths on the package repository to get the submission into a good shape, and I'll keep raising specific points as GH issues, but in the meantime here are my general thoughts on the package:
First off I agree with all points that the other reviewer has mentioned above. I'd especially like to echo the point about function/variable names being difficult to parse, the general inconsistencies in coding style (use of spaces and indentation, use of semicolon, etc.), and the fact that the return value of some functions are not R-like. My one point of digression from the other reviewer is that for a package of this size, I don't think a S4 class is necessary, but I do agree that some kind of a custom print method is appropriate (perhaps the lighter-weight S3).
And two big-picture comments/questions of my own:
There are several cases of non-idiomatic R code in the package. The other reviewer has already given an example of one: the canwegetalltheallelesfromonecross()
function returns a string that's either "notonecrossforonecopy"
or "onecrossforonecopy"
, which is very difficult to program over. It'd be more appropriate to have the function be called something like is_onecross()
that returns TRUE
or FALSE
.
A few other examples to this larger point about non-idiomatic R code:
Inside fertilize()
, I see:
fertilizeoutput<<-listofzygotes
With no local binding for the variable fertilizeoutput
. This inevitably assigns into the user environment - such leaky behaviors (and side-effect functions in general) are not R-like and highly discouraged for a function in a package. This also happens in spermandeggs()
Also inside fertilize()
and elsewhere, I see:
which(spermcolnumerics == T)
This should simply be which(spermcolnumerics)
- equality checks to a scalar boolean are redundant and can introduce silent type coercion bugs, since ==
is loose equality and R doesn't have vectorized strict equality. This comment also applies to use of isTRUE()
in a few if
statements elsewhere.
Additionally, in addpoints()
and elsewhere, I see:
pointsvector<-c()
for (k in 1:length(desiredvector)) {
But growing a vector is costly in R. The appropriate pattern is to either pre-allocate the vector and fill it with values, or *apply()
over the iterator.
Lastly to keep this short, in pointsperpup()
and elsewhere, I see:
genecolx[genecolx=='homopos']<-2
genecolx[genecolx=='het']<-1
genecolx[genecolx=='homoneg']<-0
Such incremental replacement of values are odd when the type doesn't match. The idiomatic R approach would be vectorized subsetting of a named vector, like:
genotypes <- setNames(0:2, c("homoneg", "het", "homopos"))
genotypes[genecolx]
I want to note that the above code chunk of applying number coding to genotypes has been copy-pasted across several functions in the package - this is undesirable for maintenance. To this point, I also notice that the package has zero internal utility functions (all functions are exported), which is uncharacteristic of a package.
These are perhaps minor points and there's quite some way to go on this front, but I would like to see these issues resolved for a publication-quality software
I'm a little puzzled at the level of R experience that is expected of the average user of this tool. Currently, I sense bit of a tension in what is assumed of the audience: the audience is assumed to be familiar enough with R and coding to be able to build/parse the vignettes and understand the package from reading and running the examples themselves, but not experienced enough such that the package exports code like this that simply multiplies the length of a vector by 100, obscured under a function.
possiblepointsfromdesiredoutcome<-function(desiredvector = desiredvec){
possiblepoints<-length(desiredvector)*100
return(possiblepoints)
}
Overall, there is not enough hand-holding for a novice R user, but enough non-idiomatic R quirks to confuse a more experienced R user. It would be beneficial for design of the package (and also for the paper) to clearly state the level of R knowledge assumed/required of the user of this package, and cater specifically to that audience (and broaden the audience once the package meets the need of its core users).
I get the sense that it's the former (package is for novice programmers), and I actually wonder in that case if something more like a point-and-click interface is more appropriate (like a shiny app complement to the software). I encourage the author to at least look into this somewhat more obvious method of supporting novice users and comment on the choice to stick to coding in R as the primary interface to this tool.
I am working on these issues and will tag you when resolved. As you mentioned, this will take some time. I appreciate your patience. I'd like to thank the reviewers for their thorough comments, which I know will make this software better and more usable.
Thank you again.
Hi @sportiellomike - could you give us a short update on how things are progressing? Just want to make sure that submissions are actively worked on and not stalled. Let us know if you have questions about any of the comments. Thanks!
Thanks for checking in! Yes I am working and have plans to give you an update by one week from today having finished the edits (or at least a progress update with a timeline).
I appreciate the editors and reviewers in this process. Thank you.
Thank you to all involved for all your help. I'll get to it.
@jsun's comments
I have changed all the names as requested. I agree this is much better now (more readable, more sensible, more in line with standards). Thank you.
Visualization I actually removed before submission, and only added it back in in case reviewers asked for it. I really don't think it adds much to the project.
I changed the output formats as requested and have altered the functions slightly so they only output desired text. While they still out put text to the user, I can't think of another way that the user could get this information. To me, the best output is direct text informing user of the advice of the software, which is the primary function of the software.
Totally agree that I needed to explain how to obtain vignettes. This has been completed.
I have added abstract as well
I have changed the size of the figures.
I have changed the "plot" to a different variable name as requested
I have fixed spacing
I have changed readability of output variables as requested
Thank you so much for all your help
@yjunechoe
I have corrected the canwegetalltheallelesfromonecross()
function using the invisible function appropriately as suggested in by other users that had similar situations.
I have removed the unnecessary <<- throughout the package. However, given the linear nature of this software, I elected to keep the sperm and egg one instead of assigning the two outputs for male and female to user supplied variables as I think there is a high likelihood of confusion by the user of which is egg and which is sperm. In order to make that foolproof, I kept the <<- (but removed the other case of it as I totally agree with you it was non-optimal)
With regards to which(spermcolnumerics == T)
, when I correct it to which(spermcolnumerics)
, the whole thing breaks and the vignette can no longer build. I have changed it back to the original as a result.
I'm confused why my assigning of 0, 1, and 2 is non-idiomatic. The function works appropriately and I think is more understandable to the user of what I'm doing (assigning the number of alleles directly over the text based ("homoneg", "het", "homopos").
The function is designed for a step above novice, let's say intermediate. I have attempted to address some of the difficulties of user experience in the readme and vignette as suggested by both of you reviewers.
Thank you again for both reviewers and the editing team for your knowledge, advice, and assistance through this process.
And just to be explicit as I did not mention this above @jsun @yjunechoe: I am making a shiny app as you requested as well. Please let me know where to store this (within this same github?).
@sportiellomike Thank you for the updates, here and elsewhere, and the hard work that went into them. I'm satisfied with your edits and/or justifications to the points I raised in my previous comment. Should they raise a separate problem down the line I will raise them again, but we can move forward on the other points (the ones I've opened as issues on the package repo).
I'm also happy to hear the addition of a shiny app - this will make the submission much stronger. The app should live inside /inst/<APP NAME>/
, and a function should be added/exported to allow users to run the app through a function call. Please feel free to ping me on your repo in a new issue for the shiny app, once you have a minimally working app and have put it under /inst
@yjunechoe shiny app is live. if you choose to try it, I put another example datasets up there to test it out that are less complex than original (edit_this_CSV_with_breeder_mice2.csv).
I included "visualization" of uploaded csv for user to check to make sure it looks right, "visualization" of desired genotype that the user can manually type in to make sure it looks correct, the advice the program gives to the user, a couple relevant tables for potential pups/summary per gene and cross, and a download button for the user to download potential pup table.
Let me know if I misunderstood your instructions on where to put it.
@sportiellomike thank you for considering my comments. Here is my other suggestion (just suggestions), you don't need to change your package as in this suggestion.
In the case of the can_we_get_all_the_alleles_from_one_cross
function, the function name is "can we get ... ", like a question , so it is just Yes or No as an answer. Therefore, I think TRUE
or FALSE
would be appropriate as the return value of this function. Or, change the function name to get_all_alleles
. This function can return the number of crossing experiments needed to get all alleles as an integer.
For which_pairs_should_i_breed
function, the return may organized as a list or slots, for example:
which_pairs_should_i_breed <- function() {
...
output <- list(recommend_1 = NULL, recommend_2 = NULL, prob=NA, all_alleles_with_one_corss=NA)
output$recommend_1 <- data.frame(
mom = c('mouse1', 'mouse1', 'mouse1', 'mouse1', 'mouse2', ...),
dad = c('mouse5', 'mouse6', 'mouse7', 'mouse8', 'mouse5', ...)
)
output$recommend_2 <- data.frame(
mom = c('mouse1', 'mouse1', 'mouse1', 'mouse1', 'mouse2', ...),
dad = c('mouse5', 'mouse6', 'mouse7', 'mouse8', 'mouse5', ...)
)
output$prob <- 0.25
output$all_alleles_with_one_corss <- FALSE
output
}
For some outputs, I can only see momdad
column in the data.frame. If it possible, use mom
and dad
columns instead of using momdad
column (of course, you can still remain the momdad
column). Adding mom
and dad
columns would facilitate secondary use of the outputs.
@jsun I fixed the mom and dad columns in the shiny as you recommended. Thanks!
I elected to defer your suggestion for the output lists not because I think it wouldn't work, just because several other functions depend on the output and would involve changing a long list of functions in the package. I will take the themes of your suggestion into account if I develop a version 2.0 and during future package development.
@jsun @yjunechoe
I believe I have addressed all of your necessary proposed changes, considered and changed package in light of most of your optional suggestions, and built a shiny for use as well. Please let me know if there is anything else I can do for you. Thank you again for reviewing this software and thank you @csoneson for editing.
Thanks, I see the shiny app. Could you please also create a vignette of instructions for how to use the shiny app? This can be a video walkthrough or a document of screenshots that a novice user with little programming experience (the main audience for the shiny app) can follow along.
I will also just note that in so far as app.R
already exists in /inst
, there's no need to copy paste the app code into shiny_mousebreedeR()
. That user-facing function can just be:
shiny_mousebreedeR <- function() {
appfile <- system.file("app.R", package = "mousebreedeR")
shiny::runApp(appfile)
}
Please let me know once instructions for using the shiny app are available so I can go ahead and test its features.
hi, @csoneson , the checklist is down.
Thank you @jsun!
Thank you @jsun. @yjunechoe all set with your requests as well.
Thanks - the package seems to be at a state where it meets all the minimum criteria for the checklist items. @csoneson all items have been checked off.
@sportiellomike In the meantime, please continue working to improve the results of devtools::check()
, ideally to a point where no errors and warnings are detected. I personally haven't found any of them to be critical for installation on my end, but errors and warnings are generally indicative of a package's fragility across different user environments (for example, one such error suggests that you assume users to have {DT}
installed due to the shiny app having a dependency on it - this should be declared in Suggests and dependency requirement should be checked accordingly when the shiny app is called). I recommend consulting the Writing R Extensions manual and the R Packages book as you work through this.
@jsun, @yjunechoe - thank you very much for your thorough reviews of this package!
@sportiellomike - as the reviewers have finished their checklists and are satisfied with the submission, I will also take a quick look through it - I'll get back to you shortly with the next steps.
@editorialbot generate pdf
@editorialbot check references
@editorialbot check repository
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1210/endocr/bqaa114 is OK
- 10.1371/journal.pone.0012418 is OK
MISSING DOIs
- None
INVALID DOIs
- None
Software report:
github.com/AlDanial/cloc v 1.90 T=0.02 s (2050.8 files/s, 121479.6 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
R 22 146 673 555
Markdown 5 84 0 215
Rmd 1 35 88 88
YAML 2 12 6 57
CSV 3 0 0 27
TeX 1 1 0 27
-------------------------------------------------------------------------------
SUM: 34 278 767 969
-------------------------------------------------------------------------------
Commit count by author:
90 sportiellomike
23 Mike Sportiello
Paper file info:
π Wordcount for paper.md
is 738
β
The paper includes a Statement of need
section
License info:
β
License found: MIT License
(Valid open source OSI approved license)
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@sportiellomike I have looked through the submission and opened a PR with some suggestions for the paper. Please have a look and merge if you agree. In addition, could you add a couple of sentences to make the Summary
section a bit more accessible for a non-specialist audience (who may not know what a 'floxed' gene is, for example)? Once this is done, you can as the editorial bot to generate a new proof by typing @editorialbot generate pdf
in a post.
@editorialbot generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
Submitting author: !--author-handle-->@sportiellomike<!--end-author-handle-- (Mike Sportiello) Repository: https://github.com/sportiellomike/mousebreedeR Branch with paper.md (empty if default branch): Version: v1.0.0 Editor: !--editor-->@csoneson<!--end-editor-- Reviewers: @yjunechoe, @jsun Archive: 10.5281/zenodo.11405147
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@yjunechoe & @jsun, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @csoneson know.
β¨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest β¨
Checklists
π Checklist for @yjunechoe
π Checklist for @jsun