ropensci / software-review

rOpenSci Software Peer Review.
291 stars 104 forks source link

Submission: CytoRSuite - Flow Cytometry Analysis Toolkit #281

Closed DillonHammill closed 2 years ago

DillonHammill commented 5 years ago

Submitting Author Name: Dillon Hammill Submitting Author Github Handle: !--author1-->@DillonHammill<!--end-author1-- Repository: https://github.com/DillonHammill/CytoRSuite Version submitted: 0.9.9 Editor: !--editor-->@geanders<!--end-editor-- Reviewers: @jacobpwagner, @lmweber

Due date for @jacobpwagner: 2019-04-22 Due date for @lmweber: 2019-04-22

Archive: TBD
Version accepted: TBD


Package: CytoRSuite
Type: Package
Title: Compensation, Gating & Visualisation Toolkit for Analysis of Flow Cytometry Data
Version: 0.9.9
Authors@R: c(
  person("Dillon", "Hammill", role = c("aut", "cre"), email = "Dillon.Hammill@anu.edu.au"),
  person("Greg", "Finak", role = "ctb", email = "gfinak@fredhutch.org"),
  person("Mike", "Jiang", role = "ctb", email = "wjiang2@fhcrc.org"))
Description: An intuitive and interactive approach to flow cytometry analysis in R. 
  CytoRSuite contains tools for every stage of the analysis pipeline including 
  user-guided automatic compensation, realtime editing of spillover matrices, 
  manual gate drawing, gate editing, visualisation of all existing flow cytometry 
  objects and export of population level statistics for further analysis. Furthermore,
  through linking to openCyto, CytoRSuite provides the tools necessary to easily combine 
  manual and automated gating approaches in R.
URL: https://github.com/DillonHammill/CytoRSuite
BugReports: https://github.com/DillonHammill/CytoRSuite/issues
Depends: 
  R (>= 2.10),
  openCyto(>= 1.21.1),
  flowWorkspace (>= 3.28.2),
  flowCore (>= 1.49.4)
Imports: 
  ncdfFlow,
  data.table,
  BiocGenerics (>= 0.29),
  shiny, 
  shinythemes, 
  rhandsontable, 
  MASS,
  methods,
  stats,
  utils,
  graphics,
  grDevices,
  tools,
  magrittr
Suggests: 
  CytoRSuiteData,
  RProtoBufLib,
  cytolib,
  knitr,
  rmarkdown,
  testthat,
  vdiffr,
  shinytest,
  covr,
  robustbase,
  pkgdown
Remotes: 
  Bioconductor/BiocGenerics@a2de4dd,
  RGLab/RProtoBufLib@trunk,
  RGLab/flowCore@trunk,
  RGLab/ncdfFlow@trunk,
  RGLab/cytolib@trunk,
  RGLab/flowWorkspace@trunk,
  RGLab/openCyto@trunk,
  DillonHammill/CytoRSuiteData@master
License: GPL-2
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
VignetteBuilder: knitr

Scope

CytoRSuite uses the existing flow cytometry infrastructure (flowCore and flowWorkspace packages) to import flow cytometry data. CytoRSuite (which is built on using the openCyto package) provides the toolkit necessary to interactively manipulate, process and visualise this flow cytometry data.

CytoRSuite is primarily targeted to scientists who traditionally utilise software with a GUI to manually gate their flow cytometry data. CytoRSuite aims to form a bridge between this GUI software and R by providing a similar user experience with minimal coding. Furthermore, CytoRSuite aims to provide a means by which manual gating users can be exposed to and eased into automated gating approaches.

Although other R packages which support automated gating approaches exist, to the best of my knowledge there is no R package which supports manual gating. CytoRSuite not only brings the ability to manually draw gates, but also the potential to combine automated and manual gating approaches through the openCyto gatingTemplate. Furthermore, CytoRSuite expands the existing compensation tools by providing an interactive shiny application to fine tune spillover matrix values in realtime, Additionally, CytoRSuite contains an intuitive set of plotting functions which can provide powerful visualisations with minimal coding. CytoRSuite integrates all the tools necessary for the analysis of flow cytometry data in a single cohesive package.

Technical checks

Confirm each of the following by checking the box. This package:

Publication options

JOSS Options - [ ] The package has an **obvious research application** according to [JOSS's definition](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). - [ ] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: - (*Do not submit your package separately to JOSS*)
MEE Options - [ ] The package is novel and will be of interest to the broad readership of the journal. - [ ] The manuscript describing the package is no longer than 3000 words. - [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html)) - (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*) - (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*) - (*Please do not submit your package separately to Methods in Ecology and Evolution*)

Code of conduct

sckott commented 5 years ago

thanks for your submission @DillonHammill - @geanders has been assigned as handling editor

geanders commented 5 years ago

Editor checks:


Editor comments

@DillonHammill : My apologies for the delays in getting this out for review! It looks like a very interesting package, and I'm very excited to be serving as an editor. Please let me know throughout the process if you have any questions. Also, I am much less familiar with Bioconductor than I am with the CRAN ecosystem, so I may have some misunderstandings along the way. Please let me know if there's something I've misunderstood, especially related to Bioconductor, for this package.

In the next few days, I will be assigning reviewers for the package. In the meantime, here are a few small issues that preliminary checks brought up that you could address in the package:

Some tips for reviewers

The README for the package's GitHub repository has some very nice instructions for required BioConductor and GitHub dependencies. As a note to reviewers, make sure you read through these, and you can use the code examples in the README to install those dependencies, especially if you clone the CytoRSuite package code and build locally rather than using devtools::install_github to get it and its dependencies.

Also, it seems that on MacOS, users might need to install XQuartz to allow the vdiffr package to load correctly. I got the same error message described here when trying to install the package. This was resolved once I installed XQuartz on my computer and rebooted.

spelling::spell_check_package() output

indicies                    gate_draw-flowSet-method.Rd:15
                                 gate_draw-GatingSet-method.Rd:23
                                 gate_edit.Rd:14
recommmend          spillover_compute-flowSet-method.Rd:16
suppply                    CytoRSuite.Rmd:254

goodpractice::gp() output

The following issues came up for goodpractice checks:

It is good practice to

✖ write unit tests for all functions, and all package code in
    general. 70% of code lines are covered by test cases.
It is good practice to

  ✖ write short and simple functions. These functions have high
    cyclomatic complexity:gate_web_draw (113).
It is good practice to

  ✖ fix this R CMD check ERROR: Package suggested but not
    available: ‘pkgdown’ The suggested packages are required for a complete
    check. Checking can be attempted without them by setting the environment
    variable _R_CHECK_FORCE_SUGGESTS_ to a false value. See section ‘The
    DESCRIPTION file’ in the ‘Writing R Extensions’ manual.

I am not completely sure why this error is coming up, because pkgdown is available on CRAN. It would be great to resolve this if we can.

It is good practice to

  ✖ not use "Depends" in DESCRIPTION, as it can cause name clashes,
    and poor interaction with other packages. Use "Imports" instead.
It is good practice to
  ✖ not import packages as a whole, as this can cause name clashes
    between the imported packages. Instead, import only the specific
    functions you need.
It is good practice to
  ✖ avoid 'T' and 'F', as they are just variables which are set to
    the logicals 'TRUE' and 'FALSE' by default, but are not reserved words
    and hence can be overwritten by the user.  Hence, one should always use
    'TRUE' and 'FALSE' for the logicals.

Reviewer 1: @jacobpwagner Reviewer 2: @lmweber Due date: April 22, 2019

geanders commented 5 years ago

Also, @DillonHammill, could you add the following badge to your package GitHub repository README to show that it's currently under review at ROpenSci?

[![](https://badges.ropensci.org/281_status.svg)](https://github.com/ropensci/software-review/issues/281)
DillonHammill commented 5 years ago

Thanks for the feedback @geanders.

I have addressed the following today:

  1. Updated the installation instructions in the README to install flowCore, flowWorkspace and openCyto from BioConductor.
  2. Changed the installation order to first install CytoRSuiteData and then CytoRSuite.
  3. Looks like the release versions of flowCore, flowWorkspace and openCyto contain all the features I need. I have updated the DESCRIPTION file to remove required commit of BiocGenerics.
  4. XQuartz is required for the functioning of CytoRSuite, it is responsible for opening interactive plotting windows to facilitate gate drawing on Mac. Windows and Linux machines will use the equivalent windows() and X11() functions from grDevices.
  5. Added en-GB to language in DESCRIPTION.
  6. Fix some spelling errors.

Comments on goodpractice():

  1. I will try to include more unit tests for the package, I have been trying to stick to a 1 minute run for unit tests. If the test running time is not an issue I will aim to add more. There is also some testing for the shiny application spillover_edit which is not included in the code coverage (~5%).
  2. gate_web_draw is quite a complex function, I will see what I can do to simplify it.
  3. Not sure about the pkgdown error, I don't seem to get this on my windows machine.
  4. In terms of package dependencies, flowCore, flowWorkspace and openCyto contain a lot of the tools required at various stages throughout the analysis pipeline. These packages are maintained by the same group to ensure that there are no conflicts. For this reason I have added these to Depends field of the DESCRIPTION file.
  5. I have also had to import the entire grDevices library as I require the platform-specific functions windows(), X11() and quartz(). If I explicitly import windows() or X11() this will cause problems on a mac where these functions are not included in the grDevices libraries.
  6. I have located where the T/F was used and corrected this accordingly.

Questions:

  1. I was wondering if I am still able to submit the package to BioConductor as well?
  2. I am working on a publication, when this is complete can it be used for submission to JOSS?

Thanks for your help. Dillon

geanders commented 5 years ago

@DillonHammill, thanks for your changes in response to the initial checks!

To answer your two questions:

  1. Yes, ROpenSci packages can be crosslisted on either CRAN or Bioconductor. Submitting your package to Bioconductor would be fine.
  2. Yes, you could submit this to JOSS. You may, however, want to take a look at the author guidelines for JOSS (https://joss.readthedocs.io/en/latest/submitting.html) as well as some example papers to make sure the format matches well with the manuscript you are writing. Most JOSS papers are shorter (250-1000 words), so if you're working on a longer paper, you might want to think about whether this would be the best fit.

To continue the discussion on one of your comments for the "goodpractices":

Comment 4: It's definitely fine to include these package dependencies. This question is whether they could be included as "Imports" in the DESCRIPTION file rather than "Depends"? Here is some guidance on this from Wickham's R Packages:

"Unless there is a good reason otherwise, you should always list packages in Imports not Depends. That’s because a good package is self-contained, and minimises changes to the global environment (including the search path). The only exception is if your package is designed to be used in conjunction with another package. For example, the analogue package builds on top of vegan. It’s not useful without vegan, so it has vegan in Depends instead of Imports. Similarly, ggplot2 should really Depend on scales, rather than Importing it."

I'd love to hear your thoughts on whether CytoRSuite is in this category of "building on" flowCore, flowWorkspace, and openCyto rather than just using some tools from them. That could help in determining whether "Depends" or "Imports" is better to use in this case.

My apologies on the delays in getting reviewers assigned to this. It's taking a bit of time to find reviewers who will be good fits, but I've got some good leads and am continuing to work on getting two reviewers assigned.

DillonHammill commented 5 years ago

@geanders, sorry for not getting back to you sooner.

CytoRSuite is definitely designed to be used in conjunction with flowCore, flowWorkspace and OpenCyto (as in the above description). In particular, CytoRSuite builds on OpenCyto which depends on both flowCore and flowWorkspace. I written CytoRSuite in such way as to prevent clashes with these other packages.

I will start working on a paper for JOSS submission in the meantime.

Any news on the reviewer front?

jacobpwagner commented 5 years ago

@DillonHammill, my review is just about ready so I should be able to get it posted either today or tomorrow.

geanders commented 5 years ago

@DillonHammill : Thanks for the clarification on the Depends / Imports question! We have one reviewer (@jacobpwagner), but I am still looking for the second.

jacobpwagner commented 5 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: ~5


Review Comments

CytoRSuite adds very helpful interactive interfaces over the core infrastructure of flowCore, flowWorkspace, and openCyto. The point-and-click interface for gate construction (gate_draw() methods) and the ability to easily see the effects of changes to spillover matrix entries (spillover_edit()) greatly enhance the approachability and usability of the RGLab packages. This is important for users performing a good amount of manual gating or more accustomed to analysis using a GUI.

Additionally, the plots generated by cyto_plot() are both useful and aesthetically pleasing. It's clear that you took some care to choose a vibrant set of color schemes that would yield good contrast and highlight density nicely. Other multiple-plotting methods (i.e. cyto_plot_profile()) are similarly helpful.

For the most part, everything ran smoothly. However, I ran in to a few minor issues:


Vignettes

Your vignettes cover all of the major functionality of the package and are very helpful. Some of the example code does run in to some problems when run by the user:

You can avoid the error by providing a dummy "title" argument, but ultimately
this is coming down to line 2472 in `cyto_plot_internal.R` where in the absence
of a title, you assign it like so:

if (missing(title)) { title <- fr@description$GUID }

The problem is that the `flowCore` method for coercing a `flowSet` to a `flowFrame`
does not assign a value to `description$GUID`, so you end up with `NULL`. There may
be other related issues beyond the problem for the vignette, as a quick grep shows 
that there are 15 other lines directly accessing the `GUID` element from the 
`description` slot of an object in this `@description$GUID`manner. You should 
just make sure that the object in each case will actually have that. A better 
approach may be using `flowCore::identifier()` which will do those checks for you 
and at least give you "anonymous" if it doesn't find anything.

* `CytoRSuite.Rmd` and `CytoRSuite-Gating-Functions.Rmd` both mention sealing
polygon gates in `gate_draw()` by right-clicking and selecting "stop". The
"stop" button doesn't seem to be present on Linux (I just closed the window)
but that's really minor and I'm guessing users can figure it out.

---

#### Code
The code is generally very readable and  if anything over-commented so it was easy
to follow and generally seems sound. A few minor issues:

* I was working with a fresh R 3.5.3 installation on Linux and `X11.options()$type` was
defaulting to "Xlib" instead of "cairo" (which should be the default when available), which meant that your `X11()` calls resulting from any case where `popup == TRUE` in `cyto_plot()` were yielding windows for which `dev.capabilities()$semiTransparency` was `FALSE`. That had a bunch of downstream effects due to `alpha` arguments, including warnings ("semi-transparency is not supported on this device") and disabling some features like the plots in the "Plots" tab of the Spillover Matrix Editor. While I quickly figured out the cause and fixed it by `X11.options(type="cairo")`, the message and problems are unclear enough that other users may get confused. 
While you could leave this up to users to troubleshoot, you could also always add a check for `dev.capabilities()$semiTransparency == TRUE` or `grepl("cairo",X11.options()$type)` before `X11()` calls. I think everything may filter down to the single `X11()` line in `.cyto_plot_window()` so guarding that with a more helpful warning may be one option.

* If a `gate_draw()` window is closed without the user actually building a gate, it
produces an error:

Error in locator(type = "o", lwd = 2, pch = 16, col = "red") : graphics device closed during call to locator or identify

In terms of end result, it seems fine. No gate is added, which is what I'd expect.
However, maybe it would be good to catch/suppress the error for a graceful exit.

* In the `cyto_plot()` methods, there are at least two layers of function calls. First,
a `cyto_plot()` layer that mostly determines whether it should call the 1D or 2D internal
plotting method, and then the appropriate 1D or 2D internal method. Each of these layers
provides the full set of default arguments and `cyto_plot()` passes each along in a long call
to `cyto_plot_1d()` or `cyto_plot_2d()` but it looks like essentially the only argument 
that is relevant to the first layer is `channels`. For the sake of code maintenance it
may be easier if you can make it work by just using the `channels` argument in the `cyto_plot()`
layer and just passing the rest along to `.cyto_plot_1d()` or `.cyto_plot_2d()` in `...`. Then you would have only one copy of the list of default arguments. But you may have a reason for doing it this way.

* There are a few methods in `helper-functions.R` that include an `else` case with some apparently magic numbers.
I'm thinking of something like line 229 in `cyto_channel_select()`:

... if (getOption("CytoRSuite_interact") == TRUE) { menu(choices = opts, graphics = TRUE) } else {

    # Test channels - 7AAD, AF430, APC Cy7, NIL, PE Cy7, PE
    c(4, 7, 11, 12, 5, 2)[match(x, pData(fs)$name)]
  }

...


I'm guessing this is related to some of the test data you use in the vignettes from `CytoRSuiteData`,
but it seems like it may be poor practice to hard-code it in to a more general method, especially if
it should be throwing an error for being called in a non-interactive context. If you need
this default for testing, there may be a better way. For example, you could alter the method to allow 
specification of a set of channels. But again, there may be some other reason for this.

* `CytoRSuite::spillover_compute()` has a good amount of overlap with `flowCore::spillover()` (the `flowSet` method). There are some significant peripheral differences, but I will probably run a few test cases to make sure the core computation is coming out the same as users may expect this.

---

#### Documentation
Your documentation looks nice and thorough and the links work well for making navigation easy. I verified that all exported methods have documentation with examples and ran a fair amount of the examples to make sure they worked as expected. R CMD check didn't raise any warnings or errors. My only suggestion would be to maybe break up some of the longer "Details" sections in to a few paragraphs to improve readability (`spillover_edit` for example).

---
Other than that, the package looks great. I may continue to run it through some testing and I will update this with any issues I run in to. I look forward to seeing what else you will be adding and please feel free to contact me with issues of flowCore/flowWorkspace/openCyto compatibility.
geanders commented 5 years ago

@jacobpwagner : Thanks so much for this very helpful and thorough review!

DillonHammill commented 5 years ago

Thanks for taking the time to review CytoRSuite @jacobpwagner. I will post some comments and make the appropriate changes tomorrow. I will also push some new features if you would like to try them out.

DillonHammill commented 5 years ago

@jacobpwagner thanks again for your help!

I have made a number of changes to CytoRSuite over the last couple of days. I have committed these to the testing branch until I am convinced that everything is stable. Please install from this branch to see if I have resolved the issues you have raised.

Changes:

New features & Improvements:

Comments:

I will start working on a paper for JOSS submission this week.

Please let me know if you encounter any additional issues.

geanders commented 5 years ago

@jacobpwagner : Thanks so much for the quick response to @jacobpwagner comments! A second (and final) reviewer is now assigned (@lmweber), so there will be some comments coming in over the next few weeks from him, as well.

jacobpwagner commented 5 years ago

Thanks @DillonHammill. I'm sorry for the delay as I was out of the office on vacation. I will work with the testing branch for a while and look over the changes to get back to you as soon as possible.

DillonHammill commented 5 years ago

Thanks @jacobpwagner. I have noticed an issue in offsetting labels in multi-panel plots, I will work on a fix and push the update tomorrow.

jacobpwagner commented 5 years ago

@DillonHammill, here are my comments after the recent changes

Testing changes to fix issues raised

Instead, you could just mock the `menu` call using `mockery::stub()` or `mockery::mock()`. I've just changed over the first check here after getting rid of the "CytoRSuite_interact" conditional with the test-case code for the flowFrame method and it passes just fine.

test_that("cyto_channel_select", { print(pData(Comp)) mock_menu <- mock(5) with_mock(menu = mock_menu, expect_equal(cyto_channel_select(Comp[[1]]), "PE-Cy7-A")) expect_equal(cyto_channel_select(Comp), c("7-AAD-A", "Alexa Fluor 430-A", "APC-Cy7-A", "Unstained", "PE-Cy7-A", "PE-A")) expect_equal(cyto_channel_select(gsc), c("7-AAD-A", "Alexa Fluor 430-A", "APC-Cy7-A", "Unstained", "PE-Cy7-A", "PE-A")) })

where now the method just looks like this (no more hard-coded numbers for testing):

setMethod(cyto_channel_select, signature = "flowFrame", definition = function(x) {

        # Assign x to fr
        fr <- x

        opts <- cyto_fluor_channels(fr)

        # Print sample name and select channel
        message(
          paste(
            "Select a fluorescent channel for the following csample:",
            fr@description$GUID
          )
        )
        channel <- opts[menu(choices = opts, graphics = TRUE)]
        return(channel)
      }

)


Side note: That `fr@description$GUID` may be another good candidate for `identifier(fr)`, but I didn't really explore that.

#### Other comments
* The logic for the `cyto_plot` argument lists makes sense. I thought there was probably a good reason and I agree that it is good to make the arguments available for autocomplete. It's up to you whether you want to comment the deeper layer of default args to make it clear that they will be overridden by the first layer.

* No worries about the differences between `spillover_compute()` and `flowCore::spillover()`. I just thought I'd note it as a reminder (mostly to myself) to test their agreement in more detail. But the method looks sound.

* I haven't had time to exhaustively check the new features and improvements, but if I encounter any issues I will let you know.

#### Summary
Most of the issues I raised were addressed. The X11 issue is still there and is the only thing I pointed out that still really needs to be fixed, but the fix is pretty easy as I discussed. It's up to you whether you want to pull the test code out of your methods as I described, but as a general habit it will make it easier for collaborators to work with improving your methods if they do not have test code incorporated in them.
DillonHammill commented 5 years ago

@jacobpwagner thanks again for your help!

I have pushed the fix for X11() which should now work as expected.

I did not know about mockr I will look at updating my unit tests in this manner later today. I will also do a complete sweep of the package to replace all instances of fr@description$GUID with identifier().

jacobpwagner commented 5 years ago

Cool. Yeah, it looks like there are still a few left. I don't know that any of these will cause problems, but it may be safer to switch them all over to identifier() calls:

grep -r "\$GUID" CytoRSuite/R/
CytoRSuite/R/cyto_plot_compensation-methods.R:    nm <- fr@description$GUID
CytoRSuite/R/cyto_plot_compensation-methods.R:      header <- fr@description$GUID
CytoRSuite/R/helper-functions.R:                fr@description$GUID
CytoRSuite/R/plotting-functions.R:            x@description$GUID
CytoRSuite/R/cyto_plot_profile-methods.R:      title <- fr@description$GUID
CytoRSuite/R/cyto_stats_compute-methods.R:              rownames(cnt) <- fr@description$GUID
CytoRSuite/R/cyto_stats_compute-methods.R:              rownames(sts) <- fr@description$GUID
CytoRSuite/R/cyto_stats_compute-methods.R:              rownames(sts) <- fr@description$GUID
lmweber commented 5 years ago

Hi @DillonHammill , thank you for submitting this package. My review comments are pasted below:


Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

For packages co-submitting to JOSS

The package contains a paper.md matching JOSS's requirements with:

  • [ ] A short summary describing the high-level functionality of the software
  • [ ] Authors: A list of authors with their affiliations
  • [ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
  • [ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Functionality

Final approval (post-review)

Estimated hours spent reviewing: 5


Review Comments

Overall comments: This package provides a set of tools for analyzing flow cytometry data in R, specifically for compensation, interactive gating, and visualization. This fills a clear need in the flow cytometry data analysis community for performing manual gating and related tasks more easily within R, instead of using commercial platforms (currently, tasks such as pre-processing and more automated analysis pipelines are often performed in R, while manual gating and related tasks are usually done using commercial platforms such as Cytobank and FlowJo). The package has extensive documentation, including several vignettes with examples of workflows. The package is also well integrated with existing R tools for analyzing flow cytometry data, including flowCore, flowWorkspace, and openCyto. I think this will be a very useful package for the flow and mass cytometry community, and am happy to have had a chance to look at it in some more detail.

However, I ran into several errors and other issues when trying to run the examples in the vignettes. These are described below, along with some more general suggestions.


README

Currently, the README includes the same content as the index page of the main documentation. By having the same content in two places, there is a risk that they will become out of sync in future. You could consider consolidating this by shortening the README and including some links to the main documentation pages instead.

It would also be useful to update the README to include some information on the 'Statement of need' (see checklist above), e.g., mentioning where CytoRSuite fits into the set of available analysis tools, and the intended audience. This is described nicely in the 'Scope' section in the first comment in this issue (8 Feb), so you could re-use some of that text.


Scope

One related tool that is not mentioned in the 'Scope' section above is iSEE, available from Bioconductor. iSEE is intended more generally for interactive visualization of single-cell data stored in SummarizedExperiment objects, but one of their vignettes shows how to use it for gating mass cytometry data, so there is some overlap there. However, CytoRSuite contains much more specialized functionality for flow cytometry data and builds on existing flow cytometry packages such as flowCore / flowWorkspace / openCyto, so I don't think this overlap reduces the usefulness of CytoRSuite. Both tools provide very useful functionality for users who wish to perform their full analysis pipelines within R, instead of switching to commercial products for the manual gating steps.

You could also consider mentioning (either in the README or in the main documentation) something about how this fits in with tools for analyzing higher-dimensional flow and mass cytometry data (e.g., CytoRSuite could be used for initial pre-gating and for visualizations).


Installation and checks

I installed CytoRSuite on a Mac, following the instructions on the README and main documentation pages. The installation completed without errors.

I also ran devtools::check(), however this returned a few errors and notes, which should be addressed.

Quitting from lines 83-84 (CytoRSuite-Visualisation.Rmd) 
   Error: processing vignette 'CytoRSuite-Visualisation.Rmd' failed with diagnostics:
   argument is of length zero
   Execution halted

Due to this error, I switched to the master branch to run the rest of the checks. This still gave the following NOTES:

N  checking package dependencies (2.2s)
   Packages suggested but not available for checking:
     ‘vdiffr’ ‘shinytest’ ‘covr’ ‘pkgdown’

These packages are listed in the Suggests: field in the DESCRIPTION, so I thought they would be installed automatically when using devtools::install_github() with dependencies = TRUE. I'm not sure why this failed. After installing the packages manually, the NOTE was removed. This would probably also be fixed by distributing the CytoRSuite package through Bioconductor.

N  checking installed package size ...
     installed size is 31.7Mb
     sub-directories of 1Mb or more:
       doc   28.2Mb
       help   3.1Mb

The largest files are the .gif files in docs/articles/Gating-functions. I think this is probably fine; although you could consider reducing the size of these files, e.g., by downsampling the resolution, as long as this does not affect the visual quality in the documentation.

N  checking R code for possible problems (33.9s)
   Found an obsolete/platform-specific call in the following function:
     ‘.cyto_plot_window’
   Found the platform-specific devices:
     ‘X11’ ‘quartz’
   dev.new() is the preferred way to open a new device, in the unlikely
   event one is needed.
   Found the following assignments to the global environment:
   File ‘CytoRSuite/R/gatingTemplate-modifiers.R’:
     assign(deparse(substitute(x)), gs, envir = globalenv())
     assign(deparse(substitute(x)), gs, envir = globalenv())
   File ‘CytoRSuite/R/helper-functions.R’:
     assign(deparse(substitute(x)), cyto, envir = globalenv())

I'm less sure about these; it would be good if they can be addressed, but I don't have any specific advice about how to fix them.


Functionality

I tested the functionality by trying to run through the examples in the vignettes. However, I ran into a number of problems, which will need to be addressed:

Error in if (getOption("CytoRSuite_cyto_plot_grid")) { : 
  argument is of length zero

Due to this error, I switched back to the master branch. However, this still gave several warnings and errors, including:

Warning messages:
1: In plot.window(...) : "gtfile" is not a graphical parameter
2: In plot.xy(xy, type, ...) : "gtfile" is not a graphical parameter
3: In axis(side = side, at = at, labels = labels, ...) :
  "gtfile" is not a graphical parameter
4: In axis(side = side, at = at, labels = labels, ...) :
  "gtfile" is not a graphical parameter
5: In box(...) : "gtfile" is not a graphical parameter
6: In title(...) : "gtfile" is not a graphical parameter
Error in if (is.na(title)) { : argument is of length zero
In addition: Warning messages:
1: In rep(x[[arg]], length.out = plots) :
  'x' is NULL so the result will be NULL
2: In rep(x[[arg]], length.out = n) :
  'x' is NULL so the result will be NULL

Also, an additional minor issue was that at first, I didn't realize that right-clicking was required to complete the polygons and continue to the next step of the gating. I think this is mentioned in some of the code comments, but it would be good to mention it more prominently at the start.

I am happy to try running these vignette examples again once the errors and warnings have been addressed.


Documentation

As mentioned above, the package includes extensive vignettes, which is great for new users. The package also includes help files for the individual functions.

One suggestion I have for the function help files is regarding places where you have several help files covering very similar content for related functions (e.g., gate_polygon_draw(), gate_rectangle_draw(), gate_boundary_draw()). For these functions, you could consolidate the help files into a single page using @rdname in Roxygen (e.g., see the instructions in section 'Documenting multiple functions in the same file' in Hadley Wickham's R Packages book). This would make it easier for you to maintain, and simpler for the reader to get an overview.

I also ran into a problem with the examples in the first vignette (i.e., the examples in the README and on the index page of the main documentation). These examples begin with the code:

library(CytoRSuite)
library(CytoRSuiteData)

# Save .fcs files to folder "Compensation Controls" in working directory
files <- list.files(path = "Compensation Controls", full.names = TRUE)
fs <- read.ncdfFlowSet(files = files)

If I understand it correctly, the intention here is that the reader saves some .fcs files in the directory Compensation Controls, and then continues with the example. However, it would be much more useful if some .fcs files were provided along with the package. Alternatively, you could update these examples to use one of the datasets from the CytoRSuiteData package, similar to the other vignettes.


Code quality

The code looks well-structured and commented. If you also submit the package to Bioconductor, you will need to check that you meet Bioconductor's guidelines regarding code style.


Additional suggestions

Bioconductor: I would strongly recommend also submitting this package to Bioconductor. The package already builds on other Bioconductor packages, so this should be feasible. Making the package available through Bioconductor would make it easier to find, and give it extra credibility in the flow cytometry community. However, it would need go through another set of reviews and meet all of Bioconductor's requirements. I ran R CMD BiocCheck . on the package directory, and it flagged a few additional issues (see the Bioconductor package submission guidelines).

Publication strategy: In the comments above, you mentioned submitting the package to JOSS. Of course the publication strategy is completely up to you; but I wanted to mention that I don't think this would be the best place for this package. JOSS papers are very short papers describing the purpose of the package only -- which is fine for many packages -- but for this package, I think it would be more useful to have a somewhat longer paper describing some of the key functionality, maybe along with a short workflow, and examples of visualizations. This would help convince flow cytometry users who are currently using commercial platforms (Cytobank, FlowJo, etc) that CytoRSuite could be a good alternative for them. Some possible venues for this type of paper could be the Bioconductor gateway in F1000Research, or more traditional flow cytometry journals like Cytometry Part A.

DillonHammill commented 5 years ago

Thanks for your review @lmweber!

Sorry the testing branch was not working as expected - I am working on some major updates and have not yet had time to push all the changes. I am working hard to refactor a lot of the code so I can implement mockery to improve the test coverage for the package.

I will need a week or so to finish implementing all of these new features and changes. I will merge them back to the master and let you know when everything is ready for a second look.

lmweber commented 5 years ago

Ok no worries, just let me know when it is ready to have another look, thanks!

geanders commented 5 years ago

@lmweber : Thanks so much for your very helpful review!

@jacobpwagner : I wanted to briefly check in---it looks like many of your initial concerns were addressed. Are you satisfied that the X11 concern is now addressed, or is that still something to keep an eye on through the remaining revisions?

Also, thanks @DillonHammill for your quick and thorough revisions based on these reviews. I will keep an eye out for the revisions in response to @lmweber 's review.

jacobpwagner commented 5 years ago

@geanders, it looks fixed in the testing branch, so as soon as that gets merged to master it should be good. I can hold off and look over everything once more after @DillonHammill has merged this batch of changes to master.

jacobpwagner commented 5 years ago

@DillonHammill I think you've seen that the API for flowCore, flowWorkspace, and openCyto is getting a bit of an overhaul, but it's mostly just method renaming with the old names being deprecated. The changes were merged to Bioconductor after the 3.9 branch was created, so for now it shouldn't affect users installing those dependencies following your installation instructions. However, if users install the dependencies from GitHub, they will get deprecation warnings from some of the methods you call. Anyway, I'd be happy to put together a PR just changing over the relevant method calls in your code that you could merge for the next Bioconductor release when the changes will be incorporated.

DillonHammill commented 5 years ago

Thanks @jacobpwagner that would be great! Sorry for the delay, I am doing a complete overhaul of CytoRSuite to add some coding conventions to make it easier to add new features and find erroneous code. This is taking longer than expected but I will do my best to have it completed over the coming week. Looking forward to sharing the improvements soon. I will let you know when everything is merged to master and ready for a second look.

jacobpwagner commented 5 years ago

@DillonHammill , see https://github.com/DillonHammill/CytoRSuite/pull/17, which is based off of your master branch. I can help resolve conflicts if you merge in your testing branch.

geanders commented 5 years ago

@DillonHammill : There's no rush, but I just wanted to check in and see if you have a timeline for submitting revisions?

DillonHammill commented 5 years ago

Sorry I have not been in contact for a while @geanders. Given all the feedback I received from @lmweber and @jacobpwagner (and others) I have completely re-written the entire package. I found a number of bugs that were difficult to track down, so I have spent a lot of time modularizing and re-factoring a lot of the code. This was a lot of work but it was definitely worth it - I have added a number of new features and it is easy to add more features in the future. I am currently working on finalising a couple of important features and then I will tackle the code coverage, continuous integration and documentation for the website. I will do my best to have everything ready as soon as possible. Thanks for patience, I think it will be worth the wait.

jacobpwagner commented 5 years ago

Sounds good. I'll keep an eye on this and be ready to review the reworked package as soon as it's ready.

geanders commented 5 years ago

Thanks for the update, @DillonHammill ! That all sounds good to me.

geanders commented 4 years ago

@DillonHammill : I just wanted to check in again. No worries if you're still working on this, but just so you're aware, we typically change the status to "holding" on review Issues where there are longer overhauls, so that it's clear to the editor in chief that there's some longer-term work on the package as he or she reviews all the open issues. Please let me know if you're anticipating that there will still be a while before adding responses on this package, in which case I'll change its status to "holding", or if you anticipate getting things in soon, in which case I'll leave it as-is.

DillonHammill commented 4 years ago

@geanders : Thanks for checking in! The package is almost complete! I am planning on re-submitting early next week. I am in the process of updating the website with new workflows so that @jacobpwagner and @lmweber have something to follow. I will be testing the package over the weekend and improving the test coverage. If I am happy with the stability of the package I will submit it again next week. Sorry for taking so long, hopefully it will be worth the wait!

DillonHammill commented 4 years ago

@geanders : Unfortunately, due to recent changes in the underlying data structures in the RGLab dependencies of this package (https://github.com/RGLab/flowWorkspace/issues/294 ), I am going to need some more time to get everything on par with the RGLab suite of packages. Everything is ready on my end, I will just need to make some changes as to how the data is handled internally to ensure that the package behaves as it used to. I don't see a point in re-submitting the package until it is up to date with the latest RGLab packages. I will do my best to make all the necessary changes over the next week or so. Apologies for further delaying this re-submission, but I think it is essential that I make these changes now. I am happy for you to put this hold whilst I get everything ready. I should have mentioned that I have moved the package to a private repo whilst I finalise everything, so it may appear as though I am not making any changes. I will make this repo public when everything is ready.

geanders commented 4 years ago

@DillonHammill : I am putting the "holding" tag on this submission for the moment, as that will help our editorial team as they keep track of where all the packages are in the review process. We'll change this back as soon as you're ready to resubmit.

noamross commented 4 years ago

⚠️⚠️⚠️⚠️⚠️

In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci is temporarily pausing new submissions for software peer review for 30 days (and possibly longer). Please check back here again after 17 April for updates.

In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent.

Other rOpenSci community activities continue. We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one other.

The rOpenSci Editorial Board

⚠️⚠️⚠️⚠️⚠️

annakrystalli commented 4 years ago

⚠️⚠️⚠️⚠️⚠️ In the interest of reducing load on reviewers and editors as we manage the
COVID-19 crisis, rOpenSci new submissions for software peer review are paused.

In this period new submissions will not be handled, nor new reviewers assigned.
Reviews and responses to reviews will be handled on a 'best effort' basis, but
no follow-up reminders will be sent. Other rOpenSci community activities continue.

Please check back here again after 25 May when we will be announcing plans to slowly start back up.

We express our continued great
appreciation for the work of our authors and reviewers. Stay healthy and take
care of one other.

The rOpenSci Editorial Board ⚠️⚠️⚠️⚠️⚠️

geanders commented 4 years ago

Hi, @DillonHammill! I hope you are doing well. We typically check in on packages that are "on hold" about every three months, to see if the author is ready to proceed, and then will usually close the issue if the package has been on hold for a year or longer.

I see that we put this package on hold in January, and while ROpenSci reduced it's normal process during this past spring, we are back to a more normal schedule now. Given that, I wanted to check in and see if you would like us to continue to have this review on hold, or if you are ready to move back into an active review? Either is fine, I just wanted to check in.

DillonHammill commented 4 years ago

Hi @geanders, thanks for checking in! That is great news, I have been waiting to re-submit for a while now. I should be able to re-submit by the end of next week if that is OK? I am working on a couple of other packages that I hope to submit as well.

geanders commented 4 years ago

Yes, that would be fine. We'll plan to change out of the "holding" tag once you resubmit and let the reviewers know it's ready for them to look at again.

jacobpwagner commented 4 years ago

I'm ready for reviewing whenever!

DillonHammill commented 4 years ago

Apologies for not getting back to you last week, our group has been working on getting a paper submitted last week so I was bogged down helping with that.

I will devote all my time this week to getting this ready for review. I am currently performing my own code review, as the package has become quite large and a lot of the code can be simplified with the use of newly created functions.

I am working on reducing package dependencies as well: https://github.com/DillonHammill/CytoExploreR/issues/66.

I have also moved all the interactive data editors to their own package https://github.com/DillonHammill/DataEditR which has been submitted to CRAN and ROpenSci https://github.com/ropensci/software-review/issues/39. This should be available on CRAN soon, so I am working on updating CytoExploreR to remove this code. Some points were brought up during the CRAN review process that are in CytoExploreR as well - so I will try to fix this as well.

I have also removed a lot of the unit tests whilst I make all these changes so I will need to add them back once everything has been updated.

I know that the annual cytometry conference is on this week (virtually) so I will try and get everything completed early next week. Thanks for your patience, it will be worth the wait!

noamross commented 3 years ago

Hello @DillonHammill, I am Noam and I'm going to be taking over as handling editor for CytoRSuite from @geanders. I'll take this opportunity to check on your status. I see that this review is in "holding" as you were working on some considerable changes last year before starting review. Do you have any updates?

DillonHammill commented 3 years ago

@noamross, thanks for checking in! Due to COVID19 CytoRSuite (now called CytoExploreR) has become the focus of my PhD project. I am working hard to submit my PhD by the end of June and I will submit the finalised version of the package as soon as I am ready.

noamross commented 3 years ago

Thanks for the update @DillonHammill. Note that with an extended hiatus we may need to find new reviewers for a future version, and especially if you are making big changes, it may effectively be re-reviewed.

emilyriederer commented 2 years ago

Hi @DillonHammill ! We are doing a sweep of stale review issues. Since this review has been open and inactive for so long, much may have changed including author, editor, and reviewer bandwidth and ever-evolving rOpenSci best practices. As such, I'm closing this issue. If you still have interest and capacity, we would welcome you to open a new submission issue!