Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

Submission of package infinityFlow #1526

Closed ebecht closed 4 years ago

ebecht commented 4 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 4 years ago

Hi @ebecht

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: infinityFlow
Title: Augmenting Massively Parallel Cytometry Experiments Using Multivariate Non-Linear Regressions
Version: 0.99.0
Authors@R: c(
     person("Etienne","Becht",middle=NULL,email="etienne.becht@protonmail.com",role=c("cre","aut"))
       )
Description: Pipeline to analyze and merge data files produced by BioLegend's LEGENDScreen or BD Human Cell Surface Marker Screening Panel (BD Lyoplates).
Depends:
    R (>= 4.0.0)
License: GPL-3
Encoding: UTF-8
LazyData: true
Imports: stats,
    grDevices,
    utils,
    graphics,
    pbapply,
    matlab,
    png,
    raster,
    grid,
    uwot,
    flowCore,
        gtools,
    Biobase,
    generics,
    parallel,
    methods
Suggests: knitr,
    rmarkdown,
    keras,
    tensorflow,
    glmnetUtils,
    e1071,
    xgboost
VignetteBuilder: knitr
RoxygenNote: 7.1.0
biocViews: Software, FlowCytometry, CellBasedAssays, SingleCell, Proteomics
bioc-issue-bot commented 4 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

lshep commented 4 years ago

Remember to set up your webhook and do a valid version bump when editing code so that the builders have a report for the most recent code changes.

ebecht commented 4 years ago

Hi @lshep

Thanks, I think I already did (or at least tried to?).

Is it OK to make changes during the review process?

lshep commented 4 years ago

Yes it is. We see a version bump required tag which normally means you committed changes to the repo without a version bump so the build is not reflecting the latest changes. Changes are expected especially after your reviewer gives feedback

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

e637ee3 Created vignette for non-XGBoost regression models

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

57fa02d Version bump after subscribing to bioc-devel

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

28b3efa Version bumped following bioc-devel subscription #...

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

84ba86d Reverting r estimation in logicle 0edfe9f Increased version number

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hpages commented 4 years ago

Hi Etienne,

Finally looking at infinityFlow. Sorry for the delay. Will provide some feedback here in the next couple hours.

H.

hpages commented 4 years ago

Overall the package looks good and is almost ready to go. Only a few minor things, mostly cosmetic:

  1. Please remove the doc/ folder. It doesn't belong to the standard package layout. Packages can have an inst/doc folder, with static vignettes in it, but infinityFlow (like most Bioconductor packages) doesn't need one. This is because the folder will be automatically created by R CMD build and populated with basic_usage.[Rmd|R|html] and training_non_default_regression_models.[Rmd|R|html], and finally added to the resulting source tarball.

  2. The vignettes should show the standard way to install a Bioconductor package:

    if (!requireNamespace("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    
    BiocManager::install("infinityFlow")

    In particular, please do not encourage direct installation from GitHub as this is almost guaranteed to end up in a dependency mismatch nightmare (keep in mind that Bioconductor has releases every 6 months and installing packages from GitHub doesn't play nice with that).

  3. Would be nice if the Introduction of the Basic usage vignette could provide a little bit more background information about "massively parallel cytometry" and its challenges, possibly with some references to the scientific literature, as well as details about things like "LEGENDScreen or Lyoplates kits" or "the default machine learning toolkit XGBoost". Also, I'm sure everybody (except me) knows what the "Backbone and Infinity antibodies" are but a quick reminder wouldn't hurt (for people like me).

  4. The fact that in your vignettes the calls to library(infinityFlow) are always immediately preceded or followed by a call to library(flowCore) is a sign that flowCore should probably be moved from Imports to Depends. The rationale being that flowCore defines a class (flowSet) and provides functionalities that seem to be very relevant to a typical session with infinityFlow. After you've made that change, all the flowCore functionalities become available to the end user as part of loading infinityFlow so you can remove the calls to library(flowCore).

  5. Note that your users are welcome to report bugs on GitHub but general questions about the package (e.g. clarification about function usage and other help needed) are preferrably asked on the Bioconductor support site. Please update the Conlusion of the Basic usage vignette to reflect that.

  6. Add a sessionInfo() section at the end of your vignettes. It might be followed by a Bibliography section if you add some references in the document as suggested in 3.

  7. The man page for the infinity_flow() function (?infinity_flow) contains "See ?export_data for help" and "See ?correct_background for help" but these man pages don't seem to exist:

    > ?export_data
    No documentation for ‘export_data’ in specified packages and libraries:
    you could try ‘??export_data’
    > ?correct_background
    No documentation for ‘correct_background’ in specified packages and libraries:
    you could try ‘??correct_background’
  8. Since the export_data() and correct_background() functions actually seem to be relevant to the end user, they should not only be documented but also exported. This means using the infinityFlow::: notation will no longer be needed.

  9. We disourage the use of LazyData: true. Its only benefit is that it makes your datasets immediately available after the package is loaded i.e. the user does not need to do data(steady_state_lung) before they can access the steady_state_lung object. But since you're doing data(steady_state_lung) in the vignettes and man pages, which is good, you can just remove LazyData: true from your DESCRIPTION file.

  10. Coding style:

    • Please use <- rather than = for assignment, with one space before and after the arrow.
    • In my experience the 2 R CMD BiocCheck notes about using vapply() instead of sapply() and using seq_len() or seq_along() instead of 1:N are spot-on. Don't ignore them.
  11. Typo: "This This function will load ..." in ?select_backbone_and_exploratory_markers.

  12. Just as an FYI: I've expressed my frustration about the flowSet container on our community-bioc Slack (#bigdata-rep channel). No action required from your side but I wanted to give you a heads up.

Thanks, H.

ebecht commented 4 years ago

Thank you very much for the in-depth review Hervé, I'll address your points and update. These are really helpful both for enhancing the clarity of the documentation and the quality of the software.

Best, Etienne

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

82cac14 Bioconductor review: removed doc and moved flowCor... 0ea07f8 Answering Bioconductor code review queries 5fbce52 Fixed typos in vignettes d771f11 Added @usage for data() objects

ebecht commented 4 years ago
1. Please remove the `doc/` folder. It doesn't belong to the standard package layout. Packages can have an `inst/doc` folder, with static vignettes in it, but infinityFlow (like most Bioconductor packages) doesn't need one. This is because the folder will be automatically created by `R CMD build` and populated with `basic_usage.[Rmd|R|html]` and `training_non_default_regression_models.[Rmd|R|html]`, and finally added to the resulting source tarball.

I moved the HTML vignette from doc/ folder to inst/doc/. I prefer to keep a knit vignette available, at least for now, as it is the only form of online documentation currently available.

2. The vignettes should show the _standard way_ to install a Bioconductor package:
   ```
   if (!requireNamespace("BiocManager", quietly = TRUE))
       install.packages("BiocManager")

   BiocManager::install("infinityFlow")
   ```
   In particular, please do not encourage direct installation from GitHub as this is almost guaranteed to end up in a dependency mismatch nightmare (keep in mind that Bioconductor has releases every 6 months and installing packages from GitHub doesn't play nice with that).

I updated the vignette according to your suggestion. I still reference the github installation in the README on github, I will remove it when the package becomes available on Bioconductor.

3. Would be nice if the Introduction of the _Basic usage_ vignette could provide a little bit more background information about "massively parallel cytometry" and its challenges, possibly with some references to the scientific literature, as well as details about things like "LEGENDScreen or Lyoplates kits" or "the default machine learning toolkit XGBoost". Also, I'm sure everybody (except me) knows what the "Backbone and Infinity antibodies" are but a quick reminder wouldn't hurt (for people like me).

You're totally right, this was missing context, and it's not just you. I expanded the introduction of the vignette to add more background. I previously assumed that readers of the vignette would be familiar with the content of the corresponding scientific paper but I realize that it is better if the vignette is viewed as self-contained. Thanks for helping me improve the clarity of the vignette. I now briefly explain what is an MPC experiment. Here is the updated introduction:

Thank you for your interest in the ***Infinity Flow*** approach. This vignette describes how to apply the package to your massively parallel cytometry experiment (commerically-available as *BioLegend's LEGENDScreen* or *BD's Lyoplate Cell Surface Marker Screening Panel*). Massively parallel cytometry experiments are cytometry experiments where a sample is aliquoted in _n_ subsamples, each stained with a fixed panel of "Backbone" antibodies. Each aliquot is in addition stained with a unique "Infinity" exploratory antibody. The goal of the ***infinityFlow*** package is to use information from the ubiquitous Backbone staining to predict the expression of sparsely-measured Infinity antibodies across the entire dataset. To learn more about this type of experiments and details about the Infinity Flow approach, please consult [Becht et al, 2020](https://www.biorxiv.org/content/10.1101/2020.06.17.152926v1). In this vignette we achieve this by using the XGBoost machine-learning framework implemented in the (xgboost R package)[https://cran.r-project.org/web/packages/xgboost/index.html]. This vignette aims at explaining how to apply a basic ***infinityFlow*** analysis. Advanced usages, including different machine learning models and custom hyperparameters values, are covered in a separate vignette.
4. The fact that in your vignettes the calls to `library(infinityFlow)` are always immediately preceded or followed by a call to `library(flowCore)` is a sign that flowCore should probably be moved from `Imports` to `Depends`. The rationale being that flowCore defines a class (flowSet) and provides functionalities that seem to be very relevant to a typical session with infinityFlow. After you've made that change, all the flowCore functionalities become available to the end user as part of loading infinityFlow so you can remove the calls to `library(flowCore)`.

Indeed, I moved flowCore to Depends, as per your suggestion.

5. Note that your users are welcome to report bugs on GitHub but general questions about the package (e.g. clarification about function usage and other help needed) are preferrably asked on the Bioconductor support site. Please update the Conlusion of the _Basic usage_ vignette to reflect that.

Thanks, I added a link to Bioconductor's support site to the vignette. Quick question about this: will I receive an email in the event that people ask a question related to this package on the support site? I'm afraid to miss potential questions.

Here is the updated conclusion for your reference:

# Conclusion

Thank you for following this vignette, I hope you made it through the end without too much headache and that it was informative. General questions about proper usage of the package are best asked on the (Bioconductor support site to maximize visibility for future users. If you encounter bugs, feel free to raise an issue on infinityFlow's [github](https://github.com/ebecht/infinityFlow/issues).
6. Add a `sessionInfo()` section at the end of your vignettes. It might be followed by a Bibliography section if you add some references in the document as suggested in 3.

Done for both vignettes.

7. The man page for the `infinity_flow()` function (`?infinity_flow`) contains "See ?export_data for help" and "See ?correct_background for help" but these man pages don't seem to exist:
   ```
   > ?export_data
   No documentation for ‘export_data’ in specified packages and libraries:
   you could try ‘??export_data’
   ```

Please see below.

   ```
   > ?correct_background
   No documentation for ‘correct_background’ in specified packages and libraries:
   you could try ‘??correct_background’
  ```
8. Since the `export_data()` and `correct_background()` functions actually seem to be relevant to the end user, they should not only be documented but also exported. This means using the `infinityFlow:::` notation will no longer be needed.

Answering both 7 and 8 here: I updated the ?infinity_flow man page to remove references to export_data() and correct_background(). I think it is better to not export these functions as the wrapper function infinity_flow() calls them towards the end of execution. The idea is that users input FCS files, and export_data() and correct_background() will be called at the end of the pipeline to provide the user with augmented FCS files. There is an option for no export, in the event that users do not want to export their data as FCS or CSV files, which is now documented in ?infinity_flow when describing the extra_args_export and extra_args_correct_background arguments. Of note, the infinity_flow() function also returns R objects if users prefer to work within R.

9. We disourage the use of `LazyData: true`. Its only benefit is that it makes your datasets _immediately_ available after the package is loaded i.e. the user does not need to do `data(steady_state_lung)` before they can access the `steady_state_lung` object. But since you're doing `data(steady_state_lung)` in the vignettes and man pages, which is good, you can just remove `LazyData: true` from your DESCRIPTION file.

I changed it to LazyData: false in the DESCRIPTION file.

10. Coding style:

* Please use `<-` rather than `=` for assignment, with one space before and after the arrow.

Done

* In my experience the 2 `R CMD BiocCheck` notes about using `vapply()` instead of `sapply()` and using `seq_len()` or `seq_along()` instead of `1:N` are spot-on. Don't ignore them.

I replaced calls to sapply by either vapply or lapply, and seq_len instead of 1:n.

11. Typo: "This This function will load ..." in `?select_backbone_and_exploratory_markers`.

Thanks, I fixed it.

12. Just as an FYI: I've expressed my frustration about the flowSet container on our community-bioc Slack (#bigdata-rep channel). No action required from your side but I wanted to give you a heads up.

Noted, I will make sure to maintain usability if anything gets updated.

Thanks, H.

Thanks again for taking the time to test and review the package, it is very helpful to have an external pair of eyes looking at it.

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

ebecht commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Since the Warnings are related to the static vignettes in doc/inst, please let me know if they are acceptable. Thank you! Edit: trying to fix it using .Rbuildignore

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

d193120 Added inst/doc to .Rbuildignore 0df8ff6 Re-added inst/doc vignettes and fixed .gitignore

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

b536d96 Fixed Perl regular expression for inst/doc in .Rbu... 78bdcea Version increase to trigger Bioconductor build

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

ac1b92b Reverted handling of doc/ which is added to .Rbuil...

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

c1ced76 Fixed broken link in Readme.md

ebecht commented 4 years ago

@hpages I answered your comments in a previous reply, however one caused more issues than I originally thought.

Your first comment was about removing the doc/ folder, and I tried moving the .html vignettes to inst/doc because I would like to keep an html version on the github that I can link to provide online documentation to interested users. However moving the vignettes to inst/doc while having source vignettes in vignettes seems to cause a WARNING from R CMD build.

I tried circumventing the warning by adding inst/doc to .Rbuildignore but that caused more errors than it fixed, actually resulting in errors during building of the package.

So I think the best way to both keep an html version of the vignette without polluting the resulting tarball is to leave the doc/ folder, keeping in mind that it is excluded in .Rbuildignore and therefore will not be present in the built package. It should only be present on github. R CMD check --as-cran throws a WARNING if it finds doc/ in the tarball, which is not the case.

Please let me know if you feel unsatisfied about how I handled doc/, and feel free to let me know if you have other concerns about how I handled other points.

Thank you! Etienne

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hpages commented 4 years ago

Hi @ebecht ,

Thanks for all the changes.

Hosting a static HTML vignette on GitHub and pointing your users to it is not a good idea and something with strongly discourage. A static HTML vignette on GitHub has 2 problems:

  1. It is very likely to contain stale code.
  2. It is generally out-of-sync with the current release version of your package. Remember that Bioconductor has 6-month release cycles with a notion of stable release.

Once the package is accepted, it will have a landing page on our website with links to the HTML vignettes so you'll be able to point your users to them. For example, the permanent link to the landing page of the SummarizedExperiment package is https://bioconductor.org/packages/SummarizedExperiment and for infinityFlow it will be https://bioconductor.org/packages/infinityFlow

Unlike with a static HTML vignette on GitHub, the permanent link to an HTML vignette on the Bioconductor landing page is guaranteed to always remain in sync with the release version of your package and to contain valid code.

Please remove the doc/ folder from your package source tree.

Thanks, H.

bioc-issue-bot commented 4 years ago

Received a valid push; starting a build. Commits are:

7584522 Removing ./doc/ folder - Removed all the files fr...

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/infinityFlow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

ebecht commented 4 years ago

Thanks for the feedbacks @hpages

I understand your point about Bioconductor releases and GitHub versions of the packages being potentially out of sync. I removed doc/ from the package's git repository.

Feel free to let me know if there is anything else. Best, Etienne

hpages commented 4 years ago

Parfait, merci ! Package is good to go.

H.

bioc-issue-bot commented 4 years ago

Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be posed to this issue in the next several days.

Thank you for contributing to Bioconductor!

ebecht commented 4 years ago

Thank you for reviewing @hpages !

Etienne

mtmorgan commented 4 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/ebecht.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("infinityFlow"). The package 'landing page' will be created at

https://bioconductor.org/packages/infinityFlow

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.