bv2 commented 4 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/bioFAM/MOFA2

Confirm the following by editing each check box to '[x]'

[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

I am familiar with the essential aspects of Bioconductor software management, including:

[x] The 'devel' branch for new packages and features.
[x] The stable 'release' branch, made available every six months, for bug fixes.
[x] Bioconductor version control using Git (optionally via GitHub).

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 4 years ago

Hi @bv2

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: MOFA2
Type: Package
Title: Multi-Omics Factor Analysis v2
Version: 0.99.0
Maintainer: Britta Velten <britta.velten@gmail.com>
Author: Ricard Argelaguet, Damien Arnol, Danila Bredikhin, Britta Velten
Date: 2018-09-03
License: GPL (>= 2)
Description: The MOFA2 package contains a collection of tools for running and analysing MOFA models.
Encoding: UTF-8
Depends: R (>= 4.0)
Imports: rhdf5, dplyr, tidyr, reshape2, pheatmap, ggplot2, methods, GGally, RColorBrewer, cowplot, ggrepel, reticulate, HDF5Array, grDevices, stats, magrittr, forcats, utils, corrplot, ggrastr, DelayedArray, Rtsne, uwot
Suggests: knitr, testthat, Seurat, ggpubr, doParallel, foreach, psych, MultiAssayExperiment
biocViews: DimensionReduction, Bayesian, Visualization
VignetteBuilder: knitr
LazyData: false
NeedsCompilation: yes
RoxygenNote: 7.1.1
SystemRequirements: Python (>=3), numpy, pandas, h5py, scipy, argparse, sklearn, mofapy2

bioc-issue-bot commented 4 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/MOFA2 to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bv2 commented 4 years ago

Hi @LiNk-NY, thank you for taking care of our package! Currently, the package fails during the build as it relies on the python package mofapy2 as specified in the DESCRIPTION. mofapy2 is available on https://pypi.org/project/mofapy2/. The package builds without errors on systems, where mofapy2 is installed but here it is missing this dependency. Would it be possible to add this dependency to the builder? Thanks!

alanocallaghan commented 4 years ago

@bv2 - I'm not part of the bioc team so take this with a grain of salt. However you may have better luck using basilisk to manage your python environment using conda.

LiNk-NY commented 4 years ago

Hi Britta, @bv2 Thank you for your submission to Bioconductor. Please see the review below. If you have any questions, feel free to post them here.

Best, Marcel

MOFA2 #1661

Should MOFA go under deprecation procedures?
Why not release MOFA version 2.0.0 instead?
As @Alanocallaghan said, consider using basilisk for possibly smoother integration to python.
Consider inheriting from a standard Bioconductor representation (e.g., SummarizedExperiment, SingleCellExperiment, etc.). The main representation could benefit from having a more structured format including accessor functions.

DESCRIPTION

Preferably use Authors@R field for more detailed authorship attributions as well as inclusion of ORCID identifiers.
Include a BugReports field pointing to the GitHub issues location.

NAMESPACE

My preference is to avoid using names that are application specific and use or reuse a more general term (e.g., using sample_names vs colnames, features_names vs rownames, and factor_names vs levels) but this is optional.
Where possible, reduce the function name length to more manageable names.
Look into the possible reuse of the plot generic if there are package classes involved in plot_*.
Consider re-exporting the %>% if you intend your users to use the pipe as well.

vignette

Avoid using the @ accessor and use a more formalized interface with such as samples_metadata(model) or even metadata(model) if you inherit from the Annotated class. (instead of model@samples_metadata; applies to model@cache as well).
Update any commented code (Line#204)
Keep the text within the 80 character width limit.
Remove any HTML documents from the vignettes folder

R

Use || to ensure that conditions on either side return a single logical value.
(optional) Show method can be condensed based on the values of object@status and nfactors
Remove any old / commented code (R/AllGenerics.R and others).
Where possible, use vectorized operations (e.g., lapply(r2_mk, function(x) x*100) could be unlist > *100 > relist operation
Where possible, avoid nesting apply family loops (R/calculate_variance_explained.R)
If models were a formalized S4 class with validity checks, you wouldn't have to use "sanity checks" in compare_elbo and select_model
Avoid using class(x) == "character". As seen with the matrix class, it is better to use the is.character function (R/correlate_covariates.R).
Check that MultiAssayExperiment namespace is available (using requireNamespace) in .create_mofa_from_mae function otherwise return an error to the user and ask them to install the package.
Setter functions (e.g., names(x)<-value) can be used to 'initialise' the MOFA object (rather than using @).
Provide users a way to turn the messages off (R/enrichment.R).
Is there a reason for using doParrallel insted of BiocParallel? I do not know what the differences are.
Checking if the class .hasSlot(x, 'on_disk') should not be necessary, unless there are old serialized instances of this class. If so, you should have an updateObject method as well.
Perhaps the make_example_data function (not exported) should be either in data-raw or inst/extdata / inst/scripts and saved as a serialized dataset in the package using usethis::use_data.
Because the package includes a lot of functions, consider separating the plotting and the infrastructure into separate packages.
Provide additional details for the user in some functions (e.g., get_default_tranining_options) Ideally, the documentation should answer questions like 'What is the function for?', 'At what point in the process is it used?' etc.
Avoid using semicolons ; and move the code to another line
Where possible use character instead of numeric indexing (e.g., x[1] vs x["group"])
.onAttach startup messages can become verbose if all packages are using them, avoid their use where possible.

Minor:

Use <- arrow for assignment instead of the = equals sign.
length(x) > 1 is equal to length(x) when evaluated within an if conditional
Avoid using isTRUE(x) on arguments that are logical, simply use if (x) (R/correlate_covariates.R).
Condense message functions into one (R/enrichment.R).

/ pkg folder

Remove scripts template_script* from this folder and possibly place them in inst/scripts.
images should be in the vignettes folder for use
I recommended that you create a separate repository for mofapy2 and only keep relevant R code in the repository.
Remove the setup.py script.

tests

It's good that you have quite a few unit tests. The recommended way of adding tests is according to file name (e.g., test_enrichment.R for testing enrichment.R) but this is optional
Avoid downloading resources from external websites such as GitHub these can change at any time (in test_creating_model.R). Provide your own dataset or obtain one from running their examples (if possible).
More tests can't hurt :)

NEWS.md

Include a NEWS file

bv2 commented 4 years ago

Thanks Marcel and Alan for your comments. We will work on the points in the review and certainly take a look at basilisk. This sounds interesting!

LTLA commented 4 years ago

I'll also point out that http://bioconductor.org/books/devel/OSCA/integrating-with-protein-abundance.html#integration-with-gene-expression-data has a section waiting for MOFA2, if it can accept SingleCellExperiment objects.

bv2 commented 4 years ago

Hi Marcel @LiNk-NY ,

thanks again for your time to carefully review our package. We went through all the points that you (as well as Aaron and Alan) raised above and updated our package to incorporate many of these comments in the newest commits (current version 0.99.10). Details on the points are listed below.

Should MOFA go under deprecation procedures? Yes as soon as MOFA2 is released MOFA should be deprecated 
Why not release MOFA version 2.0.0 instead? Yes, this could have been an option from the beginning. At the current stage, though, a new package seems more sensible to us.
As @Alanocallaghan said, consider using basilisk for possibly smoother integration to python. We looked into basilisk but it reduces the speed of the model training in our package. Therefore we do not yet include it but we will work on a workaround that provides at least the option to use basilisk if installation via pip and interface with reticulate do not succeed.  
We added support for SingleCellExperiments as requested by Aaron  
Consider inheriting from a standard Bioconductor representation (e.g., SummarizedExperiment, SingleCellExperiment, etc.). The main representation could benefit from having a more structured format including accessor functions. We provide wrappers to construct a MOFA object starting from these representations, however would like to keep our main MOFA class as is.

DESCRIPTION

Preferably use Authors@R field for more detailed authorship attributions as well as inclusion of ORCID identifiers. Done 
Include a BugReports field pointing to the GitHub issues location. Done

NAMESPACE

My preference is to avoid using names that are application specific and use or reuse a more general term (e.g., using sample_names vs colnames, features_names vs rownames, and factor_names vs levels) but this is optional. We stick to the current naming conventions as we do not want to confuse current users of MOFA or MOFA2.  
Where possible, reduce the function name length to more manageable names. We stick to the current naming conventions as we do not want to confuse current users of MOFA2.  
Look into the possible reuse of the plot generic if there are package classes involved in plot_. All plot functions are based on the MOFA class. *
Consider re-exporting the %>% if you intend your users to use the pipe as well. Done

vignette

Avoid using the @ accessor and use a more formalized interface with such as samples_metadata(model) or even metadata(model) if you inherit from the Annotated class. (instead of model@samples_metadata; applies to model@cache as well). Done
Update any commented code (Line#204) Done 
Keep the text within the 80 character width limit. Done 
Remove any HTML documents from the vignettes folder Done 

R
Use || to ensure that conditions on either side return a single logical value. Done 
(optional) Show method can be condensed based on the values of object@status and nfactors. We prefer a more detailed output here .
Remove any old / commented code (R/AllGenerics.R and others). Done 
Where possible, use vectorized operations (e.g., lapply(r2_mk, function(x) x100) could be unlist > 100 > relist operation Done 
Where possible, avoid nesting apply family loops (R/calculate_variance_explained.R) Not sure how to avoid these if not replaced by other nested structures. 
If models were a formalized S4 classf with validity checks, you wouldn't have to use "sanity checks" in compare_elbo and select_model As these two functions are for very advanced usage only and most users only work with a single object, we feel a simple list serves this purpose. 
Avoid using class(x) == "character". As seen with the matrix class, it is better to use the is.character function (R/correlate_covariates.R). Done 
Check that MultiAssayExperiment namespace is available (using requireNamespace) in .create_mofa_from_mae function otherwise return an error to the user and ask them to install the package. Done 
Setter functions (e.g., names(x)<-value) can be used to 'initialise' the MOFA object (rather than using @). Done where possible, some setter functions only work on trained objects. 
Provide users a way to turn the messages off (R/enrichment.R). Done 
Is there a reason for using doParrallel insted of BiocParallel? I do not know what the differences are. Done , this was deprecated and hence removed as dependency. 
Checking if the class .hasSlot(x, 'on_disk') should not be necessary,unless there are old serialized instances of this class. If so, you should have an updateObject method as well. Done 
Perhaps the make_example_data function (not exported) should be either in data-raw or inst/extdata / inst/scripts and saved as a serialized dataset in the package using usethis::use_data. This function is exported and allows users to change different dimensions of the model, therefore we would keep it. 
Because the package includes a lot of functions, consider separating the plotting and the infrastructure into separate packages.
We prefer having a single package. 
Provide additional details for the user in some functions (e.g., get_default_tranining_options) Ideally, the documentation should answer questions like 'What is the function for?', 'At what point in the process is it used?' etc. Done 
Avoid using semicolons ; and move the code to another line Done 
Where possible use character instead of numeric indexing (e.g., x[1] vs x["group"]) Done where possible 
.onAttach startup messages can become verbose if all packages are using them, avoid their use where possible. Done

Minor:

Use <- arrow for assignment instead of the = equals sign. Done 
length(x) > 1 is equal to length(x) when evaluated within an if conditional lenght(x) > 1 seems more intuitive/readbale to us
Avoid using isTRUE(x) on arguments that are logical, simply use if (x) (R/correlate_covariates.R). Done 
Condense message functions into one (R/enrichment.R). Done

/ pkg folder

Remove scripts template_script from this folder and possibly place them in inst/scripts. Done *
images should be in the vignettes folder for use Done , we removed them as they are not in use any more. 
I recommended that you create a separate repository for mofapy2 and only keep relevant R code in the repository. Yes, this makes sense and we are considering this for the next release. However, as we currently have ongoing developments that affect both the R and python code we would like to keep them joint until then (and thus still have the same structure as the MOFA package).
Remove the setup.py script. See above point, will be moved to a python code repository together with mofapy2  folder

tests

It's good that you have quite a few unit tests. The recommended way of adding tests is according to file name (e.g., test_enrichment.R for testing enrichment.R) but this is optional Done 
Avoid downloading resources from external websites such as GitHub these can change at any time (in test_creating_model.R). Provide your own dataset or obtain one from running their examples (if possible). Done 
More tests can't hurt :) We added some more

NEWS.md

Include a NEWS file Done

bv2 commented 4 years ago

@LiNk-NY Quick update: We also implemented usage of basilisk for integration with python (now version 0.99.12)

LiNk-NY commented 4 years ago

Hi Britta, @bv2 Thanks for making those changes and responding to the review. Consider adding mofapy2, setup.py and the Dockerfile to .Rbuildignore or moving them to the inst/scripts folder. I will accept your package. Best regards, Marcel

Note. Bump the package version and resolve any errors / warnings in the build report.

bioc-issue-bot commented 4 years ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

mtmorgan commented 4 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/bv2.keys is not empty), then no further steps are required. Otherwise, do the following:

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("MOFA2"). The package 'landing page' will be created at

https://bioconductor.org/packages/MOFA2

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.

Bioconductor / Contributions

MOFA2 #1661

MOFA2 #1661

DESCRIPTION

NAMESPACE

vignette

R

/ pkg folder

tests

NEWS.md

DESCRIPTION

NAMESPACE

vignette

R

/ pkg folder

tests

NEWS.md