SPONGE package for inferring genome-wide competing endogenous RNA networks

mlist commented 7 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/mlist/SPONGE

Confirm the following by editing each check box to '[x]'

[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

I am familiar with the essential aspects of Bioconductor software management, including:

[x] The 'devel' branch for new packages and features.
[x] The stable 'release' branch, made available every six months, for bug fixes.
[x] Bioconductor version control using Subversion (optionally via GitHub).

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 7 years ago

Hi @mlist

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: SPONGE
Type: Package
Title: Sparse Partial Correlations On Gene Expression
Version: 0.99.0
Author: Markus List, Marcel Schulz
Maintainer: Markus List <markus.list@mpi-inf.mpg.de>
Description: This package provides methods to efficiently detect competitive endogeneous RNA interactions between two genes. Such interactions are mediated by one or several miRNAs such that both gene and miRNA expression data for a larger number of samples is needed as input.
License: GPL (>=3) 
LazyData: TRUE
RoxygenNote: 6.0.1
Depends: 
    R (>= 3.4)
Suggests:
    testthat,
    knitr,
    rmarkdown
Imports:
    stats,
    ggplot2,
    ggrepel,
    gridExtra,
    ppcor,
    logging,
    foreach,
    dplyr,
    data.table,
    visNetwork,
    bigmemory,
    tidyr,
    digest,
    MASS,
    expm,
    gRbase,
    glmnet,
    igraph,
    infotheo,
    iterators,
    itertools,
    matrixStats,
    mirbase.db
VignetteBuilder: knitr
biocViews: GeneExpression,
    Transcription,
    GeneRegulation,
    NetworkInference,
    Transcriptomics,
    SystemsBiology,
    Regression

bioc-issue-bot commented 7 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

mtmorgan commented 7 years ago

I'll provide a more complete review in a few days.

Please modify your package so that it interacts smoothly with the Bioconductor ecosystem, in particular accepting Biobase::ExpressionSet as a resource for gene expression values.

Carefully consider the number of packages listed in your DESCRIPTION file and used in your code -- trim only marginally useful functionality so that the dependencies are reduces and your package less susceptible to breakage when upstream changes occur.

Consider using BiocParallel so that Bioconductor users have a common parallel experience; see it's vignette for developer guidelines.

mlist commented 7 years ago

Thanks for the quick response. I'll add support for Biobase:ExpressionSet right away and I will also see if I can remove a few packages from DESCRIPTION.

With respect to BiocParallel: I was looking into this. I use foreach for parallelization and mention in the vignette that this is compatible with BiocParallel as long as DoparParam is used. Should I handle this differently?

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/SPONGE_buildreport_20170829045931.html

mtmorgan commented 7 years ago

Please modify your package in response to these and the original comments, and provide a brief response comment when ready for the review to continue.

General

With respect to BiocParallel -- I personally find the extensive use of foreach() and data.table() makes your code difficult to parse, but I realize that this is my limitation.
I strongly think that the end user should not be exposed to data.table (because of the awkward syntax and pass-by-reference semantics).
The use of foreach() with %do% seems really (to me) to just obscure basic iteration; is it really adding value?

DESCRIPTION / NAMESPACE

Please simplify the dependencies

vignettes

please remove build products (SPONGE.html) from your master branch
Is it neccessary to present verbose loggin information? This will not be useful to the individual trying to follow your package.

R

generally the code seems to be well structured and appropriate.

fn_simulation.R:213 I doubt that the early exit of

i <- 1
within_constraints <- TRUE
while(i < length(u2.scaled)){
    if((u2.scaled[i] > constraints[i]) ||
       (u2.scaled[i] < -constraints[i])) {
        within_constraints <- FALSE
        break
    }
    i <- i+1
}

gains any speed relative to (did I get the abs() right?)

within_constraints <- any(abs(u2.scaled) > constraints)

1:n specification of domains for iteration are problematic when n == 0; use seq_len(n) or seq_along() instead. I don't know how this plays with foreach().

NEWS

hmm, looks like a place-holder; write and maintain so parsable by tools::news() or omit.

tests

nice to see a stab at tests, but really these need to be much more extnesive and focused on 'units' of computation, rather than at the integrative level of sponge(), etc.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

0231e91 version bump for bioconductor build system

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/SPONGE_buildreport_20170907164152.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

5ea5234 travis build is successful. version bump to trigge...

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/SPONGE_buildreport_20170908050520.html

mlist commented 7 years ago

General

Please modify your package so that it interacts smoothly with the Bioconductor ecosystem, in particular accepting Biobase::ExpressionSet as a resource for gene expression values.

All functions now accept ExpressionSet as gene or miRNA expression data.

With respect to BiocParallel -- I personally find the extensive use of foreach() and data.table() makes your code difficult to parse, but I realize that this is my limitation.

We deliberately chose to work with foreach and data.table for performance reasons and would like to stick to it, also considering that the use of foreach is officially supported by BiocParallel. We explain in the vignette how to register the parallel backend of foreach with BiocParallel.

I strongly think that the end user should not be exposed to data.table (because of the awkward syntax and pass-by-reference semantics).

The new default behaviour of sponge is to return a data frame but we keep the option to return a data table via a parameter return_as_dt.

The use of foreach() with %do% seems really (to me) to just obscure basic iteration; is it really adding value?

We see two advantages: (i) consistency of looping which allows you to switch to parallel execution where needed; (ii) foreach is highly configurable and allows specifying a function for combining results, for instance.

DESCRIPTION / NAMESPACE

Please simplify the dependencies

We have removed some dependencies and turned others into suggestions.

vignettes

please remove build products (SPONGE.html) from your master branch

Done.

Is it neccessary to present verbose loggin information? This will not be useful to the individual trying to follow your package.

We have changed the default logging level to "ERROR" to avoid verbose logging information.

R

generally the code seems to be well structured and appropriate.

Thank you.

fn_simulation.R:213 I doubt that the early exit of

i <- 1
within_constraints <- TRUE
while(i < length(u2.scaled)){
    if((u2.scaled[i] > constraints[i]) ||
       (u2.scaled[i] < -constraints[i])) {
        within_constraints <- FALSE
        break
    }
    i <- i+1
}

gains any speed relative to (did I get the abs() right?)

within_constraints <- any(abs(u2.scaled) > constraints)

We fully agree and have changed the code accordingly.

1:n specification of domains for iteration are problematic when n == 0; use seq_len(n) or seq_along() instead. I don't know how this plays with foreach().

Good point. We have changed this as well.

NEWS

hmm, looks like a place-holder; write and maintain so parsable by tools::news() or omit. tests

tools::checkNEWS seems to be defunct. We have tried to follow the example of Biobase for changing the NEWS file.

nice to see a stab at tests, but really these need to be much more extnesive and focused on 'units' of computation, rather than at the integrative level of sponge(), etc.

We agree and have implemented a full suite of unit tests now.

Looking forward to your feedback on the improved version.

mtmorgan commented 7 years ago

Thanks these changes look great.

bioc-issue-bot commented 7 years ago

Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be sent to the maintainer email address in the next several days.

Thank you for contributing to Bioconductor!

mtmorgan commented 7 years ago

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using biocLite("YOUR_PACKAGE_NAME"). The package 'landing page' will be created at

https://bioconductor.org/packages/YOUR_PACKAGE_NAME

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.

Bioconductor / Contributions

SPONGE package for inferring genome-wide competing endogenous RNA networks #461