Closed davismcc closed 7 years ago
Hi @davismcc
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: slalom
Type: Package
Title: Factorial Latent Variable Modeling of Single-Cell RNA-Seq Data
Version: 0.99.0
Date: 2017-10-06
Author: Florian Buettner
Maintainer: Davis McCarthy <davis@ebi.ac.uk>
Authors@R: c(person("Florian", "Buettner", role=c("aut", "cre"),
email="buettner@ebi.ac.uk"), person("Naruemon",
"Pratanwanich", role=c("aut"), email="np394@ebi.ac.uk"),
person("Davis", "McCarthy", role=c("aut"), email="davis@ebi.ac.uk"),
person("John", "Marioni", role = c("aut"), email="marioni@ebi.ac.uk"),
person("Oliver", "Stegle", role = c("aut"), email="stegle@ebi.ac.uk"))
Description: slalom is a scalable modelling framework for single-cell RNA-seq
data that uses gene set annotations to dissect single-cell transcriptome
heterogeneity, thereby allowing to identify biological drivers of
cell-to-cell variability and model confounding factors.
Depends: R (>= 3.4)
Imports: Rcpp (>= 0.12.8), RcppArmadillo, BH, GSEABase, methods, rsvd,
SingleCellExperiment, stats
Suggests: knitr, rhdf5, testthat
LinkingTo: Rcpp, RcppArmadillo, BH
License: GPL (>= 2)
VignetteBuilder: knitr
LazyData: true
Encoding: UTF-8
biocViews: SingleCell, RNASeq, Normalization,
Visualization, DimensionReduction, Transcriptomics,
GeneExpression, Sequencing, Software, Reactome
RoxygenNote: 6.0.1
Your package has been approved for building. Your package is now submitted to our queue.
IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/slalom_buildreport_20171006164543.html
Hi Davis,
Thank you for your contribution to Bioconductor. It is a general a very nice package, with only some minor issues.
Generally looks good. Succinct and informative. Some suggestions:
model <- newSlalomModel(mesc, genesets, n_hidden = 5, min_genes = 10)
warning: Found more than one class "Rcpp_SlalomModel" in cache; using the first, from namespace 'slalom' Also defined by ‘.GlobalEnv’.Thanks, Qian
In addition to Qian's comments, we noticed that the current code creates two class definitions, in the package name space and in the global environment. I believe one should use the class name to generate the class; the use of module$SlalomModel creates the second class definition.
/slalom master$ git diff
diff --git a/R/SlalomModel-methods.R b/R/SlalomModel-methods.R
index e7543f3..21991e8 100644
--- a/R/SlalomModel-methods.R
+++ b/R/SlalomModel-methods.R
@@ -99,9 +99,8 @@ newSlalomModel <- function(
.newSlalom <- function(Y, pi_init, n_hidden, design = NULL) {
n_known <- ifelse(is.null(design), 0, ncol(design))
- slalom_module <- Rcpp::Module("SlalomModel", PACKAGE = "slalom")
out <- new(
- slalom_module$SlalomModel,
+ "Rcpp_SlalomModel",
Y_init = Y,
pi_init = pi_init,
X_init = matrix(0, nrow = nrow(Y), ncol(pi_init)),
I'm not an Rcpp expert, though...
Thanks, Qian and Martin for the thoughtful feedback. I have just committed some changes that deal with most of these issues:
Rcpp
docs and should have been using Rcpp::loadModule
instead of Rcpp::Module
- that seems to be behaving much better now.I'm planning to update the vignette with more details as you suggest (I'll add some plotting functions for results between now and Monday) and tidy up the R code too.
One final thing where your feedback would be appreciated: I've been vacillating on whether to have generic methods (init
, train
) to initialise and train the Rcpp_SlalomModel
objects, or just have regular functions (e.g. initSlalom
, trainSlalom
) that require the object argument to be a Rcpp_SlalomModel
. Do you have opinions about which approach is preferable? Generics need less typing from the user, but specific functions are, erm, more specific and avoid any possible NAMESPACE clashes. If you have opinions that will help me decide on mine!
Best, Davis
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/slalom_buildreport_20171012103001.html
Personally I like the initSlalom()
/ trainSlalom()
approach, because there are unlikely to be other 'init' and 'train' methods with the same contract as yours, and because simple names like init()
tend to lead to conflicts more often than name-mangled.
This means that the type checking performed by the method needs to be made explicit in the body of the plain-old-function.
I know others on the core team would argue differently, so you can pick your poison and know that your are satisfying some people (or dissatisfying others, depending on how you view the cup).
Hi Davis,
Thank for your quick response of the review. Here is the summary of your updates:
Best, Qian
Thanks, both, especially your comments on the function naming, Martin. I have come to accept that it's impossible to please everyone. I was leaning towards the consistency (e.g. with newSlalomModel
) and relative simplicity of initSlalom()
/trainSlalom()
, so I'll go with that.
I'll update ASAP over the next couple of days.
Hi both,
I've added some plotting functions, as well as a pre-training method for the model and fixed a handful of bugs. I realise today's the feature freeze, but if I can (be permitted to) I'll add a method tomorrow to write add results from slalom
to a SingleCellExperiment
object.
I'll also add more to the vignette and check over the docs - I assume I still have a few days to sort out those things?
Best, Davis
One adjacent query that I couldn't figure out today: gene set names from REACTOME and the like can be very long, so I'd like to be able to show users in the vignette how to truncate these (e.g. to 30 characters). However, trying the following gave a large number of warnings:
gmtfile <- system.file("extdata", "reactome_subset.gmt", package = "slalom")
genesets <- GSEABase::getGmt(gmtfile)
for (i in seq_along(genesets)) {
GSEABase::setName(genesets[[i]]) <- gsub("REACTOME_", "",
GSEABase::setName(genesets[[i]]))
GSEABase::setName(genesets[[i]]) <- strtrim(GSEABase::setName(genesets[[i]]), 30)
}
I couldn't figure out from the GSEABase
docs how to do this happily. Perhaps you know?
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/slalom_buildreport_20171016202011.html
Received a valid push; starting a build. Commits are:
6f67d21 Fixing warnings from BiocCheck
Apologies, there were a couple of sloppy things that caused warnings from BiocCheck
. I've fixed these and don't get any errors or warnings from BiocCheck
now.
The bigger issue with the errors on the Linux and Windows builds is due to my tests (which I've developed on Mac OS X). I specifically test the model output for agreement with different starting conditions. It seems that the Rcpp
code compiles differently on Linux and Window, so yields subtly different convergence behaviour. As a result, some of my strict tests fail on these platforms.
Do you have any suggestions for handling this sort of thing? Obviously, I could weaken/reduce tests, but that doesn't seem ideal!
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/slalom_buildreport_20171017055913.html
OK, things are looking happier now :)
HI Davis,
Thank you for making changes to your functions, classes and tests. These will make your package more robust. However, seems that you wasn't able to add some more explanation about the outputs or plotting functions to your vignette. As I know, if your package could not get accepted by today, it will go to the next release.
About your question of GSEABase, I could reproduce your warnings but found this to work without warnings:
gmtfile <- system.file("extdata", "reactome_subset.gmt", package = "slalom") genesets <- GSEABase::getGmt(gmtfile) nchar(unlist(lapply(genesets@.Data, GSEABase::setName)))
gs1 <- GSEABase::GeneSetCollection( lapply( genesets, function(x){ GSEABase::setName(x) <- gsub("REACTOME_", "", GSEABase::setName(x)) x } ) ) nchar(unlist(lapply(gs1@.Data, GSEABase::setName)))
@mtmorgan may have some better idea about this question.
Best, Qian
Hi Qian,
Thanks for the update. I didnt realise that the vignette issues were crucial to making it into the release. Nevertheless, I am shortly (I mean with the next 20 mins) about to commit updates to the vignette that include much more discussion of model output and plots.
I really hope that you'll be able to accept the package after my next commit. I've put a lot of effort into getting it into this release so it would be a real shame to miss it by such a small margin.
I did manage to solve the GeneSetCollection with lapply()
in the meantime.
Expect the updated vignette very, very shortly.
Best, Davis
OK, it's all there! I sincerely hope you can accept the package into this release!
D.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/slalom_buildreport_20171017145456.html
Alright, that's all looking good! Do you need anything more from me for acceptance of the package? (Starting to get late here in UK).
Best, Davis
Hi Davis,
I just confirmed with the bioconductor core team and your package should still be accepted and make the release (if @mtmorgan goes over it, at the soonest tomorrow, and thought it could be accepted). Sorry for the wrong information I gave in last message. The current release of Bioconductor 3.5 is frozen today. So you still have chance to go to the current devel 3.6 and be in the new release after oct. 31.
In the meantime, could you also give a look at this NOTE message? this could be easily fixed by adding a zzz.R
, utils::globalVariables(n_gain, term, ...)
. It's not mandatory though.
* checking R code for possible problems ... [14s] NOTE
plotRelevance: no visible binding for global variable 'n_gain'
plotRelevance: no visible binding for global variable 'term'
plotRelevance: no visible binding for global variable 'n_loss'
Best, Qian
The vignette is more comprehensive and prettier with figures and explanations about each step. Just one thing:
sessionInfo()
in the end for your R environment. Once this done, it will be all good to me. @mtmorgan
Best, Qian
Thanks, Qian. I've fixed the NOTE and added sessionInfo() to the vignette in the latest commit. Fingers crossed for Martin's review and acceptance.
Best, Davis
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/slalom_buildreport_20171017175226.html
Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be sent to the maintainer email address in the next several days.
Thank you for contributing to Bioconductor!
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/davismcc.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using biocLite(\"slalom\")
. The package 'landing page' will be created at
https://bioconductor.org/packages/slalom
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
[NB: I am submitting this package under my account, though it is currently hosted through a GitHub account belonging to the Stegle Group and EMBL-EBI]
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed. [NB: there is currently one warning, which we believe to be an issue with GSEABase as reported on Bioc-devel mailing list in the last 24 hours or so]
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
I am familiar with the essential aspects of Bioconductor software management, including:
For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.