Closed khazum closed 1 year ago
Hi @khazum
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: ccImpute
Type: Package
Title: ccImpute: an accurate and scalable consensus clustering based approach to
impute dropout events in the single-cell RNA-seq data
Version: 0.99.0
Authors@R: person("Marcin", "Malec",
email = "mamalec@iu.edu",
role = c("cre", "aut"),
comment = c(ORCID = "0000-0002-2354-513X"))
Description: Dropout events make the lowly expressed genes indistinguishable
from true zero expression and different than the low expression present in
cells of the same type. This issue makes any subsequent downstream analysis
difficult. ccImpute is an imputation algorithm that uses cell similarity
established by consensus clustering to impute the most probable dropout
events in the scRNA-seq datasets. ccImpute demonstrated performance which
exceeds the performance of existing imputation approaches while introducing
the least amount of new noise as measured by clustering performance
characteristics on datasets with known cell identities.
License: GPL-3
Imports: Rcpp, doParallel, foreach, matrixStats, parallel, stats, SIMLR
LinkingTo: Rcpp, RcppArmadillo
Encoding: UTF-8
LazyData: false
RoxygenNote: 7.1.2
biocViews: SingleCell, PrincipalComponent, DimensionReduction,
Clustering, RNASeq, Transcriptomics
biocType: Software
Suggests:
knitr, rmarkdown, BiocStyle, sessioninfo, scRNAseq, scater,
SingleCellExperiment, mclust
VignetteBuilder: knitr
on a mac that can install just about anything:
* installing *source* package 'ccImpute' ...
** using staged installation
** libs
g++ -std=gnu++11 -I"/Users/vincentcarey/R-dev-dist/lib/R/include" -DNDEBUG -I'/Users/vincentcarey/R-dev-dist/lib/R/library/Rcpp/include' -I'/Users/vincentcarey/R-dev-dist/lib/R/library/RcppArmadillo/include' -I/usr/local/include -fPIC -g -O2 -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++11 -I"/Users/vincentcarey/R-dev-dist/lib/R/include" -DNDEBUG -I'/Users/vincentcarey/R-dev-dist/lib/R/library/Rcpp/include' -I'/Users/vincentcarey/R-dev-dist/lib/R/library/RcppArmadillo/include' -I/usr/local/include -fPIC -g -O2 -c cppFunctions.cpp -o cppFunctions.o
cppFunctions.cpp:4:10: fatal error: 'omp.h' file not found
#include <omp.h>
^~~~~~~
Please be sure you are configuring in a portable way.
This issue has been addressed. Please proceed with the evaluation.
A reviewer has been assigned to your package. Learn what to expect during the review process.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
I believe the warning has to do with the build system rather than my code. I hope the package review will continue smoothly. Please let me know if I can be of any further assistance.
The warning in the macOS builder looks like something that needs to be addressed in your code: https://bioconductor.org/spb_reports/ccImpute_buildreport_20220429124256.html#merida1_check_anchor I'll see if it reproduces on my Mac. Are you able to install and test the package on Mac?
Can confirm that I get the warning when checking the package on my MacBook Pro (Early-2015).
# Trimmed output of running `R CMD check ccImpute_0.99.0.tar.gz`
* checking compiled code ... WARNING
File ‘ccImpute/libs/ccImpute.so’:
Found ‘___assert_rtn’, possibly from ‘assert’ (C)
Object: ‘RcppExports.o’
Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs.
> sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0 fortunes_1.5-4
I'm not sure of the solution, but I think this indicates it's something you will need to fix within ccImpute.
I have researched this issue further, and I made appropriate changes to the source code that should take care of this warning. I believe the issue is addressed now.
Please remember to push to git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
Received a valid push on git.bioconductor.org; starting a build for commit id: b2f6a310f2106fcd37d2d0c8bf21df8ba9ba659c
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: fbfe50bddc7097d3f5d34383fa309edc915e1a48
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 65f93b013ee29d84c83390c91ea78e571210f240
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 917acc2d2462ebd5f6a3f4892d86a655b809054b
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 9f14388a3e1a8ff4860542c9af9f43bb471c25e7
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: cabce0e55796a37a1a894155047d2b6fdda17a38
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: b619983fee81984a3b1ff8a547160df608737ac7
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 1baead3e10f1c0ceaf9afa193ce44c40fc2d10b8
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: b10e0c5992329c097c0f1c40298385a334cf5d19
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 5f750bcd30cadc061eee218ca72977bc50455315
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: a9758b93b8711bdb203e601aaab9809c153db36b
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Thanks, @khazum, now that we have a clean build I will begin reviewing ccImpute. I will try to post an initial review within the next week. Please note that I will be on leave May 25 - June 8 and will be without internet access during that time.
Hi @khazum,
Thank you for submitting ccImpute to Bioconductor. Overall, the package is in good shape with clear code. For acceptance into Bioconductor, I have a few Required points, as well as some Recommended points, that I would ask you to first please address. Please add your point-by-point replies to the review in this thread and let me know if anything is unclear.
Is there a preprint or paper describing the method?
One question I had after running the vignette is that I observed that ccImpute::impute()
doesn't change the non-zero values and makes ~5% of zero values non-zero.
Is that expected and typical results?
suppressPackageStartupMessages(library(SingleCellExperiment))
suppressPackageStartupMessages(library(scRNAseq))
#> Warning: replacing previous import 'utils::download.file' by
#> 'restfulr::download.file' when loading 'rtracklayer'
suppressPackageStartupMessages(library(scuttle))
suppressPackageStartupMessages(library(ccImpute))
sce <- DarmanisBrainData()
#> snapshotDate(): 2022-05-17
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
sce <- logNormCounts(sce)
k <- length(unique(sce$cell.type))
assay(sce, "imputed") <- impute(logcounts(sce), k = k, nCores = 2)
#> Running ccImpute on dataset with 466 cells.
#> Imputation finished.
zero <- logcounts(sce) == 0
all.equal(logcounts(sce)[!zero], assay(sce, "imputed")[!zero])
#> [1] TRUE
proportions(table(assay(sce, "imputed")[zero] > 0))
#>
#> FALSE TRUE
#> 0.94864765 0.05135235
Created on 2022-06-15 by the reprex package (v2.0.1)
Cheers, Pete
set.seed()
to ensure reproducibility of results. In particular, cluster labels from k-means clustering may not be identical across repeated runs, although my understanding is that clusters will contain the same cells across repeated runs.assay(sce, "imputed") <- impute(logcounts(sce), k = k, nCores = 2)
instead of logcounts(sce) <- impute(assays(sce)$logcounts, k = k, nCores = 2)
.reducedDim
assay to store the PCA of the imputed data, i.e. reducedDim(sce, "PCA_of_imputed") <- prcomp(t(assay(sce, "imputed")))$x
or sce <- scran::fixedPCA(sce, exprs_values = "imputed", rank = ncol(sce), ntop = nrow(sce), scale = FALSE, BSPARAM = BiocSingular::ExactParam(), name = "PCA_of_imputed")
instead of reducedDims(sce) <- list(PCA=prcomp(t(assay(sce, "imputed")))$x)
.impute()
function needs to support inputs other than matrix objects. In particular, log-transformed normalized counts are routinely stored as sparse matrices (generally as a dgCMatrix object from the Matrix package; see example below), so it needs to support these. The sparseMatrixStats may help because it extends the matrixStats API to sparse matrices. I recognise that (at least in their current form) the RcppEigen-based calculations may require dense matrices, so hacky workaround would be to just call as.matrix()
on the input, although that will of course convert a sparse matrix into a dense matrix and lose the efficiencies of sparse storage and computation.
# ccImpute::impute() cannot handle sparse matrices.
suppressPackageStartupMessages(library(SingleCellExperiment))
suppressPackageStartupMessages(library(scRNAseq))
#> Warning: replacing previous import 'utils::download.file' by
#> 'restfulr::download.file' when loading 'rtracklayer'
suppressPackageStartupMessages(library(scuttle))
suppressPackageStartupMessages(library(ccImpute))
# Using a different example dataset.
sce <- BacherTCellData()
#> snapshotDate(): 2022-05-17
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
# Raw counts are stored as sparse matrix (dgCMatrix).
class(counts(sce))
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
# Consequently, logcounts are also stored as a sparse matrix (dgCMatrix)
# because by default logNormCounts() preserves sparsity.
sce <- logNormCounts(sce)
class(logcounts(sce))
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
# Which causes impute() to fail.
impute(logcounts(sce), k = length(sce$seurat_clusters), nCores = 2)
#> Running ccImpute on dataset with 104417 cells.
#> Error in rowVars(logX): Argument 'x' must be a matrix or a vector.
Created on 2022-06-15 by the reprex package (v2.0.1)
NOTES
produced by running BiocCheck::BiocCheck()
on the package. In particular, I would expect these 2 issues be addressed:
=
for assignment and use <-
instead.useRanks
, k
, consMin
, kmMax
) to the documentation of impute()
. These parameters are described in the documentaiton but it's not clear to me what 'good' values of these might be.Rcpp::checkUserInterrupt()
every X iterations (see §1.4 of https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-attributes.pdf).data
, as you do in the vignette, because it clashes with the utils::data()
function and it's not very descriptive. A rough naming convention for a SingleCellExperiment object in documentation is to use sce
.scran::clusterCells()
instead of stats::kmeans()
. I'll leave the decision to you because there is a trade off here: # R session information.
)stats::prcomp()
) or could the code benefit from approximate PCA methods, such as those available from BiocSingular (e.g., BiocSingular::runPCA()
and see help(topic = "BiocSingularParam-class", package = "BiocSingular")
). I suspect the use of approximate methods will lead to substantial speed-ups with neglible differences in the results (as well as enabling the use of sparse-matrix-aware methods).rank
argument of scran::fixedPCA()
.impute()
. If the user supplied a SingleCellExperiment object, then impute()
would return an updated SingleCellExperiment object with the imputed data stored as an approriately named additional assay. If the user supplied a matrix-like object, then it would just return a similar matrix-like object. E.g., see functions like scuttle::normalizeCounts()
and scuttle::logNormCounts()
, which support 'A numeric matrix-like object [e.g., matrix, dgCMatrix, and even more exotic ones like DelayedMatrix and HDF5Matrix]' or 'alternatively a _SingleCellExperiment or SummarizedExperiment object'.LazyData
in the DESCRIPTION
should be FALSE
(uppercase).BugReports
field to the DESCRIPTION
; see https://contributions.bioconductor.org/description.html#description-bugreportinst/CITATION
file; see https://contributions.bioconductor.org/citation.html?q=citation#citation@keywords internal
to the roxygen2 documentation; see https://r-pkgs.org/man.html?q=keywo#roxygen-commentskmeans()
because there already exists stats::kmeans()
.impute()
to ccImpute()
because impute()
is a bit general and other packages define function with same name but different interface.Closing this for now. @khazum Please re-open the issue if and when you would like to proceed with the submission.
This issue is being closed because there has been no progress for an extended period of time. You may reopen the issue when you have the time to actively participate in the review / submission process. Please also keep in mind that a package accepted to Bioconductor requires a commitment on your part to ongoing maintenance.
Thank you for your interest in Bioconductor.
Thank you, Pete, for your great feedback!
Is there a preprint or paper describing the method? Reply: Yes, the paper was published very recently, and I have included the references in the package.
One question I had after running the vignette is that I observed that ccImpute::impute() doesn't change the non-zero values and makes ~5% of zero values non-zero. Is that expected and typical results? Reply: Yes, they are. ccImpute focuses only on imputing only the most probable dropout events.
Please set.seed() to ensure reproducibility of results. In particular, cluster labels from k-means clustering may not be identical across repeated runs, although my understanding is that clusters will contain the same cells across repeated runs. Reply: Done.
Don't overwrite the existing logcounts assay. Instead, create a new assay to store the imputed data, i.e. assay(sce, "imputed") <- impute(logcounts(sce), k = k, nCores = 2) instead of logcounts(sce) <- impute(assays(sce)$logcounts, k = k, nCores = 2). Don't overwrite the existing PCA. Instead create a new reducedDim assay to store the PCA of the imputed data, i.e. reducedDim(sce, "PCA_of_imputed") <- prcomp(t(assay(sce, "imputed")))$x or sce <- scran::fixedPCA(sce, exprs_values = "imputed", rank = ncol(sce), ntop = nrow(sce), scale = FALSE, BSPARAM = BiocSingular::ExactParam(), name = "PCA_of_imputed") instead of reducedDims(sce) <- list(PCA=prcomp(t(assay(sce, "imputed")))$x). Reply: Done.
Please add an 'Installation' section to the vignette; see https://contributions.bioconductor.org/docs.html?q=vignette#installation. You may also wish to add this to the vignette. Reply: Done.
The impute() function needs to support inputs other than matrix objects. In particular, log-transformed normalized counts are routinely stored as sparse matrices (generally as a dgCMatrix object from the Matrix package; see example below), so it needs to support these. The sparseMatrixStats may help because it extends the matrixStats API to sparse matrices. I recognise that (at least in their current form) the RcppEigen-based calculations may require dense matrices, so hacky workaround would be to just call as.matrix() on the input, although that will, of course, convert a sparse matrix into a dense matrix and lose the efficiencies of sparse storage and computation. As an extension, you may want to consider support for DelayedMatrix objects (and their derivatives, like HDF5Matrix); the DelayedMatrixStats package may help because it extends the matrixStats API to DelayedMatrix objects. Again, the RcppEigen-based calculation would likely require updating to be compatible. Reply: I am settling for the hacky workaround (call as.matrix()) for the time being. I plan to incorporate better support for these matrix representations with the approximate version of ccImpute that I hope to address in future releases.
It is common in the analysis of scRNA-seq data to perform feature selection prior to dimensionality reduction (e.g., selecting only the most 'highly variable' genes for input to dimensionality reduction; see https://bioconductor.org/books/3.15/OSCA.basic/feature-selection.html and https://bioconductor.org/books/3.15/OSCA.basic/dimensionality-reduction.html). Do you recommend that when using the imputed data that a feature selections be performed and, if so, how do you recommend this is done? Please provide guidance in the vignette or other documentation. Reply: Added guidance to the vignette.
Where's the evidence that ccImpute is better than existing imputation methods (as claimed in README and vignette)? Is there a preprint or paper? Please provide a brief summary of the evidence in the package documentation. Reply: Done.
Please take all reasonable steps to address the NOTES produced by running BiocCheck::BiocCheck() on the package. In particular, I would expect these 2 issues be addressed: Avoid using = for assignment and use <- instead. Consider adding unit tests. We strongly encourage them. See https://contributions.bioconductor.org/tests.html Reply: Done.
Please add some guidance on the choice of some parameters (e.g., useRanks, k, consMin, kmMax) to the documentation of impute(). These parameters are described in the documentaiton but it's not clear to me what 'good' values of these might be. Reply: Done.
Please use BiocParallel for R-level parallelisation instead of parallel, doParallel, and foreach; see https://contributions.bioconductor.org/r-code.html?q=par#parallel-recommendations Reply: Done.
Please support user interruption of the Rcpp functionality. I think you can achieve this by adding Rcpp::checkUserInterrupt() every X iterations (see §1.4 of https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-attributes.pdf). Reply: The Rcpp compute intensive code uses openMP and it appears that it is not compatible with Rcpp::checkUserInterrupt().
Recommended improvements to vignette:
Vignette should use proper rmarkdown citations rather than plain text; see https://raw.githubusercontent.com/rstudio/cheatsheets/main/rmarkdown.pdf or https://bookdown.org/yihui/rmarkdown-cookbook/bibliography.html Reply: Done.
Generally best to avoid calling an object data, as you do in the vignette, because it clashes with the utils::data() function and it's not very descriptive. A rough naming convention for a SingleCellExperiment object in documentation is to use sce. Reply: Done.
Consider using packages and functions from the 'Orchestrating Single Cell Analysing with Bioconductor' workflow (https://osca.bioconductor.org/) rather than base functionality, e.g., scran::clusterCells() instead of stats::kmeans(). I'll leave the decision to you because there is a trade off here: On the one hand, it is likely that a user of your package will already be familiar with these packages and functions, which potentially reduces the cognitive load. Switching to these routines also highlights different options available to someone using ccImpute in their scRNA-seq analysis workflow, e.g., using approximate PCA methods () or using clustering methods other than k-means such as those described in https://bioconductor.org/books/3.15/OSCA.basic/clustering.html). On the other hand, some these functions are wrappers around base R functionality, so there's an argument for using it directly. Reply: Considered.
Please fix the formatting of the header for the 'Session info' section (I think it's just missing a whitespace and should avoid the backticks, i.e. # R session information.) Reply: Done.
Some potential improvements to PCA-based calculations (both in the package code and the vignette): Do you really require 'exact PCA' (as you are doing with calls to stats::prcomp()) or could the code benefit from approximate PCA methods, such as those available from BiocSingular (e.g., BiocSingular::runPCA() and see help(topic = "BiocSingularParam-class", package = "BiocSingular")). I suspect the use of approximate methods will lead to substantial speed-ups with neglible differences in the results (as well as enabling the use of sparse-matrix-aware methods). Do you really require all PCs or could you use the top-X (e.g., top-50) PCs? See for example the rank argument of scran::fixedPCA(). Reply: This is a great suggestion, but currently, there is no good way to predict the rank parameter (number of PCs) and I rely on accurate SD values. I hope to incorporate this in future releases.
The vignette is fairly terse. Please think of a new user reading it, e.g., some questions I imagined a new user might have: Why is looking at ARI helpful in assessing if imputation has improved the results (and what is it ARI)? What is run time (will it even work?) on a contemporary data set (e.g., n > 10,000 cells)? Reply: Done.
Recommend adding direct support for SingleCellExperiment objects to impute(). If the user supplied a SingleCellExperiment object, then impute() would return an updated SingleCellExperiment object with the imputed data stored as an appropriately named additional assay. If the user supplied a matrix-like object, then it would just return a similar matrix-like object. E.g., see functions like scuttle::normalizeCounts() and scuttle::logNormCounts(), which support 'A numeric matrix-like object [e.g., matrix, dgCMatrix, and even more exotic ones like DelayedMatrix and HDF5Matrix]' or 'alternatively a _SingleCellExperiment or SummarizedExperiment object'. Reply: I will address this in future releases.
The value of LazyData in the DESCRIPTION should be FALSE (uppercase). Reply: Done.
Please add a BugReports field to the DESCRIPTION; see https://contributions.bioconductor.org/description.html#description-bugreport Reply: Done.
Strongly recommend adding a inst/CITATION file; see https://contributions.bioconductor.org/citation.html?q=citation#citation You've documented but not exported a few functions. Reply: Done.
You may wish to add @keywords internal to the roxygen2 documentation; see https://r-pkgs.org/man.html?q=keywo#roxygen-comments Reply: Done.
Consider renaming the internal function kmeans() because there already exists stats::kmeans(). Reply: Done.
Consider renaming impute() to ccImpute() because impute() is a bit general and other packages define function with same name but different interface. Reply: Done.
Recommend adding a package-level man page; see https://contributions.bioconductor.org/docs.html#package-level-documentation. Reply: I will address this in future releases.
Consider adding runable examples to man pages for internal functions. Reply: I will address this in future releases.
@PeteHaitch Can you please reopen this issue? Thank you.
Thank you for responding to the review, @khazum I'm a little behind on my reviews but I will try to finalise the review before I go on leave next week.
Dear @khazum ,
We have reopened the issue to continue the review process. Please remember to push a version bump to git.bioconductor.org to trigger a new build.
Would you please push the changes to git.bioconductor.org
to trigger a build of the updated package.
Thanks!
Received a valid push on git.bioconductor.org; starting a build for commit id: 50c29681fc791fd120e4107dc142d3a790ca77e8
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: c18e38e5c774f3f705daa9027f17fc4cc21e2429
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 4855aa7dcf4e0a32ddb263a23801a132e88fcae0
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 1274430b7de6c734d1466b401f2e8f592253d1f8
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/ccImpute
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.