Closed PeteHaitch closed 7 years ago
Hi @PeteHaitch
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: DelayedMatrixStats
Type: Package
Title: Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects
Version: 0.99.0
Authors@R: c(person("Peter", "Hickey", role = c("aut", "cre"),
email = "peter.hickey@gmail.com"))
Author: Peter Hickey <peter.hickey@gmail.com>
Maintainer: Peter Hickey <peter.hickey@gmail.com>
Description: A port of the 'matrixStats' API for use with DelayedMatrix objects
from the 'DelayedArray' package. High-performing functions operating on rows
and columns of DelayedMatrix objects, e.g. col / rowMedians(),
col / rowRanks(), and col / rowSds(). Functions optimized per data type and
for subsetted calculations such that both memory usage and processing time is
minimized.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 6.0.1
Depends:
DelayedArray
Imports:
methods,
matrixStats (>= 0.52.2-9000),
Matrix,
S4Vectors,
IRanges
Suggests:
testthat,
HDF5Array,
knitr,
rmarkdown,
covr,
BiocStyle,
microbenchmark,
profmem
VignetteBuilder: knitr
URL: https://github.com/PeteHaitch/DelayedMatrixStats
BugReports: https://github.com/PeteHaitch/DelayedMatrixStats/issues
Remotes: HenrikBengtsson/matrixStats@develop
biocViews: Infrastructure, DataRepresentation, Software
Ah, bugger. I just remembered I'm depending on a version of matrixStats that isn't yet available on CRAN. The fix I'm relying on is https://github.com/HenrikBengtsson/matrixStats/issues/110
I'll take another look at this tonight to see if I can get it working with the released version of matrixStats.
The next CRAN release of matrixStats likely won't come out in time for the next BioC release (https://github.com/HenrikBengtsson/matrixStats/issues/114), so I'll need to workaround https://github.com/HenrikBengtsson/matrixStats/issues/110. It's a little ugly because it will involve a change to the order of the arguments in a generic (albeit a rarely used generic/method), so in the initial BioC release of DelayedMatrixStats the arguments would be the older, 'wrong' order and then a patch applied to DelayedMatrixStats to update to the newer, 'correct' order once CRAN matrixStats is updated to include the patch.
Your package has been approved for building. Your package is now submitted to our queue.
IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005171138.html
Since this package interacts so closely with the DelayedArray package, I have some issues I would appreciate some advice on. See https://github.com/PeteHaitch/DelayedMatrixStats/labels/initial_release for the ones I think are pertinent for the initial release. Thanks!
I wonder if some or all of these could just be folded into the DelayedArray package itself? Seems problematic to have a colSums()
and colSums2()
.
I appreciate that concern. matrixStats::colSums2()
has extra functionality over base::colSums()
.
library(matrixStats)
args(colSums)
#> function (x, na.rm = FALSE, dims = 1L)
#> NULL
args(colSums2)
#> function (x, rows = NULL, cols = NULL, na.rm = FALSE, dim. = dim(x),
#> ...)
#> NULL
The main benefit of colSums2()
/rowSums2()
/colMeans2()
/rowMeans2()
is the capability to compute these summaries over subsets of the data while avoiding a copy of x
. While this behaviour may not be achievable for all types of DelayedMatrix objects, I'd like to retain the same API as matrixStats wherever possible.
But what's the point of those arguments when you can just lazily subset the object?
But what's the point of those arguments when you can just lazily subset the object?
Yes, that's what I mean by "this behaviour may not be achievable for all types of DelayedMatrix objects". However, by at least offering colSums2()
the same code will work for matrix and DelayedMatrix objects.
I can sort of see the desire for compatibility with matrixStats. But why not give colSums()
the same treatment as colSums2()
? I think we really want people calling the standard function.
But why not give colSums() the same treatment as colSums2()? I think we really want people calling the standard function.
So from my perspective the ultimate would be to have base::colSums()
have the capabilities of matrixStats::colSums2()
, but I suspect that's not what you mean :) And to clarify, colSums()
does already exist in DelayedArray (and uses the block-processing strategy), but, yes, some of the tricks used in colSums2()
could be brought over (e.g., 'seed-aware' methods).
Re your earlier comment,
I wonder if some or all of these could just be folded into the DelayedArray package itself?
Again, ultimately, that may be the right thing to do. Right now, at least for me, decoupling the two makes development a little easier. I'd certainly be willing to work with @hpages to fold the entire package into DelayedArray if it was decided this was the best way forward.
Ok, cool, it would be great if DelayedArray could be optimized for the seeds that it knows about, and use whatever general tricks DelayedMatrixStats is using.
Totally agree that matrixStats should be folded into base R.
Received a valid push; starting a build. Commits are:
c99ef6d Develop again CRAN release of matrixStats - Remove...
Received a valid push; starting a build. Commits are:
d2b1f2f Update API table in README and vignette
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005190015.html
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS, skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005191011.html
Received a valid push; starting a build. Commits are:
9b143f8 Small changes vignette - Fix typos - Allow compili...
'BiocCheck' is not recognized as an internal or external command, operable program or batch file.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005193439.html
Received a valid push; starting a build. Commits are:
8761a73 Version bump to kick off new build
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171006151438.html
package 'matrixStats' was built under R version 3.4.2
, concerns a dependency and not DelayedMatrixStatsYou're right. We'll have a review to you soon.
Any updates on this, @dvantwisk ? I appreciate that there were many packages submitted just before the deadline and that you are all swamped with reviews.
My main question is probably for @hpages and/or @mtmorgan - would you be interested in folding this package directly into DelayedArray? I'm of course willing to do the grunt work and write up a pull request. Obviously, this is now unlikely to happen in time for the next BioC release, but I'd rather get this right, making it broadly useful to developers, than rushing it. Please let me know what you think of this idea or whether you think it is better to contribute + maintain as a separate package. Thanks!
I am reviewing it now, I will try to have something for you shortly.
I am just finishing up looking through you R files. The package looks expertly written and am just checking off on a few more things in your R code, to see if I have any issues with it at all. The review below is for everything aside from the R code, I'll give you this first so you can correct these issues:
GENERAL:
.gitignore
file:
DelayedMatrixStats.Rproj
.travis.yml
buggy
codecov.yml
man-roxygen
appears to be unnecessary. It seems all relevant man pages
are included in the man
directory.DESCRIPTION:
MIT
is fine.Received a valid push; starting a build. Commits are:
be31f4e Update LICENSE Remove buggy/
Thanks, @dvantwisk!
Regarding those extra files:
buggy
contains old code, which I have removed in https://github.com/PeteHaitch/DelayedMatrixStats/commit/be31f4e59aa60bf524f89370f8b6ac4d91721278DelayedMatrixStats.Rproj
, .travis.yml
, codecov.yml
are files I use to integrate with RStudio, Travis CI, and CodeCov, respectivelyman-roxygen
is necessary to generate the .Rd
files (it is used for 'templated parameters' with roxygen2). Files in this directory are not necessary to build the package on the BioC servers, but are necessary for local development of the package.So I have deleted (1). However, if I add (2) and (3) to .gitignore then when I (or anyone else) clone the package from the BioC git server the package code will be incomplete for development purposes. What I would like is for these to be part of the git repository but not part of the R package, so my solution was to put them in .Rbuildignore
but not in .gitignore
. Is there a better way?
Regarding the license, I was following the advice from http://r-pkgs.had.co.nz/description.html . This seems to be used by other BioC packages with an MIT license (e.g., http://bioconductor.org/packages/ccrepe/ http://bioconductor.org/packages/LedPred/, http://bioconductor.org/packages/isomiRs/)
Thanks, Pete
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171016135519.html
Okay, I see the rational so those files can stay out of .gitignore
. Aside from that, I can't find any issues with the R code. I'm waiting for a supervisor to answer a question, but aside from that, I'm going to move to accept the package at the end of the day
Thanks for your review, @dvantwisk!
Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be sent to the maintainer email address in the next several days.
Thank you for contributing to Bioconductor!
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/PeteHaitch.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using biocLite(\"DelayedMatrixStats\")
. The package 'landing page' will be created at
https://bioconductor.org/packages/DelayedMatrixStats
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
I am familiar with the essential aspects of Bioconductor software management, including:
For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.