Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

DelayedMatrixStats #514

Closed PeteHaitch closed 7 years ago

PeteHaitch commented 7 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 7 years ago

Hi @PeteHaitch

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: DelayedMatrixStats
Type: Package
Title: Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects
Version: 0.99.0
Authors@R: c(person("Peter", "Hickey", role = c("aut", "cre"),
    email = "peter.hickey@gmail.com"))
Author: Peter Hickey <peter.hickey@gmail.com>
Maintainer: Peter Hickey <peter.hickey@gmail.com>
Description: A port of the 'matrixStats' API for use with DelayedMatrix objects 
  from the 'DelayedArray' package. High-performing functions operating on rows 
  and columns of DelayedMatrix objects, e.g. col / rowMedians(), 
  col / rowRanks(), and col / rowSds(). Functions optimized per data type and 
  for subsetted calculations such that both memory usage and processing time is 
  minimized.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 6.0.1
Depends:
  DelayedArray
Imports:
  methods,
  matrixStats (>= 0.52.2-9000),
  Matrix,
  S4Vectors,
  IRanges
Suggests:
  testthat,
  HDF5Array,
  knitr,
  rmarkdown,
  covr,
  BiocStyle,
  microbenchmark,
  profmem
VignetteBuilder: knitr
URL: https://github.com/PeteHaitch/DelayedMatrixStats
BugReports: https://github.com/PeteHaitch/DelayedMatrixStats/issues
Remotes: HenrikBengtsson/matrixStats@develop
biocViews: Infrastructure, DataRepresentation, Software
PeteHaitch commented 7 years ago

Ah, bugger. I just remembered I'm depending on a version of matrixStats that isn't yet available on CRAN. The fix I'm relying on is https://github.com/HenrikBengtsson/matrixStats/issues/110

I'll take another look at this tonight to see if I can get it working with the released version of matrixStats.

The next CRAN release of matrixStats likely won't come out in time for the next BioC release (https://github.com/HenrikBengtsson/matrixStats/issues/114), so I'll need to workaround https://github.com/HenrikBengtsson/matrixStats/issues/110. It's a little ugly because it will involve a change to the order of the arguments in a generic (albeit a rarely used generic/method), so in the initial BioC release of DelayedMatrixStats the arguments would be the older, 'wrong' order and then a patch applied to DelayedMatrixStats to update to the newer, 'correct' order once CRAN matrixStats is updated to include the patch.

bioc-issue-bot commented 7 years ago

Your package has been approved for building. Your package is now submitted to our queue.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005171138.html

PeteHaitch commented 7 years ago

Since this package interacts so closely with the DelayedArray package, I have some issues I would appreciate some advice on. See https://github.com/PeteHaitch/DelayedMatrixStats/labels/initial_release for the ones I think are pertinent for the initial release. Thanks!

lawremi commented 7 years ago

I wonder if some or all of these could just be folded into the DelayedArray package itself? Seems problematic to have a colSums() and colSums2().

PeteHaitch commented 7 years ago

I appreciate that concern. matrixStats::colSums2() has extra functionality over base::colSums().

library(matrixStats)

args(colSums)
#> function (x, na.rm = FALSE, dims = 1L) 
#> NULL

args(colSums2)
#> function (x, rows = NULL, cols = NULL, na.rm = FALSE, dim. = dim(x), 
#>     ...) 
#> NULL

The main benefit of colSums2()/rowSums2()/colMeans2()/rowMeans2() is the capability to compute these summaries over subsets of the data while avoiding a copy of x. While this behaviour may not be achievable for all types of DelayedMatrix objects, I'd like to retain the same API as matrixStats wherever possible.

lawremi commented 7 years ago

But what's the point of those arguments when you can just lazily subset the object?

PeteHaitch commented 7 years ago

But what's the point of those arguments when you can just lazily subset the object?

Yes, that's what I mean by "this behaviour may not be achievable for all types of DelayedMatrix objects". However, by at least offering colSums2() the same code will work for matrix and DelayedMatrix objects.

lawremi commented 7 years ago

I can sort of see the desire for compatibility with matrixStats. But why not give colSums() the same treatment as colSums2()? I think we really want people calling the standard function.

PeteHaitch commented 7 years ago

But why not give colSums() the same treatment as colSums2()? I think we really want people calling the standard function.

So from my perspective the ultimate would be to have base::colSums() have the capabilities of matrixStats::colSums2(), but I suspect that's not what you mean :) And to clarify, colSums() does already exist in DelayedArray (and uses the block-processing strategy), but, yes, some of the tricks used in colSums2() could be brought over (e.g., 'seed-aware' methods).

Re your earlier comment,

I wonder if some or all of these could just be folded into the DelayedArray package itself?

Again, ultimately, that may be the right thing to do. Right now, at least for me, decoupling the two makes development a little easier. I'd certainly be willing to work with @hpages to fold the entire package into DelayedArray if it was decided this was the best way forward.

lawremi commented 7 years ago

Ok, cool, it would be great if DelayedArray could be optimized for the seeds that it knows about, and use whatever general tricks DelayedMatrixStats is using.

Totally agree that matrixStats should be folded into base R.

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

c99ef6d Develop again CRAN release of matrixStats - Remove...

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

d2b1f2f Update API table in README and vignette

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005190015.html

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005191011.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

9b143f8 Small changes vignette - Fix typos - Allow compili...

PeteHaitch commented 7 years ago
bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171005193439.html

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

8761a73 Version bump to kick off new build

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171006151438.html

PeteHaitch commented 7 years ago
dvantwisk commented 7 years ago

You're right. We'll have a review to you soon.

PeteHaitch commented 7 years ago

Any updates on this, @dvantwisk ? I appreciate that there were many packages submitted just before the deadline and that you are all swamped with reviews.

My main question is probably for @hpages and/or @mtmorgan - would you be interested in folding this package directly into DelayedArray? I'm of course willing to do the grunt work and write up a pull request. Obviously, this is now unlikely to happen in time for the next BioC release, but I'd rather get this right, making it broadly useful to developers, than rushing it. Please let me know what you think of this idea or whether you think it is better to contribute + maintain as a separate package. Thanks!

dvantwisk commented 7 years ago

I am reviewing it now, I will try to have something for you shortly.

dvantwisk commented 7 years ago

I am just finishing up looking through you R files. The package looks expertly written and am just checking off on a few more things in your R code, to see if I have any issues with it at all. The review below is for everything aside from the R code, I'll give you this first so you can correct these issues:

GENERAL:

DESCRIPTION:

bioc-issue-bot commented 7 years ago

Received a valid push; starting a build. Commits are:

be31f4e Update LICENSE Remove buggy/

PeteHaitch commented 7 years ago

Thanks, @dvantwisk!

Regarding those extra files:

  1. buggy contains old code, which I have removed in https://github.com/PeteHaitch/DelayedMatrixStats/commit/be31f4e59aa60bf524f89370f8b6ac4d91721278
  2. DelayedMatrixStats.Rproj, .travis.yml, codecov.yml are files I use to integrate with RStudio, Travis CI, and CodeCov, respectively
  3. man-roxygen is necessary to generate the .Rd files (it is used for 'templated parameters' with roxygen2). Files in this directory are not necessary to build the package on the BioC servers, but are necessary for local development of the package.

So I have deleted (1). However, if I add (2) and (3) to .gitignore then when I (or anyone else) clone the package from the BioC git server the package code will be incomplete for development purposes. What I would like is for these to be part of the git repository but not part of the R package, so my solution was to put them in .Rbuildignore but not in .gitignore. Is there a better way?

Regarding the license, I was following the advice from http://r-pkgs.had.co.nz/description.html . This seems to be used by other BioC packages with an MIT license (e.g., http://bioconductor.org/packages/ccrepe/ http://bioconductor.org/packages/LedPred/, http://bioconductor.org/packages/isomiRs/)

Thanks, Pete

bioc-issue-bot commented 7 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the following build report for more details:

http://bioconductor.org/spb_reports/DelayedMatrixStats_buildreport_20171016135519.html

dvantwisk commented 7 years ago

Okay, I see the rational so those files can stay out of .gitignore. Aside from that, I can't find any issues with the R code. I'm waiting for a supervisor to answer a question, but aside from that, I'm going to move to accept the package at the end of the day

PeteHaitch commented 7 years ago

Thanks for your review, @dvantwisk!

bioc-issue-bot commented 7 years ago

Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be sent to the maintainer email address in the next several days.

Thank you for contributing to Bioconductor!

mtmorgan commented 7 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/PeteHaitch.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using biocLite(\"DelayedMatrixStats\"). The package 'landing page' will be created at

https://bioconductor.org/packages/DelayedMatrixStats

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.