Closed daoud-sie closed 8 years ago
Hi Daoud, thanks for the report. Your observations are correct; some function calls to matrixStats may break because they assume a certain order of arguments. The solution is to use named arguments. I should have added a note about this to the NEWS file, but too late now.
In matrixStats (>= 0.50.0), we introduced optional arguments idxs
, rows
, and cols
for optimized subsetted matrixStats calculations.
When adding these arguments we debated whether to add them (i) at the very end after all existing arguments, or (ii) after the arguments that should be subsetted by them. We went with the latter intentionally.
For example, with binCounts()
we used to have:
> args(binCounts)
function (x, bx, right = FALSE, ...)
but now we have
> args(binCounts)
function (x, idxs = NULL, bx, right = FALSE, ...)
Thus, if binCounts()
was called as:
binCounts(x, bx)
then, yes, it will break, because it effectively becomes:
binCounts(x, idxs=bx)
and not
binCounts(x, bx=bx)
as intended.
Update the code to use named arguments as far as possible in all matrixStats calls. The exception is the first argument, which does not have to be named. This also applies to the second argument if that specify weights, e.g.
> args(weightedMean)
function (x, w = NULL, idxs = NULL, na.rm = FALSE, refine = FALSE, ...)
Before submitting to CRAN, I ran R CMD check --as-cran
on all 68 packages that depends on matrixStats to make sure the updated matrixStats version does not break any of the packages relying on it. This of course assumes that the package tests of those packages actually test all of its code that calls matrixStats.
I have now scanned the code of those 68 packages using static code inspection (partly by looking manually) and I only discovered one package that have this type of problem, namely (our) QDNAseq package, where the internal .binReadCountsPerSample()
function calls:
counts <- binCounts(hits[[chromosome]], chromosomeBreaks)
which if matrixStats (>= 0.50.0) is installed, equals:
counts <- binCounts(hits[[chromosome]], idxs=chromosomeBreaks)
I assume you are referring to this one.
Of course, we cannot check R scripts that are not part of CRAN or Bioconductor packages.
Thanks for the background info. I will use named arguments.
An extra argument is added in binCounts which will break code where no explicit argument names are used in calling the binCounts function. Some of my code assume "bx" to be the second argument. I can fix this in my code, but should we also fix it here?