HenrikBengtsson / matrixStats

R package: Methods that Apply to Rows and Columns of Matrices (and to Vectors)
https://cran.r-project.org/package=matrixStats
203 stars 33 forks source link

'colVars' is defunct? #222

Closed hpages closed 2 years ago

hpages commented 2 years ago

I don't think so but that's what the error message below says:

library(matrixStats)
m0 <- matrix(0, nrow=12, ncol=9)
set.seed(123)
m0[sample(length(m0), 12)] <- round(20*runif(12), 3)

colSds(m0, center=-1:7)
# Error: 'colVars' is defunct.
# Use 'colVars() was called with a 'center' argument that does not meet the
# assumption that estimating the variance using the 'primary' or the 'alternative'
# formula does not matter as they should give the same results. This suggests
# a misunderstanding on what argument 'center' should be. The reason was:
# Mean relative difference: 0.8455205' instead. See help("Defunct")

Several problems with this error message:

  1. I called colSds(), not colVars(), so it's strange that I get an error telling me that the latter is defunct.
  2. AFAIK colVars() is not defunct.
  3. The overall explanation given in the message is rather cryptic, especially the last sentence which seems to be missing something:
    The reason was: Mean relative difference: 0.8455205' instead.
  4. I only get the error once. Subsequent calls work fine:
    colSds(m0, center=-1:7)
    # [1] 5.770199 0.000000 2.979597 2.088932 3.065407 7.234541 5.222330 6.282948
    # [9] 7.591849

Thanks, H.

sessionInfo():

R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-4.2.0/lib/libRblas.so
LAPACK: /home/hpages/R/R-4.2.0/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] matrixStats_0.62.0

loaded via a namespace (and not attached):
[1] compiler_4.2.0 tools_4.2.0   
HenrikBengtsson commented 2 years ago

For those who don't know the history, the background for this can be found in https://github.com/HenrikBengtsson/matrixStats/issues/183.

  1. I called colSds(), not colVars(), so it's strange that I get an error telling me that the latter is defunct.

Updated message to also mention colSds(), cf.

  1. AFAIK colVars() is not defunct.

This was fixed in commit 389e3d1766bd256742c5d1300db80ba404aed788.

  1. The overall explanation given in the message is rather cryptic, especially the last sentence which seems to be missing something:

The reason was: Mean relative difference: 0.8455205' instead.

The stray "instead" was appended by .Defunct(), which was automatically fixed by commit 389e3d1766bd256742c5d1300db80ba404aed788.

  1. I only get the error once. Subsequent calls work fine: ...

I've added (commit 32b7d795) a section to help("rowVars") on how center is validated and updated the error message too;

> library(matrixStats)
> m0 <- matrix(0, nrow=12, ncol=9)
> colSds(m0, center=-1:7)
Error: Detected incorrect use of argument 'center' for colVars() or colCols(). The value
of 'center' does not meet the assumption that estimating the variance using the 'primary' 
or the 'alternative' formula does not matter as they should give the same results, which
suggests a misunderstanding on what argument 'center' should be. Please see 
help("rowVars", package = "matrixStats"). The reason was: Mean relative difference: 2
(this validation is performed every 50 call to this function per R option
'matrixStats.vars.formula.freq')
const-ae commented 2 years ago

I think the new error message might have a typo: "argument 'center' for colVars() or colCols"?

HenrikBengtsson commented 2 years ago

I think the new error message might have a typo: "argument 'center' for colVars() or colCols"?

Thanks for spotting this. Fixed.

hpages commented 2 years ago

Thanks Henrik.

Note that the matrixStats.vars.formula.freq option does not seem to be working as advertised (I'm using matrixStats 0.62.0-9002):

library(matrixStats)

rowVars(matrix(1:12, nrow=4), center=11:14)
# Error: Detected incorrect use of argument 'center' for rowVars() or rowSds(). The value of 'center'
# does not meet the assumption that estimating the variance using the 'primary' or the 'alternative'
# formula does not matter as they should give the same results, which suggests a misunderstanding
# on what argument 'center' should be. Please see help("rowVars", package = "matrixStats").
# The reason was: Mean relative difference: 3.214286 (this validation is performed every 50 call to
# this function per R option 'matrixStats.vars.formula.freq')

getOption("matrixStats.vars.formula.freq")
# [1] 50
options(matrixStats.vars.formula.freq=2)

rowVars(matrix(1:12, nrow=4), center=11:14)
# [1] 70 70 70 70
rowVars(matrix(1:12, nrow=4), center=11:14)
# [1] 70 70 70 70
rowVars(matrix(1:12, nrow=4), center=11:14)
# [1] 70 70 70 70
rowVars(matrix(1:12, nrow=4), center=11:14)
# [1] 70 70 70 70
# etc...
HenrikBengtsson commented 2 years ago

the matrixStats.vars.formula.freq option does not seem to be working as advertised

There was a thinko in the code. It would indeed eventually acknowledge the new setting, but not until the previous one has completed a cycle (i.e. after 50 times). Fixed in commit 2c29e05