mllg / checkmate

Fast and versatile argument checks
https://mllg.github.io/checkmate/
Other
261 stars 30 forks source link

Fix: testNumeric() function treated the matrix as numeric #253

Open randef1ned opened 8 months ago

randef1ned commented 8 months ago

This PR fixed a bug that testNumeric() function treated the matrix as numeric.

> testNumeric(2)
[1] TRUE
> testNumeric(matrix(2))  # The bug appears
[1] TRUE
SamGG commented 8 months ago

Hi, x being a matrix, my understanding is that testMatrix(x) checks that x has the shape/characteristics of a matrix and testNumeric(x) checks the values of x are in a certain range. I don't understand why you consider testNumeric(matrix(2)) should return FALSE. Could you elaborate? Best.

randef1ned commented 8 months ago

Hello SamGG,

Your understanding is right and reasonable. However, the function testNumeric checks whether an argument is a numeric vector according to the documentation. It is misleading that a user (like me) assumes that the testNumeric function can automatically reject other data types, which would raise errors or yield unwanted results. For example, I would like to add a matrix and a vector. It would raise an error if I input a matrix. Therefore, it is necessary to return false for testNumeric.

> a <- matrix(1:4, nrow = 2); b <- 2
> test_numeric(b, lower = 0)   # b passed the check
[1] TRUE
> a + b    # b is expected to be a numeric vector; no error
     [,1] [,2]
[1,]    3    5
[2,]    4    6
> c <- matrix(2)    # c is a matrix, but it can pass the check
> test_numeric(c, lower = 0)
[1] TRUE
> a + c        # throw error because c is not a numeric vector but a matrix
Error in a + c : non-conformable arrays

According to your understanding, I think it is better to create a new function, e.g., testNumericRange(x), to check the values of x that are in a certain range to avoid any misleading and confoundings.

tdeenes commented 8 months ago

@randef1ned

You shall consider testNumeric as a more powerful alternative to base::is.numeric. Note that is.numeric(matrix(1)) returns TRUE and testNumeric(matrix(1)) does the same, which is the right thing in the current context. If you really need an atomic vector, you shall explicitly test for it, so use testNumeric(x) && testAtomicVector(x). You can easily create your custom check function and then use makeAssertionFunction and makeExpectationFunction to get your own test[check/assert/expect]NumericVector.

SamGG commented 8 months ago

The doc of checkmate::testNumeric says Check that an argument is a vector of type numeric. It should avoid the misleading term vector. Maybe Check that an argument is of type numeric is more in phase with base::is.numeric and what I like it currently performs.

randef1ned commented 8 months ago

Got it. Thanks @tdeenes

It is expected that the checkmate package will behave exactly as base R. Maybe we should inform the users of this information in the documentation.

I agree with the suggestion proposed by @SamGG that the documentation should be modified to avoid any misleading. Dropping the word vector in the documentation seems to break some packages relying on the checkmate package, because a regular user would get to know the packages through the examples and documentation and would never read the source code inside a package. According to the documentation, the testNumeric(x) is a function to check whether x is a numeric vector. Perhaps we need to conduct some surveys to see the understandings of the users.

checkmate is a widely-used R package, so, any changes to the existing function should be thoroughly evaluated.

randef1ned commented 8 months ago

I have just done a quick survey. I searched the testNumeric( function in all R scripts hosted on GitHub. Among the top 40 of the search results, only one repo (MichaelChirico/distr6) checks the input x as a matrix or vector. Therefore, I think it is unsafe to drop the word vector in the testNumeric documentation. Here is the result table:

repo file line type of x
xoopR/distr6 R/Wrapper_MixtureDistribution.R 53 numeric vector
    64 numeric vector
mlr-org/mlr R/StackedLearner.R 666 numeric vector
emanuelhuber/RGPR R/checkArgs.R 55 number
    63 number
    72 numeric vector
    80 numeric vector
    96 number
    107 number
    119 number
    133 number
RaphaelS1/Rsmisc R/makeChecks.R 22 number
nlmixr2/rxode2parse R/rudf.R 466 number
wpgp/wopr R/parseFilename.R 48 unknown
    56 unknown
    58 unknown
mlr-org/mlrMBO R/evalTargetFun.R 52 numeric vector
jakobbossek/smoof R/makeSingleObjectiveFunction.R 70 numeric vector
bips-hb/innsight R/Converter.R 1259 integer or vector
SixiangHu/DataMan R/DataSummary.R 47 numeric vector
    60 numeric vector
    75 numeric vector
rubak/spatstat.revdep geometr/R/gt_stretch.R 36 number
    38 number
juliedwhite/miamiplot R/check_miami_input.R 43 number
dimfalk/kostra2010R R/get_centroid.R 35 numeric vector
aschersleben/cointReg R/check-arguments.R 235 numeric vector
jakobbossek/netgen R/autoplotNetwork.R 42 numeric vector
nlmixr2/rxode2 R/rxsolve.R 827 number
yonicd/typeR data/genthat_extracted_code/checkmate/examples/checkNumeric.Rd.R 11 number
    12 number
mlr-org/mlrCPO R/FormatCheck.R 626 numeric vector
    684 numeric vector
MichaelChirico/distr6 R/SDistribution_MultivariateNormal.R 217 matrix | vector
    257 matrix | vector
jordansquair/splatter_batch R/KersplatParams-methods.R 78 unknown
bioc/SNPhood R/DataClasses.R 924 unknown
    1101 unknown
    1106 unknown
    1121 unknown
    1137 unknown
    1145 unknown
eddelbuettel/checkmate R/checkNumeric.R 17 number
    18 number
wibeasley/checkmate R/checkNumeric.R 22 number
    23 number
dimfalk/netatmo.weather R/get_extent.R 31 numeric vector
aschersleben/NBCD R/naiveBayes.R 63 numeric vector
rkayastha/RGPR R/checkArgs.R 55 number
    63 numeric vector
    71 numeric vector
    87 number
    98 number
    110 number
    124 number
thomascherickal/mlr R/StackedLearner.R 650 numeric vector
cran/triact R/001_load_methods.R 38 integer vector
mnwright/mlr R/StackedLearner.R 649 numeric vector
jakob-r/mlr R/StackedLearner.R 650 numeric vector