HenrikBengtsson / matrixStats

R package: Methods that Apply to Rows and Columns of Matrices (and to Vectors)
https://cran.r-project.org/package=matrixStats
202 stars 33 forks source link

CRAN: Additional issues found by rchk #248

Closed HenrikBengtsson closed 3 months ago

HenrikBengtsson commented 4 months ago

From https://raw.githubusercontent.com/kalibera/cran-checks/master/rchk/results/matrixStats.out:

Package matrixStats version 1.2.0
Package built using 86189/R 4.4.0; x86_64-pc-linux-gnu; 2024-03-25 22:19:23 UTC; unix   
Checked with rchk version fdc068715daa3a256062cc20e0d4a5157dacc9a4 LLVM version 14.0.6
More information at https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md
For rchk in docker image see https://github.com/kalibera/rchk/blob/master/doc/DOCKER.md

Function diff2
  [UP] unprotected variable ans while calling allocating function Rf_getAttrib(?,S:names) matrixStats/src/diff2.c:62
  [UP] allocating function setNamesDiff(V,?,?,?,?) may destroy its unprotected argument (ans <arg 1>), which is later used. matrixStats/src/diff2.c:64
  [UP] calling allocating function setNamesDiff(V,?,?,?,?) with a fresh pointer (ans <arg 1>) matrixStats/src/diff2.c:64
  [UP] calling allocating function setNamesDiff(V,?,?,?,?) with a fresh pointer (namesVec <arg 2>) matrixStats/src/diff2.c:64
yaccos commented 4 months ago

I see the problem. Lines 52 and 56 of diff2.c UNPROTECTs the answer vector even though the vector is used later for naming. Hence, unprotecting the answer must take place after the chunk of code related to naming. I assume it is save to move the statement UNPROCTECT(1) to right before the return of the function because exactly one protected answer vector is always allocated unless the function throws an error and thus never returns.

https://github.com/HenrikBengtsson/matrixStats/blob/9d3c1e6e815abcc16b1821a871e64838c5f65f6e/src/diff2.c#L48C2-L59C4

HenrikBengtsson commented 4 months ago

Thxs. Would you mind providing a patch?

HenrikBengtsson commented 4 months ago

Following the instructions on https://github.com/kalibera/cran-checks/tree/master/rchk, I can reproduce the rchk report on the CRAN version using:

$ docker run -v "$PWD/packages:/rchk/packages" kalibera/rchk:latest /rchk/packages/matrixStats_1.2.0.tar.gz
source("/rchk/scripts/utils.r"); install_package_libs("/rchk/packages/matrixStats_1.2.0.tar.gz")
Warning: unable to access index for repository https://master.bioconductor.org/packages/3.11/bioc/src/contrib:
  cannot open URL 'https://master.bioconductor.org/packages/3.11/bioc/src/contrib/PACKAGES'
Warning: unable to access index for repository https://master.bioconductor.org/packages/3.11/data/annotation/src/contrib:
  cannot open URL 'https://master.bioconductor.org/packages/3.11/data/annotation/src/contrib/PACKAGES'
Warning: unable to access index for repository https://master.bioconductor.org/packages/3.11/data/experiment/src/contrib:
  cannot open URL 'https://master.bioconductor.org/packages/3.11/data/experiment/src/contrib/PACKAGES'
Warning: unable to access index for repository https://master.bioconductor.org/packages/3.11/workflows/src/contrib:
  cannot open URL 'https://master.bioconductor.org/packages/3.11/workflows/src/contrib/PACKAGES'
* installing *source* package 'matrixStats' ...
not using staged install with --libs-only
** using non-staged installation
** libs
using C compiler: 'Ubuntu clang version 14.0.0-1ubuntu1'
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c 000.init.c -o 000.init.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c allocMatrix2.c -o allocMatrix2.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c anyMissing.c -o anyMissing.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c binCounts.c -o binCounts.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c binMeans.c -o binMeans.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c colCounts.c -o colCounts.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c colOrderStats.c -o colOrderStats.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c colRanges.c -o colRanges.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c diff2.c -o diff2.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c indexByRow.c -o indexByRow.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c logSumExp.c -o logSumExp.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c mean2.c -o mean2.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c naming.c -o naming.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c productExpSumLog.c -o productExpSumLog.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c psortKM.c -o psortKM.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowCounts.c -o rowCounts.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowCummaxs.c -o rowCummaxs.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowCummins.c -o rowCummins.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowCumprods.c -o rowCumprods.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowCumsums.c -o rowCumsums.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowDiffs.c -o rowDiffs.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowLogSumExp.c -o rowLogSumExp.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowMads.c -o rowMads.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowMeans2.c -o rowMeans2.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowMedians.c -o rowMedians.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowOrderStats.c -o rowOrderStats.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowRanges.c -o rowRanges.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowRanksWithTies.c -o rowRanksWithTies.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowSums2.c -o rowSums2.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c rowVars.c -o rowVars.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c signTabulate.c -o signTabulate.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c sum2.c -o sum2.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c validateIndices.c -o validateIndices.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c weightedMean.c -o weightedMean.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c weightedMedian.c -o weightedMedian.o
wllvm -I"/rchk/build/lib/R/include" -DNDEBUG   -I/usr/local/include    -fPIC  -Wall -g -O0 -fPIC  -c x_OP_y.c -o x_OP_y.o
wllvm -shared -L/usr/local/lib -o matrixStats.so 000.init.o allocMatrix2.o anyMissing.o binCounts.o binMeans.o colCounts.o colOrderStats.o colRanges.o diff2.o indexByRow.o logSumExp.o mean2.o naming.o productExpSumLog.o psortKM.o rowCounts.o rowCummaxs.o rowCummins.o rowCumprods.o rowCumsums.o rowDiffs.o rowLogSumExp.o rowMads.o rowMeans2.o rowMedians.o rowOrderStats.o rowRanges.o rowRanksWithTies.o rowSums2.o rowVars.o signTabulate.o sum2.o validateIndices.o weightedMean.o weightedMedian.o x_OP_y.o
installing to /rchk/packages/libsonly/matrixStats/libs
* DONE (matrixStats)
Installed libraries of package  matrixStats 
[1] "/rchk/packages/libsonly/matrixStats/libs/matrixStats.so"

ERROR: too many states (abstraction error?) in function strptime_internal
ERROR: too many states (abstraction error?) in function RunGenCollect

Function diff2
  [UP] unprotected variable ans while calling allocating function Rf_getAttrib(?,S:names) /rchk/packages/build/dYhEbrkZ/matrixStats/src/diff2.c:62
  [UP] allocating function setNamesDiff(V,?,?,?,?) may destroy its unprotected argument (ans <arg 1>), which is later used. /rchk/packages/build/dYhEbrkZ/matrixStats/src/diff2.c:64
  [UP] calling allocating function setNamesDiff(V,?,?,?,?) with a fresh pointer (ans <arg 1>) /rchk/packages/build/dYhEbrkZ/matrixStats/src/diff2.c:64
  [UP] calling allocating function setNamesDiff(V,?,?,?,?) with a fresh pointer (namesVec <arg 2>) /rchk/packages/build/dYhEbrkZ/matrixStats/src/diff2.c:64
Analyzed 276 functions, traversed 22484 states.
Library name (usually package name): matrixStats
Initialization function: R_init_matrixStats
Functions: 37
Checked call to R_registerRoutines: 1

Rchk version: 1cae90e208e97a5c41f1c3e128d99b197478443e
R version: 84255/R Under development (unstable) (2023-04-13 r84255)
LLVM version: 14.0.0

When running on the patch version in the develop branch, there is one remaining issue:

$ docker run -v "$PWD/packages:/rchk/packages" kalibera/rchk:latest /rchk/packages/matrixStats_1.2.0-9000.tar.gz
source("/rchk/scripts/utils.r"); install_package_libs("/rchk/packages/matrixStats_1.2.0-9000.tar.gz")
...
Function diff2
  [UP] calling allocating function setNamesDiff(V,?,?,?,?) with a fresh pointer (namesVec <arg 2>) /rchk/packages/build/Y7pSfikM/matrixStats/src/diff2.c:62
Analyzed 276 functions, traversed 22484 states.
Library name (usually package name): matrixStats
Initialization function: R_init_matrixStats
Functions: 37
Checked call to R_registerRoutines: 1

Rchk version: 1cae90e208e97a5c41f1c3e128d99b197478443e
R version: 84255/R Under development (unstable) (2023-04-13 r84255)
LLVM version: 14.0.0
HenrikBengtsson commented 3 months ago

Fixed the remaining 'rchk' issue, cf. commit 9f045980

yaccos commented 3 months ago

I wondered what the latest rchk issue was all about because attributes of a protected object is by itself protected. According to this blog post and the rchk guide, it may have been a false positive by rchk. However, whether getAttrib() actually allocates is a so complex question that the rchk developers have decided to err on the side of caution.