RcppCore / Rcpp

Seamless R and C++ Integration
https://www.rcpp.org
GNU General Public License v2.0
723 stars 208 forks source link

New CRAN issues #1308

Open eddelbuettel opened 4 weeks ago

eddelbuettel commented 4 weeks ago

Per incoming email

With current R-devel, we are seeing new segfaults and corresponding UBSAN problems for packages

VAJointSurv fixest kdevine maxnodf o2geosocial openxlsx reservr vapour

with fixest and openxlsx showing

#0 0x7fd6892ef904 in Rcpp::traits::r_vector_cache<14, Rcpp::PreserveStorage>::ref(long) /data/gannet/ripley/R/test-dev/Rcpp/include/Rcpp/vector/traits.h:46
#1 0x7fd6892ef904 in Rcpp::Vector<14, Rcpp::PreserveStorage>::operator[](long) /data/gannet/ripley/R/test-dev/Rcpp/include/Rcpp/vector/Vector.h:340

and the others similarly. Can you please have a look?

[....] tells me that there are also new UBSAN issues for

BART FLOPART FLSSS LOPAR aum lme4 magi nanotime ondisc pedmod rsparse supc xrnet

most of which point to Rcpp (lme4 seems from Eigen).

with short followup

Apparently this is from

r86629 | luke | 2024-05-27 00:29:48 +0200 (Mon, 27 May 2024) | 3 lines Hide some more functions in inlined.c. Also define STRICT_TYPECHECK when compiling inlined.c.

kevinushey commented 4 weeks ago

I was able to reproduce this locally on macOS; I'll see if I can learn more.

eddelbuettel commented 4 weeks ago

Which one did you try? openxlsx may be the smaller of the two.

kevinushey commented 4 weeks ago

Indeed, I tried with openxlsx. Next step is to try and narrow down a reproducible example further...

kevinushey commented 4 weeks ago

This seems like potentially a bug in openxlsx? Here's a more minimal example:

library(openxlsx)
trace(openxlsx:::read_workbook, quote(print(utils::ls.str())))
xlsxFile <- system.file("extdata", "readTest.xlsx", package = "openxlsx")
read.xlsx(xlsxFile, 2, startRow = 3, colNames = FALSE, detectDates = TRUE)

The function read_workbook gets called with:

Browse[1]> ls.str()
clean_names : function (x, schar)  
cols_in :  int [1:88] 1 2 3 4 5 6 7 8 9 1 ...
hasColNames :  logi FALSE
hasSepNames :  chr "."
is_date :  logi [1:88] FALSE FALSE FALSE FALSE FALSE FALSE ...
nRows :  int 32
rows_in :  int [1:88] 3 3 3 3 3 3 3 3 3 4 ...
skipEmptyCols :  logi TRUE
skipEmptyRows :  logi TRUE
string_inds :  int(0) 
v :  chr [1:88] "1" "2" "3" "4" "5" "6" "7" "8" "9" "1" "2" "3" "4" "5" "6" "7" "8" "8" "2" "2" "3" "4" "4" ...

Note that string_inds is an empty vector. That eventually gets read with code in read_workbook.cpp:503:

  st_inds0[0] = string_inds[0];

So they're attempting to read from a zero-length integer vector, which fails. I'm not sure why that would've snuck by tests in the past, though.

eddelbuettel commented 4 weeks ago

We were just talking to @JanMarvin the other day in another issue so let's call him. I am sure he heard from CRAN too ...

Any insight from your end?

kevinushey commented 4 weeks ago

It looks like vapour might be running into something similar? I'm not positive, but there is code like:

  GDALDatasetH ds = gdalraster::gdalH_open_dsn(dsource[0],   0);  

where gdalH_open_dsn has the definition:

// open the DSN, open a subdataset if sds > 0 (else you get the outer shell)
inline GDALDatasetH gdalH_open_dsn(const char * dsn, IntegerVector sds) {
  GDALDatasetH DS; 
  DS = GDALOpen(dsn, GA_ReadOnly);
  if (!DS) return nullptr; 
  if (sds[0] > 0 && gdal_has_subdataset((GDALDataset*) DS)) {
    CharacterVector sdsnames = gdal_subdataset_1((GDALDataset*)DS, sds[0]);
    if (sdsnames.length() > 0 && !sdsnames[0].empty()) {
      GDALClose((GDALDataset*) DS);
      DS = GDALOpen(sdsnames[0], GA_ReadOnly);
    }

  }
  return DS;
}

That is, an IntegerVector is being constructed implicitly from the int 0, which gives a zero-length integer vector rather than an integer vector of length 1 with value 0 (which the author probably expected / assumed). That implies the sds[0] check in that function will then give an out-of-bounds access into a length-zero vector.

kevinushey commented 4 weeks ago

I also took a look at fixest and it looks like another variation on the same issue -- an attempt to read / write into a zero-length vector. Here's a reproducible example (distilled from the failing example code)

library(fixest)

set.seed(0)

base = iris
names(base) = c("y", "x1", "x2", "x3", "fe1")
base$fe2 = rep(1:5, 30)
base$y[1:5] = NA
base$x1[4:8] = NA
base$x2[4:21] = NA
base$x3[110:111] = NA
base$fe1[110:118] = NA
base$fe2[base$fe2 == 1] = 0
base$fe3 = sample(letters[1:5], 150, TRUE)
base$period = rep(1:50, 3)
base$x_cst = 1

res = feols(c(y, x1) ~ 1 | fe1 | x2 ~ x3, base)

And indeed, the failure occurs at code like the following:

    XtX(0, 0) = value;

where XtX is a 0 x 0 matrix. It seems like this is the common thread between the various issues.

All that said, I'm not sure what we should do:

  1. Inform package authors that the issue most likely lies in their code, and ask them to review the issues individually?
  2. Include some code in Rcpp to help alleviate these issues, e.g. by giving an R warning or error on an out-of-bounds access to an Rcpp vector / matrix?
JanMarvin commented 4 weeks ago

Hi @eddelbuettel I'm poking at it in https://github.com/ycphs/openxlsx/pull/472 (but afaik nobody is actively working on openxlsx and I'm not its maintainer, I'm just following Rcpp 😄 )

eddelbuettel commented 4 weeks ago

@JanMarvin Whoops. My bad.

@kevinushey We do it some places (vector/Subsetter.h) and it might be hard to do it "universally" :-/

lrberge commented 3 weeks ago

On the fixest bug. Thanks @kevinushey for looking into it! It was definitely a bug. Now fixed.

AshesITR commented 2 weeks ago

reservr suffered from the same (UB working with length-0 input). Thank you for providing the hints here.

tdhock commented 1 week ago

Hi! I maintain 3 packages which are affected, all using Rcpp with the same pattern.

is that ok? If so I guess the fix will be provided in Rcpp? If not, I can update my packages. is there another/better way to get a pointer to the first element in the array?

I don't think my issues involve length-0 vectors since in that case my code stops with an error before trying to get the pointer, for example https://github.com/tdhock/FLOPART/blob/main/src/interface.cpp

  int data_count = data_vec.size();
  if(data_count < 2){
    Rcpp::stop("need at least two data points");
  }
Enchufa2 commented 1 week ago

@tdhock The bug is most probably in your package. Rcpp can just warn (for now) or error (in a future release) so that you are aware of these bugs before CRAN UBSAN/ASAN checks are reported.

Note that the UBSAN/ASAN errors I see on CRAN are not on line 55, but on line 57, where you do this:

&label_type_vec[0], &label_start_vec[0], &label_end_vec[0], label_count,

And I don't see checks for length-0 vectos for these ones.

eddelbuettel commented 1 week ago

@tdhock You could consider doing what I just did: reducing the test surface by not running the tests offending UBSAN.

tdhock commented 1 week ago

hi Inaki and Dirk thanks for the feedback, that is helpful. indeed I will update my packages to avoid doing &some_vec[0] on zero-length vectors, which actually can be provided by the user/tests sometimes. In the example above, my suggested fix is:

  const int *label_type_ptr=0, *label_start_ptr=0, *label_end_ptr=0;
  if(0 < label_count){
    label_type_ptr = &label_type_vec[0];
    label_start_ptr = &label_start_vec[0];
    label_end_ptr = &label_end_vec[0];
  }
  int status = FLOPART
    (&data_vec[0], &weight_vec[0],
     data_count, penalty,
     label_type_ptr, label_start_ptr, label_end_ptr, label_count,