Closed MichaelChirico closed 4 years ago
Created attachment 2142: Proposed patch
I can reproduce the issue (on Linux). The problem occurs on R-devel (r71129) and various release versions of R (tested on 3.0.2, 3.2.5 and 3.3.1 patched r71063).
I think the cause of the segfault is an underflow of the unsigned variable 'n' in fgrepraw1(), defined in src/main/grep.c. This can be avoided by checking if 'pat' is longer than 'text'. The attached patch returns the "no match" solution in such cases.
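The wraparound described above can be illustrated with a minimal sketch. This is not the actual code of fgrepraw1() or the attached patch: the names `last_start_unchecked`, `find_fixed_guarded` and `NO_MATCH` are hypothetical, and the sketch only assumes that the last start position is computed by subtracting the pattern length from the text length in an unsigned type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel for "no match"; not taken from grep.c. */
#define NO_MATCH ((size_t) -1)

/* Unsigned subtraction without a length check: when pat_len > text_len
 * the result wraps around to a very large value instead of going negative. */
size_t last_start_unchecked(size_t text_len, size_t pat_len)
{
    return text_len - pat_len;
}

/* Guarded search (illustrative only): returns the 0-based offset of the
 * first occurrence of pat in text, or NO_MATCH. */
size_t find_fixed_guarded(const char *text, size_t text_len,
                          const char *pat, size_t pat_len)
{
    assert(text != NULL && pat != NULL);

    /* The guard: a pattern longer than the text can never match, and
     * returning early keeps the subtraction below from wrapping. */
    if (pat_len > text_len)
        return NO_MATCH;

    size_t n = text_len - pat_len;      /* last valid start position */
    for (size_t i = 0; i <= n; i++) {
        if (memcmp(text + i, pat, pat_len) == 0)
            return i;
    }
    return NO_MATCH;
}
```

Without the guard, a loop bounded by the wrapped value would keep reading past the end of 'text', which is consistent with the heap-buffer-overflow that AddressSanitizer reports in this thread.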
Another problem fixed in the patch is the failure to match in some cases when a match is expected. This happens if 'pat' has more than 3 bytes (the "default" branch of fgrepraw1) and there is nothing in 'text' following the match.
Example with R-devel r71129:
grepRaw("abcd", "abcd", fixed = TRUE)
integer(0)
grepRaw("abcd", "abcde", fixed = TRUE)
[1] 1
Example with the patch applied:
grepRaw("abcd", "abcd", fixed = TRUE)
[1] 1
grepRaw("abcd", "abcde", fixed = TRUE)
[1] 1
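One plausible mechanism for the missed match shown above is an off-by-one in the loop bound, so that the final start position is never tried. The sketch below is only an assumption about the shape of the bug, not the actual loop in fgrepraw1(): the function `scan_fixed`, its `inclusive_bound` flag and `NO_MATCH` are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel for "no match"; not taken from grep.c. */
#define NO_MATCH ((size_t) -1)

/* Illustrative fixed-string scan. With inclusive_bound == 0 the last valid
 * start position is skipped (the suspected bug); with a nonzero value it is
 * included. Returns a 0-based offset, or NO_MATCH. */
size_t scan_fixed(const char *text, size_t text_len,
                  const char *pat, size_t pat_len, int inclusive_bound)
{
    assert(text != NULL && pat != NULL);

    if (pat_len > text_len)
        return NO_MATCH;

    size_t last = text_len - pat_len;               /* last valid start */
    size_t end = inclusive_bound ? last + 1 : last; /* one past final start tried */

    for (size_t i = 0; i < end; i++) {
        if (memcmp(text + i, pat, pat_len) == 0)
            return i;
    }
    return NO_MATCH;
}
```

With the exclusive bound, searching for "abcd" in "abcd" tries no start position at all and reports no match, while "abcd" in "abcde" still matches at offset 0. With the inclusive bound both match at offset 0, which corresponds to the 1-based `[1] 1` that grepRaw() prints.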
Confirmed, including your patch.
Thank you very much ... fixed in R-devel and soon 'R 3.3.2 patched'
Executing the following causes a segfault relatively consistently for me:
It seems like the segfault occurs only when the 'pattern' is longer than the input string and fixed = TRUE has been specified.

Running a recent-ish R-devel built with clang-3.9 + sanitizers:
=================================================================
==18741==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61d00685f430 at pc 0x00010d7c020c bp 0x7fff541c9bd0 sp 0x7fff541c9bc8
READ of size 1 at 0x61d00685f430 thread T0
    #0 0x10d7c020b in fgrepraw1 (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x42c20b)
LLVMSymbolizer: error reading file: No object file for requested architecture
    #10 0x7fff99faa5ac (/usr/lib/system/libdyld.dylib+0x35ac)
R Under development (unstable) (2016-08-12 r71086)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6