hypertidy / decido

Constrained polygon triangulation by 'ear cutting'
https://hypertidy.github.io/decido
Other
15 stars 3 forks source link

CRAN detected memory bug #12

Closed mdsumner closed 6 years ago

mdsumner commented 6 years ago

"showing memory-access errors under Fedora 28: see the 'Additional issues' linked from the packages's CRAN results page.

Please correct ASAP and before Jul 30 to safely retain the package on CRAN"

mdsumner commented 6 years ago

@mpadge I have no idea how to figure out what this was, since the package is now archived and the CRAN results are gone. I can't reproduce a problem on rhub fedora-gcc or fedora-clang (though I do have vignette issues with path drawing in the png device there, which I don't think is related).

Could you review the C++ here please to see if any obvious probs?

mpadge commented 6 years ago

Shall do, next week

mdsumner commented 6 years ago

thank you!

mpadge commented 6 years ago

Can confirm there are errors. You can reproduce by making a test.R file with just devtools::test(), and then:

R -d "valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all" -f test.R
mpadge commented 6 years ago

More informative:

R -d "valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes -v" -f test.R

There are definitely malloc errors, which appear to arise from some kind of attempted UTF-8 conversion of a NULL string. Now to find the origin of these ...

mpadge commented 6 years ago

Hmmm... trickier than i first thought. The error arises in test-basic-index.R, and is something to do with the way the Rcpp vectors are passed through to mapbox::earcut.

mdsumner commented 6 years ago

It occurred to me that transformr also uses earcut, so might have some clues (I haven't investigated):

https://github.com/thomasp85/transformr/blob/master/src/triangulate.cpp#L2

virgesmith commented 6 years ago

Any links to CRAN or valgrind output I could look at?

mpadge commented 6 years ago

CRAN output is unfortunately gone, but my guess is that the valgrind issue is this:

==6316== 192 bytes in 1 blocks are still reachable in loss record 146 of 1,781      
==6316==    at 0x4C2CEDF: malloc (vg_replace_malloc.c:299)                          
==6316==    by 0x4FB0514: Rf_allocVector3 (in /usr/lib/R/lib/libR.so)               
==6316==    by 0x180823CE: _ZN4Rcpp6VectorILi13ENS_15PreserveStorageEEC2ImEET_PNS_6traits9enable_ifIXsr6traits13is_arithmeticIS4_EE5valueEvE4typeE (in /data/mega/code/forks/decido/src/decido.so)
==6316==    by 0x18081E66: earcut_cpp(Rcpp::Vector<14, Rcpp::PreserveStorage>, Rcpp::Vector<14, Rcpp::PreserveStorage>, Rcpp::Vector<13, Rcpp::PreserveStorage>) (in /data/mega/code/forks/decido/src/decido.so)
==6316==    by 0x1807B2E9: _decido_earcut_cpp (in /data/mega/code/forks/decido/src/decido.so)
==6316==    by 0x4F2DF05: ??? (in /usr/lib/R/lib/libR.so)                           
==6316==    by 0x4F75488: Rf_eval (in /usr/lib/R/lib/libR.so)                       
==6316==    by 0x4F782E6: ??? (in /usr/lib/R/lib/libR.so)                           
==6316==    by 0x4F75213: Rf_eval (in /usr/lib/R/lib/libR.so)                       
==6316==    by 0x4F769FF: ??? (in /usr/lib/R/lib/libR.so)                           
==6316==    by 0x4F6C472: ??? (in /usr/lib/R/lib/libR.so)                           
==6316==    by 0x4F74DBF: Rf_eval (in /usr/lib/R/lib/libR.so)                       

where Rcpp:PreserveStorage is the key. The cause seems to be these lines from decido.cpp:

Rcpp::IntegerVector earcut_cpp(Rcpp::NumericVector x, Rcpp::NumericVector y,
                     Rcpp::IntegerVector holes) {
  int hole_index = 0;
  int hi = holes[hole_index];
  for (int ipoint = 0; ipoint < x.length(); ipoint++) {
    if (ipoint == hi)
    ...

where the final equality comparison causes it. Note that setting some const int xlen = x.length() makes no difference. This seems to suggest to me that Rcpp is doing some kind of check on the size of holes, and that the value of hi can potentially invalidate that?

To make matters more complicated, note that there appear to be a heap of valgrind memcheck errors prompted by the bundled mapbox::earcut code, most of which reflect "Conditional jump depends on uninitialised value(s)", but several of which appear more serious. Converting all Rcpp vectors to strict C++ vectors before passing anything to mapbox::earcut does not, however, remove the error, yet in this case I do not see any way that Rcpp can complain about storage not being preserved. The mapbox::earcut code is executed in this case using a type <N> = uint32_t; but the Rcpp vectors are strictly independent of that (noting that the cause is not the final lines of decido.cpp where the (C++) integers vector is copied to the Rcpp output vector).

On the other hand, maybe all of the valgrind errors are actually innocuous, and that code is actually okay? Help hugely appreciated @virgesmith!

virgesmith commented 6 years ago

@mpadge I reckon youre on the right track. valgrind is pretty clear:

==7505== Conditional jump or move depends on uninitialised value(s)
==7505==    at 0x16D0AAF4: earcut_cpp(Rcpp::Vector<14, Rcpp::PreserveStorage>, Rcpp::Vector<14, Rcpp::PreserveStorage>, Rcpp::Vector<13, Rcpp::PreserveStorage>, Rcpp::Vector<13, Rcpp::PreserveStorage>) (decido.cpp:29)
==7505==    by 0x16D06946: _decido_earcut_cpp (RcppExports.cpp:18)

So it looks like holes might not be initialised properly. The function signature is:

IntegerVector earcut_cpp(NumericVector x, NumericVector y,
                     IntegerVector holes = -1,
                     IntegerVector numholes = 0

I have absolutely no idea what IntegerVector = -1 creates. IMHO IntegerVector should not support that kind of assignment. If you look in earcut.default it looks like it's alwayssometimes? initialised with a numeric value, which i'm guessing means: allocate a vector of this length without initialising anything in it. Then the memory is read as per valgrind above.

I'm not entirely sure whats happening in the R code but if my suspicions are correct it would be interesting to see what happens if the default -1 isnt overridden - at best it's going to allocate 4G of memory...

Suggest you don't have defaulted args at the cpp level - you can do the defaulting higher up the stack.

HTH

mpadge commented 6 years ago

Thanks @virgesmith - I removed those default args in a PR not yet merged, but that doesn't solve the issue. (The defaults are technically part of Rcpp sugar stuff, but I'm not sure exactly how they work; single values in R are just C arrays of length one, and I think Rcpp sugar handles this by allocating the C array and passing the then initialised pointer through to C++, but am not certain about that.) Whatever, it is not this potentially dodgy initialisation that is the problem here, because the code I pasted above removes that yet still leads to a malloc.

virgesmith commented 6 years ago

I think there's a flaw or two in the logic in earcut_cpp - see imminent PR that add some bounds checks and shows invalid accesses are happening (and causes test failures). On another note I tried to enable the CRAN checks (ubsan and asan) locally to no avail, its should be as simple as adding

PKG_CXXFLAGS = -fsanitize=undefined,address
PKG_LDFLAGS = -fsanitize=undefined,address

to Makevars. But sadly it isn't = see https://stackoverflow.com/questions/33178878/reproducing-cran-gcc-ubsan-test-results-at-home-on-ubuntu

mdsumner commented 6 years ago

I didn't mean for this to close.

Also, I'm very sorry but I just realized the check results are still very much available:

https://cran.r-project.org/web/checks/check_results_decido.html

(I was looking at my checks for other reasons ...)

mpadge commented 6 years ago

Hah, cool, it is exactly the error I thought it was! That really does mean that the ultimate cause is the malloc stuff from mapbox::earcut, but that still leaves the interesting question of why Rcpp would complain about PreserveStorage? I've tried everything - cloning to a const, then copying to both C++ vector and C array objects, but Rcpp for some reason still keeps the reference to the pointer that gets passed through to mapbox::earcut. That is ultimately the bit I do not get. @virgesmith?

virgesmith commented 6 years ago

I think you might be misled by the output:

#0 0x7efdd8d1a0f3 in earcut_cpp(Rcpp::Vector<14, Rcpp::PreserveStorage>, Rcpp::Vector<14, Rcpp::PreserveStorage>, Rcpp::Vector<13, Rcpp::PreserveStorage>, Rcpp::Vector<13, Rcpp::PreserveStorage>) /data/gannet/ripley/R/packages/tests-clang-SAN/decido/src/decido.cpp:29:21

Its a stack trace so its just saying the function its in, the important bit is at the end: decido.cpp:29, which was, at the time,

   if (ipoint == holes[hole_index]) {

i.e hole_index is outside the size of holes, as per my PR (which only traps, not fixes the problem)

mdsumner commented 6 years ago

I think I got it, still doing more checks. My logic about the bounds is definitely it, but I had to get back down to it to see it, and the bounds test simply avoids the inner poly push back (if I'm right).

It's enough to see the valgrind output with r-hub, but have to navigate into the artifacts to the "decido.Rcheck/tests/testthat.Rout" file.

rhub::check(platform = "fedora-gcc-devel", valgrind = TRUE)

All up I've learnt a lot, so thanks!!

mdsumner commented 6 years ago

Here is the r-hub output, no more reference to earcut_cpp or _decido_earcut_cpp

https://artifacts.r-hub.io/decido_0.2.0.tar.gz-2ff08a1653934b3dafe22d0c00ba89a7/decido.Rcheck/tests/testthat.Rout

virgesmith commented 6 years ago

looks good to me

mdsumner commented 6 years ago

Fixed and aimed for 0.2.0 release.