r-spatial / s2

Spherical Geometry Operators Using the S2 Geometry Library
https://r-spatial.github.io/s2/
71 stars 9 forks source link

CRAN packages with 'gcc11' 'additional issues' #79

Closed edzer closed 3 years ago

edzer commented 4 years ago

email from CRAN (BDR):

GCC trunk aka 11.0 is still under development but (because it may need to be used for macOS on ARM) we done a CRAN check run using it. For packages

BayesXsrc Boom CDVine DEploid DataGraph EditImputeCont OBsMD Rttf2pt1 VineCopula bliss clusternor excursions gdalcubes knor np pbdSLAP rgl ribiosUtils rlecuyer rmetasim s2 scrm tuneR

(and some others reported separately) it newly reports issues that look worth fixing and are now reported as an 'Additional issue' on the CRAN check pages for these packages. Please review these and fix or let us know if you believe they are gcc bugs.

I don't know how difficult it will be to reproduce these (although in most cases the issue is clear from the report). We compiled gcc from source but some Linux distros have builds available and someone suggested that docker images are available.

[For np I also noticed an unchecked allocation by malloc, which can fail and return NULL.]

edzer commented 4 years ago

Dear maintainer,

Please see the problems shown on https://cran.r-project.org/web/checks/check_results_s2.html.

Please correct before 2020-10-14 to safely retain your package on CRAN.

Best, -k

paleolimbot commented 4 years ago

Brutal. I will look into in the next week...I think I'll need the full stack trace for the gcc11 error, which probably originates somewhere else. I think we had to deal with the anonymous type union thing before with gcc, but will have to revisit.

edzer commented 4 years ago

That would be great! @appelmar has the same problem in gdalcubes.

appelmar commented 4 years ago

Less brutal though, I've only received the first mail without deadline.

edzer commented 4 years ago

In addition to the gcc11 warnings, we have the clang warnings:

    Found the following significant warnings:
     ../inst/include/s2/encoded_s2point_vector.h:100:5: warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]
     ../inst/include/s2/encoded_s2point_vector.h:103:5: warning: anonymous types declared in an anonymous union are an extension [-Wnested-anon-types]
edzer commented 4 years ago

There is some discussion among the package authors bitten by the gcc11 array warning that this may be a false positive. Duncan Murdoch "is pretty sure it is a compiler bug".

As Brian reminded however, the clang warning is real.

paleolimbot commented 4 years ago

Yes...this is next on my agenda! Likely tomorrow unless it turns into a rabbit hole.

paleolimbot commented 4 years ago

I think I've fixed the anonymous union warning...I couldn't figure out in a few hours how to get gcc11 and R on a Docker image, but if either of you have better luck I'd be happy to look into this from an s2-specific perspective. I think it's unlikely that google's template library btree implementation has an actual out-of-bounds error but it's certainly possible that some optimization led to something that generated a warning.

jefferis commented 3 years ago

Did any of you happen to get a docker image going with gcc11? I have a similar issue due to an upstream library (GitHub.com/jefferis/dracor) but I couldn't find an easy way to install gcc for testing on Mac. After fixing the initial error reported by BDR, CRAN have just sent me another and I feel like this could go on for a while.

paleolimbot commented 3 years ago

I didn't, and the time I spent on this was digging around in the S2 source code and trying to figure out how to download the GCC11 sources. I think you have to download the gcc11 sources and build it from source, probably in one of the rocker images, and then you can run mkdir ~/.R && echo 'CXX11=g++11' > ~/.R/Makevars' (you'll have to double check that, but that's the idea). I think this is a development snapshot (nothing is actually labelled gcc11 in downloads) but I don't know which one BDR used so you might not even be able to replicate the error. It might be worth asking in an email for the installation steps he used.

edzer commented 3 years ago

From BDR:


This week's gcc trunk update removed a number of these, including those that looked to be swapped indices.  Issues remain for

BayesXsrc Boom DEploid DataGraph EditImputeCont Matrix
RSiena Rttf2pt1 clusternor gdalcubes knor mixor np pbdSLAP rmetasim s2
scrm tuneR

(For Rttf2pt1, not those shown above.) 
paleolimbot commented 3 years ago

IT hasn't approved my request to install WSL or Docker yet, so it may have to wait until this weekend!

appelmar commented 3 years ago

I've made a first try with rocker/r-devel (see Dockerfile at https://gist.github.com/appelmar/d743eff6e4d9de696939d8aa99b0d5c4). However, building GCC takes a few hours and installing Rcpp in a container gives an error:

** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘Rcpp’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/00LOCK-Rcpp/00new/Rcpp/libs/Rcpp.so':
  /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/local/lib/R/site-library/00LOCK-Rcpp/00new/Rcpp/libs/Rcpp.so)

I'll try to find out more, maybe R itself has to be rebuilt with GCC 11?

jefferis commented 3 years ago

@appelmar Do you need to set LD_LIBRARY_PATH for an additional location with the gcc install?

appelmar commented 3 years ago

Thanks, I have fixed the Rcpp issue and now managed to get R CMD check working with gcc11 for the gdalcubes package. Interestingly though, I could not reproduce the compiler warnings yet. Looking forward to trying out the R-hub container.

jefferis commented 3 years ago

Thanks, I have fixed the Rcpp issue and now managed to get R CMD check working with gcc11 for the gdalcubes package.

Great. Out of interest what was the required fix @appelmar?

Interestingly though, I could not reproduce the compiler warnings yet.

Do you think this might be related to the changes in gcc trunk noted above by @edzer?

appelmar commented 3 years ago

It was indeed trying to use the old stdlib and setting LD_LIBRARY_PATH solved this.

I have now been able to reproduce the CRAN issues for gdalcubes (after setting CXXFLAGS and CCFLAGS in ~/.R/Makevars accordingly). Interestingly, the warnings only show up with compiler optimizations -O2 turned on. I will try this later with s2, too.

appelmar commented 3 years ago

I could reproduce the s2 warnings, too (no matter whether compiler optimization is used or not). The updated Dockerfile is here.

paleolimbot commented 3 years ago

That's brilliant...thank you!! I'll dig in as soon as I can to get these fixed in s2.

paleolimbot commented 3 years ago

Working on this now! Can reproduce with @appelmar's image (can't thank you enough!)

mkdir chk
cd chk
git clone https://github.com/r-spatial/s2.git s2
Rscript -e "install.packages('remotes'); remotes::install_deps('s2', dependencies = TRUE)"
apt-get update && apt-get instal libssl-dev
R CMD build s2
R CMD check s2_1.0.3.9000.tar.gz
Found the following significant warnings:
  ../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [33, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<int> >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [32, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<int> >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [33, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<S2ClosestEdgeQueryBase<S2MinDistance>::Result> >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [32, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<S2ClosestEdgeQueryBase<S2MinDistance>::Result> >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [33, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<const int, int> > >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [32, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<const int, int> > >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [31, 285] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<const Vector3<double>, int> > >::M [30]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [30, 285] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<const Vector3<double>, int> > >::M [30]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [33, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<S2Loop* const, std::pair<int, bool> > > >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [32, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<S2Loop* const, std::pair<int, bool> > > >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [33, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<S2ClosestEdgeQueryBase<S2MaxDistance>::Result> >::M [32]’ [-Warray-bounds]
  ../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [32, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<S2ClosestEdgeQueryBase<S2MaxDistance>::Result> >::M [32]’ [-Warray-bounds]
See ‘/chk/s2.Rcheck/00install.out’ for details.
paleolimbot commented 3 years ago

More info about at least one usage:

../inst/include/s2/util/gtl/btree.h:604:22: warning: array subscript [33, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<S2Loop* const, std::pair<int, bool> > > >::M [32]’ [-Warray-bounds]
  604 |     mutable_child(i) = c;
      |     ~~~~~~~~~~~~~~~~~^~~
In file included from /usr/local/include/c++/11.0.0/x86_64-pc-linux-gnu/bits/c++allocator.h:33,
                 from /usr/local/include/c++/11.0.0/bits/allocator.h:46,
                 from /usr/local/include/c++/11.0.0/memory:64,
                 from ../inst/include/s2/s2builderutil_s2polygon_layer.h:35,
                 from s2/s2builderutil_s2polygon_layer.cc:18:
/usr/local/include/c++/11.0.0/ext/new_allocator.h:121:48: note: referencing an object of size between 48 and 256 allocated by ‘void* operator new(std::size_t)’
  121 |         return static_cast<_Tp*>(::operator new(__n * sizeof(_Tp)));
      |                                  ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
In file included from ../inst/include/s2/util/gtl/btree_map.h:34,
                 from ../inst/include/s2/s2builderutil_s2polygon_layer.h:39,
                 from s2/s2builderutil_s2polygon_layer.cc:18:
../inst/include/s2/util/gtl/btree.h:597:58: warning: array subscript [32, 287] is outside array bounds of ‘absl::container_internal::Allocate<8, std::allocator<std::pair<S2Loop* const, std::pair<int, bool> > > >::M [32]’ [-Warray-bounds]
  597 |   btree_node *child(int i) const { return GetField<3>()[i]; }
      |                                                          ^
In file included from /usr/local/include/c++/11.0.0/x86_64-pc-linux-gnu/bits/c++allocator.h:33,
                 from /usr/local/include/c++/11.0.0/bits/allocator.h:46,
                 from /usr/local/include/c++/11.0.0/memory:64,
                 from ../inst/include/s2/s2builderutil_s2polygon_layer.h:35,
                 from s2/s2builderutil_s2polygon_layer.cc:18:
/usr/local/include/c++/11.0.0/ext/new_allocator.h:121:48: note: referencing an object of size between 48 and 256 allocated by ‘void* operator new(std::size_t)’
  121 |         return static_cast<_Tp*>(::operator new(__n * sizeof(_Tp)));
      |                                  ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
paleolimbot commented 3 years ago

Can verify that this code

https://github.com/r-spatial/s2/blob/b0f8c2b34601af62cdd9080ea7282f36531d0b8a/inst/include/s2/util/gtl/btree.h#L597-L606

hasn't changed upstream:

https://github.com/abseil/abseil-cpp/blob/1e3d25b2657228bd691ee938cfd37d487f48054b/absl/container/internal/btree.h#L662-L670

paleolimbot commented 3 years ago

(Sorry for the livestream...have to keep this up-to-date somewhere in case my 3-month old wakes up...)

I wonder if the assertion is somehow not being registered by gcc11...they've considered at least two types of invalid values here:

https://github.com/r-spatial/s2/blob/b0f8c2b34601af62cdd9080ea7282f36531d0b8a/inst/include/s2/util/gtl/btree.h#L539-L549

paleolimbot commented 3 years ago

Trying to make sense of the numbers 33-287 here, and I think the object is a btree_node, which is an alias for Layout<btree_node*, S2Loop*, std::pair<int, bool>, btree_node*). A Layout is basically a struct whose size needs to be calculated at compile time (similar to flexible length member, although these can have multiple flexible-length members and are probably faster). My sense is that either there's a bug in the Layout size calculation OR it's complicated enough that gcc11 is calculating the size wrong.

The minimum size of a btree_node<> here (from btree_map<S2Loop*, std::pair<int, bool>>) is probably 8 + 32 + (node size) * 8 + 8. With zero nodes this would mean that a sizeof(btree_node<>) is 48. With a target node size of 256 (the default), a brand new root node should be <288 bytes in size. The compiler thinks it should be <256 bytes in size.

paleolimbot commented 3 years ago

I maybe calculating the minimum size wrong...the relevant code is here:

https://github.com/r-spatial/s2/blob/b0f8c2b34601af62cdd9080ea7282f36531d0b8a/inst/include/s2/util/gtl/btree.h#L471-L492

With Layout::AllocSize() here:

https://github.com/r-spatial/s2/blob/b0f8c2b34601af62cdd9080ea7282f36531d0b8a/inst/include/s2/third_party/absl/container/internal/layout.h#L595-L599

paleolimbot commented 3 years ago

gcc may be right here...I'm making some assumptions about how may bytes are used to pad a std::pair<int, bool> (I'm assuming it gets padded to 8 bytes total), but if that's true, the numbers check out:

size_with_n_values <- function(n) {
  (1  * 8) + (4 * 8) + (n * (4 + 4))
}

node_target_values <- function(begin, end, target_node_size = 256) {
  if (begin == end) {
    begin
  } else {
    if ((size_with_n_values((begin + end) %/% 2) + 1) > target_node_size) {
      node_target_values(begin, (begin + end) / 2)
    } else {
      node_target_values((begin + end) %/% 2 + 1, end)
    }
  }
}

node_target_values(0, 256)
#> [1] 27
size_with_n_values(27)
#> [1] 256

Created on 2020-10-24 by the reprex package (v0.3.0)

paleolimbot commented 3 years ago

I've gotten some of the details wrong above: field_type is probably a uint8 and slot_type is maybe the std::pair<S2Loop* const, std::pair<int, bool> >. The minimum size of a btree_node is probably 32 due to padding, although from the warnings it looks like there's some variability there.

It's possible that the compiler isn't seeing the checks for if(!leaf()), since a leaf node has zero children and therefore any call to child() would return a pointer to memory that might not have been allocated. This is checked by GetField<> using assert():

https://github.com/r-spatial/s2/blob/b0f8c2b34601af62cdd9080ea7282f36531d0b8a/inst/include/s2/util/gtl/btree.h#L539-L549

The other possibility is that there's an off-by-one error somewhere in btree.h. The case here involves some arm-waving...the actual size of a default node is sometimes greater than 256, and could get padded to some value greater than that.

@edzer sorry for the stream-of-consciousness...any thoughts on what we should do? An obvious option is to replace any use of btree_map<> with std::unordered_map<>...apparently there's performance drawbacks to this (from comments in the btree source code). I don't see anything in the btree code that leads me to believe there would be an actual out-of-bounds array index, but I also don't have the knowledge to track down a bug if there is one. My guess is that the Layout class is doing something that the compiler doesn't understand (but I also don't understand it).

edzer commented 3 years ago

Maybe we can share these thoughts with the s2geometry authors, upstream?

paleolimbot commented 3 years ago

I'll post an issue...my reprex is here: https://gist.github.com/paleolimbot/51db5e587b5f3e57a637b6a24655fe87#gistcomment-3503457

paleolimbot commented 3 years ago

The PRs for this are still open in s2geometry but it's hard for us to test here because it's a lot of work to update the underlying version of s2. If CRAN will let us, the next release of s2 would be a good time to fix this.

paleolimbot commented 3 years ago

Just putting the link to the GCC11 bug report here:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98266

Hopefully this disappears from the CRAN checks soon!

paleolimbot commented 3 years ago

Apparently this is still getting reported in the checks (and we've been emailed about it, so it def has to be fixed before the next release)

paleolimbot commented 3 years ago

Just confirmed that this is still a problem using this Docker image: https://gist.github.com/paleolimbot/51db5e587b5f3e57a637b6a24655fe87

edzer commented 3 years ago

Debian testing has moved to gcc 11, so I've now been informed that:

Dear maintainer,

Please see the problems shown on
<https://cran.r-project.org/web/checks/check_results_s2.html>.

Please correct before 2021-09-12 to safely retain your package on CRAN.
paleolimbot commented 3 years ago

I've kept waiting for any action on s2geometry in terms of another release but I don't see much on the go (and the PR that fixed the problem was never merged and has failed tests, so we can't just pull that commit). I was hoping for an official release because that would be a clean and error-free way to fix things (we updated the internal s2 version).

edzer commented 3 years ago

Does the s2geometry PR you mention fail on debian testing?

paleolimbot commented 3 years ago

It's a lot of work to update the s2 version so I haven't tried. I only test for crossed wires in the s2 R package, not correctness for corner cases...I don't like the idea of using an untested version of s2 just to silence a CRAN warning (and potentially introduce errors).

edzer commented 3 years ago

My impression was that we'd need the same s2geometry, but with an updated absl component.

paleolimbot commented 3 years ago

Yes, but the current code was never organized to allow just absl to be updated (the PR in s2geometry both updates internal absl and makes it possible to use an updated copy of absl in the future).

edzer commented 3 years ago

I guess we'll have to go with that PR, then, or do you see alternatives?

paleolimbot commented 3 years ago

I'll ping the s2geometry repo again to see if there's anything we can do to get that PR merged and give it a few days before abandoning hope.

edzer commented 3 years ago

Good idea; do mention gcc11 is now in debian testing.

paleolimbot commented 3 years ago

I'll give it until Tuesday for any response and will then try a few options for getting this to work. The lack of maintenance upstream is a little unnerving given the complexity of the engine!