norlab-ulaval / libnabo

A fast K Nearest Neighbor library for low-dimensional spaces
http://norlab-ulaval.github.io/libnabo/
BSD 3-Clause "New" or "Revised" License
440 stars 144 forks source link

undefined behaviour in kdtree_cpu.cpp caught by -fsanitize=undefined #16

Closed jefferis closed 10 years ago

jefferis commented 10 years ago

I have recently wrapped libnabo in a package for R (here, source here). Although the package was accepted for distribution by CRAN, it was quickly pointed out to me that there is at least one place in libnabo that appears to depend on undefined behaviour by the compiler. Specifically at line 245 in kdtree_cpu.cpp:

const uint64_t maxNodeCount((1 << (32-dimBitCount)) - 1);

when the dimension of coordinates is 1 and dimBitCount=1 that will imply:

1 << 31

which is not defined (but will evaluate to 2147483647 in most cases). Obviously dim=0 will also give an error but that should never happen. I think that the fix is as simple as:

const uint64_t maxNodeCount((1U << (32-dimBitCount)) - 1);

but I am no expert, so I offer that for discussion rather than immediately as a pull request.

P.S. I had some trouble debugging this within the R package so I ended up making a small test program:

#include <iostream>

// Typically:
// clang++ -fsanitize=undefined -o ubsan_test ubsan_test.cpp
// Currently on mac with Apple clang:
// clang++ -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error -o ubsan_test ubsan_test.cpp

uint32_t getStorageBitCount(uint32_t v) {
  for (uint32_t  i = 0; i < 64; ++i)
  {
    if (v == 0)
    return i;
    v >>= 1;
  }
  return 64;
}

int main() {
  for (int dim = 1L; dim <10L; dim++) {
    std::cout << "dim: " << dim << std::endl;
    const uint32_t dimBitCount =  getStorageBitCount(dim);
    std::cout << "dimBitCount: " << dimBitCount << std::endl;
    uint64_t maxNodeCount = (1 << (32-dimBitCount)) - 1;
    std::cout << "maxNodeCount: " << maxNodeCount << std::endl << std::endl;
  }
  return 0;
}
simonlynen commented 10 years ago

@jefferis Thank you for chasing this down and the proposal for a fix. Yes using a constant of correct width seems the correct fix for this issue. While using the unsigned specifier might work, constants of type uint64_t have the following format:

static constexpr uint64_t some_constant = 0x0LL;
jefferis commented 10 years ago

Is it possible to do this without relying on constexpr and therefore on C++11? At least for me there is only one other place in the libnabo code that I have noticed depending on C++11 (an enum terminated by a comma).

jefferis commented 10 years ago

Incidentally should that be?

static constexpr uint64_t some_constant = 0x0LLU;

or even ULL!

simonlynen commented 10 years ago

@jefferis Sorry. Yes absolutely no need for C++11 constexpr, this was just automatic code writing.

Would you mind making a pull-request for that so it gets linked to this discussion?

Thanks again!