AMDComputeLibraries / morton_filter

A compressed, sparse cuckoo filter (see https://www.vldb.org/pvldb/vol11/p1041-breslow.pdf)
MIT License
85 stars 17 forks source link

Errors with clang compiler (Mac OS / LLVM) and Morton3_8 / load factor >= 0.5 #2

Closed thomasmueller closed 3 years ago

thomasmueller commented 5 years ago

When using Morton3_8 with random long keys as below, and using a load factor of >= 0.5, often I get false negatives (meaning: for an entry previously added, the method likely_contains returns false). My usage:

filter = new Morton3_8((size_t) (size / 0.50) + 64);
// in a loop, do this
uint64_t key = ...
filter->insert(key);
// in a loop, do this
uint64_t key = ...
filter->likely_contains(key);

the same key is only ever added once, and keys are random, generated using mix64, as follows:

uint64_t x = index++;
x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
x = x ^ (x >> 31);
return x;

Could you tell me what could be wrong?

abreslow commented 5 years ago

Hi Thomas,

Thank-you for being an early user of the code base. I deeply appreciate your time and effort.

Could you send me the full test code so that I can reproduce the error? What is the relative incidence of false positives (e.g., 1/1000000)? Does the problem only crop up after a specific load factor? Which version of the code are you using? How large are the filters that you are testing with?

I've been meaning to write a detailed series of unit tests. This might be a good opportunity to do that.

How urgently do you need this resolved? When would you like a response/answer by? I have an upcoming paper deadline on August 23 so if I could push this out a couple of days, that would be helpful to me.

I'm also curious why you are using the item-at-a-time interfaces. Does your use case preclude batching insertions and lookups? The batched interfaces provide much higher throughput for large filters and should be used when possible.

Thanks, Alex

thomasmueller commented 5 years ago

Here is a better test case, it is just changing benchmark.cc

int main(int argc, char** argv){
  size_t size = 10000000;
  CompressedCuckoo::Morton3_8* filter = new CompressedCuckoo::Morton3_8((size_t) (size / 0.50) + 64);
  for(size_t i = 0; i < size; i++) {
    uint64_t x = i;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
    x = x ^ (x >> 31);
    if (i % 100000 == 0) {
      printf("insert i=%lld key=%lld\n", (uint64_t) i, x);
    }
    filter->insert(x);
  }
  for(size_t i = 0; i < size; i++) {
    uint64_t x = i;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
    x = x ^ (x >> 31);
    if (i % 100000 == 0) {
      printf("lookup i=%lld key=%lld\n", (uint64_t) i, x);
    }
    if (!filter->likely_contains(x)) {
      printf("not found: %lld\n", x);
      return -1;
    }
  }
  printf("OK\n");
  // benchmark();
}

./benchmark

insert i=0 key=0
insert i=100000 key=-8804475191098573882
insert i=200000 key=-6885798598542860951
insert i=300000 key=5826581874738629586
insert i=400000 key=4675146885213764307
...
lookup i=0 key=0
lookup i=100000 key=-8804475191098573882
lookup i=200000 key=-6885798598542860951
lookup i=300000 key=5826581874738629586
lookup i=400000 key=4675146885213764307
...
not found: 1163028906654455684

I found that I get "bus error" if the load factor is 0.6 or higher, and with load factor of 0.5 I get often one entry not found (one false negative). False positives are as expected (I think), but I expect no false negatives as per the API contract.

How urgently do you need this resolved?

It's a good question... it would be good to have a workaround within one week. Maybe the batch insert doesn't have this problem?

I'm also curious why you are using the item-at-a-time interfaces.

For inserts, I can often use batching. For queries, my use case doesn't easily allow batching. It would require a large rewrite to use batching in some cases, and it's not possible to use it in all cases.

thomasmueller commented 5 years ago

Unfortunately, bulk insert (insert_many) seems to have the same problem. Here a test case:

int main(int argc, char** argv){
  size_t size = 10000000;
  ::std::vector<::std::uint64_t> keys(size);
  uint64_t index = 0;
  auto genrand = [&index]() {
    uint64_t x = index++;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
    x = x ^ (x >> 31);
    return x;
  };
  ::std::generate(keys.begin(), keys.end(), ::std::ref(genrand));
  ::std::vector<bool> status(size);
  CompressedCuckoo::Morton3_8* filter = new CompressedCuckoo::Morton3_8((size_t) (size / 0.50) + 64);
  bool result = filter->insert_many(keys, status, size);
  printf("result = %d\n", result);
  for(size_t i = 0; i < size; i++) {
    if (!status[i]) {
      printf("status[%ld] = false\n", i);
    }
    if (i == 2273501 || keys[i] == 1163028906654455684LL) {
      printf("keys[%ld] = %lld\n", i, keys[i]);
    }
  }
  for(size_t i = 0; i < size; i++) {
    uint64_t x = i;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
    x = x ^ (x >> 31);
    if (i % 100000 == 0) {
      printf("lookup i=%lld key=%lld\n", (uint64_t) i, x);
    }
    if (!filter->likely_contains(x)) {
      printf("not found: i=%ld key=%lld\n", i, x);
      return -1;
    }
  }
  printf("OK\n");
  // benchmark();
}

Output for me:

./benchmark
result = 1
keys[2273501] = 1163028906654455684
lookup i=0 key=0
lookup i=100000 key=-8804475191098573882
...
lookup i=2200000 key=-8886166559463644051
not found: i=2273501 key=1163028906654455684    
abreslow commented 5 years ago

Hi Thomas, Thanks for these error reproducers. I will try to take a look at this by Wednesday to see if I can reproduce the error and hopefully have a fix by Friday. -Alex

abreslow commented 5 years ago

Thomas,

I'm having trouble reproducing the error on my Threadripper-based system with a clone of the master branch (latest commit 235531fb746c16078e167c28898140333cc4681b). The cause of this error on your system could be a number of things. If you are using clang++, it could be that the code base is using undefined behavior that only manifests with that compiler. It could also be that you are using an old version of the code that defines Morton::Morton3_8 differently than present. I will test if the code breaks if the Block Fullness Array is enabled (in the VLDB Journal extension) or if resizing is enabled (also described in the VLDB Journal extension).

I get the following output (... for ellipsis) when using the vanilla code from the master branch: ... lookup i=9700000 key=-304284872283253451 lookup i=9800000 key=-39302948963652970 lookup i=9900000 key=2387930219243299873 OK

Here's my OS information: Linux Jurassic 4.13.0-45-generic #50~16.04.1-Ubuntu My compiler information:

g++ (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010

I add the following line to the Makefile:

$(CXX) $(INCLUDE) $(FLAGS) test.cc -o test

Here's the source for test.cc based on what you provided (placed in the morton_filter/benchmarking directory). Note that I upped 0.5 to 0.97 and it still runs fine.:

#include "morton_filter.h"
#include "./cuckoofilter/benchmarks/random.h"

int main(int argc, char** argv){
  size_t size = 10000000;
  ::std::vector<::std::uint64_t> keys(size);
  uint64_t index = 0;
  auto genrand = [&index]() {
    uint64_t x = index++;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
    x = x ^ (x >> 31);
    return x;
  };
  ::std::generate(keys.begin(), keys.end(), ::std::ref(genrand));
  ::std::vector<bool> status(size);
  CompressedCuckoo::Morton3_8* filter = new CompressedCuckoo::Morton3_8((size_t) (size / 0.97) + 64);
  bool result = filter->insert_many(keys, status, size);
  printf("result = %d\n", result);
  for(size_t i = 0; i < size; i++) {
    if (!status[i]) {
      printf("status[%ld] = false\n", i);
    }
    if (i == 2273501 || keys[i] == 1163028906654455684LL) {
      printf("keys[%ld] = %lld\n", i, keys[i]);
    }
  }
  for(size_t i = 0; i < size; i++) {
    uint64_t x = i;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9L;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebL;
    x = x ^ (x >> 31);
    if (i % 100000 == 0) {
      printf("lookup i=%lld key=%lld\n", (uint64_t) i, x);
    }
    if (!filter->likely_contains(x)) {
      printf("not found: i=%ld key=%lld\n", i, x);
      return -1;
    }
  }
  printf("OK\n");
  // benchmark();
}
abreslow commented 5 years ago

I checked and enabling the Block Full Array and enabling resizing using the test code that you provided do not cause the error to manifest when the code is compiled with g++.

Note that I am not saying that the bug does not exist: it may very well exist. I am just having a hard time reproducing it as this code shows (expected value is 100):

{ for i in `seq 1 100`; do ./test; done } | awk '/OK/{count+=1}; END{print count}'

The output

100
thomasmueller commented 5 years ago

That's great news! So far I used LLVM (Mac OS). With g++ it works, well most of the time at least, with a load factor of 0.95. Sometimes I get this, when using a set size of 1 million: "corrupted size vs. prev_size". See also https://stackoverflow.com/questions/49628615/understanding-corrupted-size-vs-prev-size-glibc-error/49660707

I didn't try with older version of g++ / GCC.

So I will update the issue title to say LLVM.

thomasmueller commented 5 years ago

It is no longer blocking me. You may want to keep this issue open if you are interested in solving the problem with clang / LLVM, or maybe improve documentation. It is perfectly fine for me if you close it.

abreslow commented 5 years ago

I will keep it open. If I have more time and get access to a Mac OS system, I will try to port the main branch to support compilation and correct outputs with clang++ on Mac OS. If I recall, I use some functions from C's math library that support the constexpr qualifier when using g++ but not when using clang++. I don't know if newer versions of clang++ correct for this. When I ported to running a Clang, I saw a dip in performance. The g++-5.4 compiler gave better results, so I stuck with that. With regard to Clang, I don't recall if there were any bugs that I observed; this was a couple of months ago. However, given the ubiquity of Clang and Mac OS, I would like to support these platforms.

asl commented 4 years ago

Indeed, the code contains UB.

Running both clang and gcc with -O1 -fsanitize=undefined yields:

compressed_cuckoo_filter.h:733:35: runtime error: shift exponent 128 is too large for 128-bit type '__uint128_t' (aka 'unsigned __int128')

This is for the following line of code:

const __uint128_t mask = (one << (_fullness_counter_width * counter_index)) - one;

Changing it to:

    const unsigned shift = _fullness_counter_width * counter_index;
    const __uint128_t mask = shift == 128 ? __uint128_t(-1) : (one << shift) - one;

fixes the clang issue for me.

abreslow commented 4 years ago

@asl I would love to push your change, but I no longer have write permission to this repo. I will see if I can get one of my former colleagues to make the update. Thanks for your contribution.

jlgreathouse commented 4 years ago

For reference, you could submit a PR about this and ping me, I can accept the change.

abreslow commented 4 years ago

Sure, I will do that next week after I return to Barcelona. I am currently away from my computer on travel.


From: Joseph Greathouse notifications@github.com Sent: Friday, August 28, 2020 7:12:58 PM To: AMDComputeLibraries/morton_filter morton_filter@noreply.github.com Cc: Alex Breslow abreslow@cs.ucsd.edu; Assign assign@noreply.github.com Subject: Re: [AMDComputeLibraries/morton_filter] Errors with clang compiler (Mac OS / LLVM) and Morton3_8 / load factor >= 0.5 (#2)

For reference, you could submit a PR about this and ping me, I can accept the change.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https://github.com/AMDComputeLibraries/morton_filter/issues/2*issuecomment-682944629__;Iw!!Mih3wA!UYWkSjvYi0igVpGflFnJ95bvcpFI8LxrGbEGhYrEJ1JMT0e1VFnph8S2dzHFmn8AxM8$, or unsubscribehttps://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AAQNMOPOSBUQQFEVPVXPID3SC7QRVANCNFSM4IMMFTHA__;!!Mih3wA!UYWkSjvYi0igVpGflFnJ95bvcpFI8LxrGbEGhYrEJ1JMT0e1VFnph8S2dzHFJV2E0P4$.

abreslow commented 3 years ago

@jlgreathouse please see my pull request and its associated comments. Thanks. :-)