microsoft / ALEX

A library for building an in-memory, Adaptive Learned indEX
MIT License
662 stars 114 forks source link

Keys out of order after insert/delete/lookup operations #3

Closed shachaf closed 4 years ago

shachaf commented 4 years ago

This is the original problem I ran into before #1, but it exists even after applying #2 (in its current state). It only happens when applying lookups. Test program:

$ cat alex-test.cc
#include <stdio.h>

#include "ALEX/src/core/alex_map.h"

int main(int argc, char **argv) {
  alex::AlexMap<uint64_t, uint64_t> map;

  FILE *f = fopen("alex-test-input-lookups.txt", "r");
  while (1) {
    char buf[100];
    char *line = fgets(buf, sizeof buf, f);
    if (!line) break;

    uint64_t key;
    if (line[0] == 'a') {
      sscanf(line, "a %lu", &key);
      printf("assign %lu\n", key);
      map[key] = 0;
    } else if (line[0] == 'd') {
      sscanf(line, "d %lu", &key);
      auto it = map.find(key);
      if (!it.is_end()) {
        printf("delete %lu\n", key);
        map.erase(it);
      } else {
        printf("!found %lu\n", key);
      }
    } else if (line[0] == 'l') {
      sscanf(line, "l %lu", &key);
      auto it = map.find(key);
      printf("lookup %lu (%s)\n", key, it.is_end() ? "not found" : "found");
    }
  }

  uint64_t last_key = 0;
  for (auto it = map.begin(); !it.is_end(); ++it) {
    uint64_t key = (*it).first;
    if (key < last_key) {
      printf("out of order!\n");
      printf("last %020lu\n", last_key);
      printf("key  %020lu\n", key);
    }
    last_key = key;
  }

  return 0;
}
$ clang++ -g alex-test.cc -o build/alex-test -march=haswell && ./build/alex-test | tail -n3
out of order!
last 14654630671376060525
key  14623995723005132922

The test file is at http://slbkbs.org/tmp/alex-test-input-lookups.txt

jialinding commented 4 years ago

Thanks for bringing both of these issues to our attention! I made another change to pull request https://github.com/microsoft/ALEX/pull/2. If you use the latest code there, your test program should work on your input.

shachaf commented 4 years ago

Thanks for the update. That seems to fix the issue above, but I'm now getting SIGSEGV with this test file: https://slbkbs.org/tmp/alex-test-input-3.txt

Stack trace:

#0  0x000000000040aa3c in alex::Alex<unsigned long, unsigned long, alex::AlexCompare, std::allocator<std::pair<unsigned long, unsigned long> >, false>::expand_root (
    this=0x7fffffffdaf0, expand_left=false) at ./ALEX/src/core/alex.h:1522
#1  0x00000000004084e9 in alex::Alex<unsigned long, unsigned long, alex::AlexCompare, std::allocator<std::pair<unsigned long, unsigned long> >, false>::insert (this=0x7fffffffdaf0,
    key=@0x7fffffffda60: 18445859036059093609, payload=@0x7fffffffd938: 0) at ./ALEX/src/core/alex.h:1113
#2  0x000000000040346d in alex::AlexMap<unsigned long, unsigned long, alex::AlexCompare, std::allocator<std::pair<unsigned long, unsigned long> > >::operator[] (
    this=0x7fffffffdaf0, key=@0x7fffffffda60: 18445859036059093609) at ./ALEX/src/core/alex_map.h:128
#3  0x0000000000401643 in main (argc=2, argv=0x7fffffffdcc8) at alex-test.cc:18
jialinding commented 4 years ago

Thanks for the follow up! Could you try the latest code?

shachaf commented 4 years ago

It doesn't seem to crash now, so this looks better. I've been running https://github.com/petersn/btreetests to test it, and now the test goes up to >15GB memory use and starts using swap before I kill it. That's probably a separate bug -- maybe it's worth you testing it directly.

jialinding commented 4 years ago

Thanks for the reference! I ran ALEX on that testing framework and I'm also seeing unexpected behavior, so we will definitely follow up on this issue.