vmware / splinterdb

High Performance Embedded Key-Value Store
https://splinterdb.org
Apache License 2.0

Debug assert fails in routing_filter_add(), "(index_no / addrs_per_page < pages_per_extent)": Causes large inserts workload to fail. #560

Open gapisback opened 1 year ago

gapisback commented 1 year ago

This specific problem was encountered while benchmarking a single client inserting 20M rows (in the PG-SplinterDB integration code base).


Update (agurajada; 4/2023): After deeper investigation of variations of this repro, the basic issue seems to be that there is some instability in the library on /main when using small key-value pairs. After some experimentation, it appears that we can stably insert 20+M rows, using single or multiple threads, when key = 4 bytes and value >= 20 bytes. For smaller k/v pair sizes, different forms of instability appear. We really need a comprehensive test suite that exercises these basic insert workloads across different numbers of rows inserted, numbers of clients, and varying combinations of k/v pair sizes.
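
For reference, here is a minimal standalone sketch of the kind of workload that hits this instability: sequential 4-byte keys (stored big-endian) with small 6-byte values, inserted in one tight loop. It is a sketch written against the public splinterdb API as I recall it (splinterdb_create() / splinterdb_insert() / slice_create() from splinterdb/splinterdb.h); the filename, cache/disk sizes, and loop details are illustrative placeholders, not the exact test settings.

#include <endian.h>   // htobe32() (Linux/glibc)
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#include "splinterdb/default_data_config.h"
#include "splinterdb/splinterdb.h"

int
main(void)
{
   data_config       data_cfg;
   splinterdb_config cfg;
   splinterdb       *kvsb = NULL;

   // Max key size of 4 bytes, matching the small-key repro.
   default_data_config_init(4, &data_cfg);

   memset(&cfg, 0, sizeof(cfg));
   cfg.filename   = "issue-560.db";          // placeholder device name
   cfg.cache_size = 1000ULL * 1024 * 1024;   // 1000 MiB cache (illustrative)
   cfg.disk_size  = 30000ULL * 1024 * 1024;  // ~30 GB device (illustrative)
   cfg.data_cfg   = &data_cfg;

   if (splinterdb_create(&cfg, &kvsb) != 0) {
      fprintf(stderr, "splinterdb_create() failed\n");
      return 1;
   }

   for (uint32_t i = 0; i < 20 * 1000 * 1000; i++) {
      uint32_t key_be = htobe32(i);      // sequential key, big-endian layout
      char     value[6];
      memset(value, 'v', sizeof(value)); // 6-byte value (random bytes in the real test)

      int rc = splinterdb_insert(kvsb,
                                 slice_create(sizeof(key_be), &key_be),
                                 slice_create(sizeof(value), value));
      if (rc != 0) {
         fprintf(stderr, "insert %u failed: rc=%d\n", i, rc);
         break;
      }
   }

   splinterdb_close(&kvsb);
   return 0;
}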


This has been reproduced using a standalone test script off of /main @ SHA 89f09b3.

Branch: agurajada/560-rf-add-bug (This branch has another commit, which fixes a separate issue partially tracked by the repro failures attempted for issue #458. You need that fix in order to get far enough to repro this failure.)

The failure is seen when running this test from that branch: large_inserts_stress_test --num-inserts 20000000 test_560_seq_htobe32_key_random_6byte_values_inserts

Running 1 CTests, suite name 'large_inserts_stress', test case 'test_560_seq_htobe32_key_random_6byte_values_inserts'.
TEST 1/1 large_inserts_stress:test_seq_htobe32_key_random_6byte_values_inserts OS-pid=426343, OS-tid=426343, Thread-ID=0, Assertion failed at src/routing_filter.c:586:routing_filter_add(): "(index_no / addrs_per_page < pages_per_extent)". index_no=16384, addrs_per_page=512, (index_no / addrs_per_page)=32, pages_per_extent=32
Aborted (core dumped)
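
To restate the arithmetic in the assert message: index_no 16384 maps to page index 16384 / 512 = 32, which is not strictly less than pages_per_extent (32). In other words, the very first index that lands one page past the end of the extent is the one that trips the check:

#include <assert.h>

int
main(void)
{
   // Values taken verbatim from the assert message above.
   unsigned index_no         = 16384;
   unsigned addrs_per_page   = 512;
   unsigned pages_per_extent = 32;

   // 16384 / 512 == 32, and (32 < 32) is false, so the debug assert fires.
   assert(index_no / addrs_per_page < pages_per_extent);
   return 0;
}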

With a release build, you will get a seg-fault a few lines later.

The test case test_560_seq_htobe32_key_random_6byte_values_inserts() sets up the exact conditions required to trigger this bug. Other test cases in this file exercise different combinations of sequential / random key inserts, and they all seem to succeed.

gapisback commented 1 year ago

I have narrowed down the repro as follows: I re-applied the relevant commits from agurajada/560-rf-add-bug onto commits off of /main.

  1. agurajada/rf-add-f0af570-baseline : Branched off of /main @ SHA f0af570 - Repro succeeds
  2. agurajada/rf-add-fc897e4c-fail : Branched off of /main @ SHA fc897e4c - Repro fails

Debugging ...

Update (4/1/2023):

I pushed branch agurajada/rf-add-fc897e4c-fail, which adds a couple more variations of the basic test case.

You have to run these with a large number of inserts, say 20M, e.g.:

$ ./build/release/bin/unit/large_inserts_stress_test --verbose-progress --num-inserts 20000000 test_560_seq_host_endian32_key_random_5byte_values_inserts
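
For context, the two variations differ only in how the sequential 32-bit key is laid out in the 4-byte key buffer. Here is a hypothetical illustration of the two layouts implied by the test names (this helper is not from the test file):

#include <endian.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: builds the 4-byte key for iteration 'i' in the two
// layouts implied by the test names. The htobe32 variant yields keys whose
// memcmp() order matches insert order; the host-endian variant does not on
// a little-endian machine.
static void
make_seq_keys(uint32_t i, char be_key[4], char host_key[4])
{
   uint32_t be_val   = htobe32(i); // "seq_htobe32_key" variant
   uint32_t host_val = i;          // "seq_host_endian32_key" variant

   memcpy(be_key, &be_val, sizeof(be_val));
   memcpy(host_key, &host_val, sizeof(host_val));
}
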
rosenhouse commented 1 year ago

Here's a really easy diff against main to reproduce this:

$ git diff
diff --git a/tests/unit/splinterdb_stress_test.c b/tests/unit/splinterdb_stress_test.c
index 87cf659..32d7470 100644
--- a/tests/unit/splinterdb_stress_test.c
+++ b/tests/unit/splinterdb_stress_test.c
@@ -18,8 +18,8 @@
 #include "../functional/random.h"
 #include "ctest.h" // This is required for all test-case files.

-#define TEST_KEY_SIZE   20
-#define TEST_VALUE_SIZE 116
+#define TEST_KEY_SIZE   4
+#define TEST_VALUE_SIZE 6

 // Function Prototypes
 static void *
@@ -73,7 +73,7 @@ CTEST2(splinterdb_stress, test_random_inserts_concurrent)
    ASSERT_TRUE(random_data >= 0);

    worker_config wcfg = {
-      .num_inserts = 1000 * 1000,
+      .num_inserts = 20 * 1000 * 1000,
       .random_data = random_data,
       .kvsb        = data->kvsb,
    };
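
With this diff applied, the reproducer is just the existing stress-test binary; the invocation shown further down in this thread, VERBOSE=6 ./build/release/bin/unit/splinterdb_stress_test test_random_inserts_concurrent, is how the runs reported below were driven.
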
rosenhouse commented 1 year ago

We did some more experimentation today, and it appears that tuples with a 4-byte key and a 20-byte value will pass, but smaller ones will not.

gapisback commented 1 year ago

Here is some more evidence of the behaviours we see for this existing stress test when run off of /main @ SHA 89f09b3:

For a single-thread execution:

#define TEST_KEY_SIZE   4
#define TEST_VALUE_SIZE 16

#define num_threads 1

   data->cfg = (splinterdb_config){.filename   = TEST_DB_NAME,
                                   .cache_size = 1000 * Mega,
                                   .disk_size  = 30000 * Mega,
                                   .data_cfg   = &data->default_data_config,
                                   .queue_scale_percent = 100};

The test progresses to about "Thread 140002820638272 has completed 10200000 inserts" and then hangs. The perf profile looks like this:

Samples: 1M of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 16012963972 lost: 0/0 drop: 0/0
Overhead  Shared Object         Symbol
  46.09%  libsplinterdb.so      [.] clockcache_try_get_read.constprop.0
  43.10%  libsplinterdb.so      [.] clockcache_get
   5.85%  ld-linux-x86-64.so.2  [.] __tls_get_addr
   1.99%  libsplinterdb.so      [.] 0x000000000000d150
   1.02%  libsplinterdb.so      [.] 0x000000000000cf60
   0.99%  libsplinterdb.so      [.] rc_allocator_get_config
   0.96%  libsplinterdb.so      [.] rc_allocator_get_config_virtual
   0.00%  [kernel]              [k] exit_to_user_mode_loop

The same test with the same config above will pass for this combo (again, single thread): TEST_KEY_SIZE 4; TEST_VALUE_SIZE 20.

--- Re-run with 4 threads, key=4 bytes, value=16 bytes:

This will run for a while, up to this point:

sdb-fdb-build:[43] $ VERBOSE=6 ./build/release/bin/unit/splinterdb_stress_test test_random_inserts_concurrent
Running 1 CTests, suite name 'splinterdb_stress', test case 'test_random_inserts_concurrent'.
TEST 1/1 splinterdb_stress:test_random_inserts_concurrent Waiting for 4 worker threads ...
  Thread[0] ID=140504240404032
  Thread[1] ID=140504232011328
  Thread[2] ID=140504223618624
  Thread[3] ID=140504215225920
Writing lots of data from thread 140504232011328
[...]
Thread 140504240404032 has completed 3300000 inserts
Thread 140504215225920 has completed 3300000 inserts
Thread 140504223618624 has completed 3600000 inserts

And then it hangs, with this perf profile:

Samples: 93K of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 20538871086 lost: 0/0 drop: 0/0
Overhead  Shared Object         Symbol
  73.76%  libsplinterdb.so      [.] memtable_maybe_rotate_and_get_insert_lock
  11.95%  libsplinterdb.so      [.] clockcache_try_get_read.constprop.0
  10.88%  libsplinterdb.so      [.] clockcache_get
   1.59%  ld-linux-x86-64.so.2  [.] __tls_get_addr
   0.53%  libsplinterdb.so      [.] 0x000000000000d150
   0.38%  libsplinterdb.so      [.] clockcache_unget
   0.29%  libsplinterdb.so      [.] rc_allocator_get_config_virtual
   0.27%  libsplinterdb.so      [.] 0x000000000000cf60
   0.26%  libsplinterdb.so      [.] rc_allocator_get_config