vmware / splinterdb

High Performance Embedded Key-Value Store
https://splinterdb.org
Apache License 2.0

Debug assert fails in routing_filter_add(), "(index_no / addrs_per_page < pages_per_extent)": Causes large inserts workload to fail. #560

Open gapisback opened 1 year ago

gapisback commented 1 year ago

This specific problem was encountered while benchmarking a single client inserting 20M rows (in the PG-SplinterDB integration code base).


Update (agurajada; 4/2023): After deeper investigation of variations of this repro, the basic issue seems to be that there is some instability in the library on /main when using small key-value pairs. After some experimentation, it appears that we can stably insert 20+M rows, using single or multiple threads, when key = 4 bytes and value >= 20 bytes. For smaller k/v pair sizes, different forms of instability appear. We really need a comprehensive test suite that exercises these basic insert workloads across different numbers of rows inserted, numbers of clients, and varying combinations of k/v pair sizes.
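
For reference, here is a minimal standalone sketch of the kind of workload that hits this instability: sequential 4-byte keys (stored big-endian) with small 6-byte values, inserted in one tight loop. It is a sketch written against the public splinterdb API as I recall it (splinterdb_create() / splinterdb_insert() / slice_create() from splinterdb/splinterdb.h); the filename, cache/disk sizes, and loop details are illustrative placeholders, not the exact test settings.

#include <endian.h>   // htobe32() (Linux/glibc)
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#include "splinterdb/default_data_config.h"
#include "splinterdb/splinterdb.h"

int
main(void)
{
   data_config       data_cfg;
   splinterdb_config cfg;
   splinterdb       *kvsb = NULL;

   // Max key size of 4 bytes, matching the small-key repro.
   default_data_config_init(4, &data_cfg);

   memset(&cfg, 0, sizeof(cfg));
   cfg.filename   = "issue-560.db";          // placeholder device name
   cfg.cache_size = 1000ULL * 1024 * 1024;   // 1000 MiB cache (illustrative)
   cfg.disk_size  = 30000ULL * 1024 * 1024;  // ~30 GB device (illustrative)
   cfg.data_cfg   = &data_cfg;

   if (splinterdb_create(&cfg, &kvsb) != 0) {
      fprintf(stderr, "splinterdb_create() failed\n");
      return 1;
   }

   for (uint32_t i = 0; i < 20 * 1000 * 1000; i++) {
      uint32_t key_be = htobe32(i);      // sequential key, big-endian layout
      char     value[6];
      memset(value, 'v', sizeof(value)); // 6-byte value (random bytes in the real test)

      int rc = splinterdb_insert(kvsb,
                                 slice_create(sizeof(key_be), &key_be),
                                 slice_create(sizeof(value), value));
      if (rc != 0) {
         fprintf(stderr, "insert %u failed: rc=%d\n", i, rc);
         break;
      }
   }

   splinterdb_close(&kvsb);
   return 0;
}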


This has been reproduced using a standalone test script off of /main @ SHA 89f09b3.

Branch: agurajada/560-rf-add-bug (This branch has another commit, which fixes a separate issue partially tracked by the repro failures attempted for issue #458. You need that fix in order to get far enough to repro this failure.)

The failure is seen when running this test from that branch: large_inserts_stress_test --num-inserts 20000000 test_560_seq_htobe32_key_random_6byte_values_inserts

Running 1 CTests, suite name 'large_inserts_stress', test case 'test_560_seq_htobe32_key_random_6byte_values_inserts'.
TEST 1/1 large_inserts_stress:test_seq_htobe32_key_random_6byte_values_inserts OS-pid=426343, OS-tid=426343, Thread-ID=0, Assertion failed at src/routing_filter.c:586:routing_filter_add(): "(index_no / addrs_per_page < pages_per_extent)". index_no=16384, addrs_per_page=512, (index_no / addrs_per_page)=32, pages_per_extent=32
Aborted (core dumped)
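
To restate the arithmetic in the assert message: index_no 16384 maps to page index 16384 / 512 = 32, which is not strictly less than pages_per_extent (32). In other words, the very first index that lands one page past the end of the extent is the one that trips the check:

#include <assert.h>

int
main(void)
{
   // Values taken verbatim from the assert message above.
   unsigned index_no         = 16384;
   unsigned addrs_per_page   = 512;
   unsigned pages_per_extent = 32;

   // 16384 / 512 == 32, and (32 < 32) is false, so the debug assert fires.
   assert(index_no / addrs_per_page < pages_per_extent);
   return 0;
}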

With a release build, you will get a seg-fault a few lines later.

The test case test_560_seq_htobe32_key_random_6byte_values_inserts() sets up the exact conditions required to trigger this bug. Other test cases in this file exercise different combinations of sequential / random key inserts, and they all seem to succeed.

gapisback commented 1 year ago

I have narrowed down the repro as follows: I re-applied the relevant commits from agurajada/560-rf-add-bug onto commits off of /main.

  1. agurajada/rf-add-f0af570-baseline : Branched off of /main @ SHA f0af570 - Repro succeeds
  2. agurajada/rf-add-fc897e4c-fail : Branched off of /main @ SHA fc897e4c - Repro fails

Debugging ...

Update (4/1/2023):

I pushed branch agurajada/rf-add-fc897e4c-fail, which adds a couple more variations of the basic test case.

You have to run these with a large number of inserts, say 20M, e.g.:

$ ./build/release/bin/unit/large_inserts_stress_test --verbose-progress --num-inserts 20000000 test_560_seq_host_endian32_key_random_5byte_values_inserts
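
For context, the two variations differ only in how the sequential 32-bit key is laid out in the 4-byte key buffer. Here is a hypothetical illustration of the two layouts implied by the test names (this helper is not from the test file):

#include <endian.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: builds the 4-byte key for iteration 'i' in the two
// layouts implied by the test names. The htobe32 variant yields keys whose
// memcmp() order matches insert order; the host-endian variant does not on
// a little-endian machine.
static void
make_seq_keys(uint32_t i, char be_key[4], char host_key[4])
{
   uint32_t be_val   = htobe32(i); // "seq_htobe32_key" variant
   uint32_t host_val = i;          // "seq_host_endian32_key" variant

   memcpy(be_key, &be_val, sizeof(be_val));
   memcpy(host_key, &host_val, sizeof(host_val));
}
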
rosenhouse commented 1 year ago

Here's a really easy diff against main to reproduce this:

$ git diff
diff --git a/tests/unit/splinterdb_stress_test.c b/tests/unit/splinterdb_stress_test.c
index 87cf659..32d7470 100644
--- a/tests/unit/splinterdb_stress_test.c
+++ b/tests/unit/splinterdb_stress_test.c
@@ -18,8 +18,8 @@
 #include "../functional/random.h"
 #include "ctest.h" // This is required for all test-case files.

-#define TEST_KEY_SIZE   20
-#define TEST_VALUE_SIZE 116
+#define TEST_KEY_SIZE   4
+#define TEST_VALUE_SIZE 6

 // Function Prototypes
 static void *
@@ -73,7 +73,7 @@ CTEST2(splinterdb_stress, test_random_inserts_concurrent)
    ASSERT_TRUE(random_data >= 0);

    worker_config wcfg = {
-      .num_inserts = 1000 * 1000,
+      .num_inserts = 20 * 1000 * 1000,
       .random_data = random_data,
       .kvsb        = data->kvsb,
    };
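
With this diff applied, the reproducer is just the existing stress-test binary; the invocation shown further down in this thread, VERBOSE=6 ./build/release/bin/unit/splinterdb_stress_test test_random_inserts_concurrent, is how the runs reported below were driven.
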
rosenhouse commented 1 year ago

We did some more experimentation today, and it appears that tuples with a 4-byte key and a 20-byte value will pass, but smaller ones will not.

gapisback commented 1 year ago

Here is some more evidence of the behaviours we see for this existing stress test when run off of /main @ SHA 89f09b3:

For a single-thread execution:

#define TEST_KEY_SIZE   4
#define TEST_VALUE_SIZE 16

#define num_threads 1

   data->cfg = (splinterdb_config){.filename   = TEST_DB_NAME,
                                   .cache_size = 1000 * Mega,
                                   .disk_size  = 30000 * Mega,
                                   .data_cfg   = &data->default_data_config,
                                   .queue_scale_percent = 100};

The test progresses to about "Thread 140002820638272 has completed 10200000 inserts" and then hangs. The perf profile looks like this:

Samples: 1M of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 16012963972 lost: 0/0 drop: 0/0
Overhead  Shared Object         Symbol
  46.09%  libsplinterdb.so      [.] clockcache_try_get_read.constprop.0
  43.10%  libsplinterdb.so      [.] clockcache_get
   5.85%  ld-linux-x86-64.so.2  [.] __tls_get_addr
   1.99%  libsplinterdb.so      [.] 0x000000000000d150
   1.02%  libsplinterdb.so      [.] 0x000000000000cf60
   0.99%  libsplinterdb.so      [.] rc_allocator_get_config
   0.96%  libsplinterdb.so      [.] rc_allocator_get_config_virtual
   0.00%  [kernel]              [k] exit_to_user_mode_loop

The same test with the same config above will pass for this combo (again, single thread): TEST_KEY_SIZE 4; TEST_VALUE_SIZE 20.

--- Re-run with 4 threads, key=4 bytes, value=16 bytes:

This will run for a while, up to this point:

sdb-fdb-build:[43] $ VERBOSE=6 ./build/release/bin/unit/splinterdb_stress_test test_random_inserts_concurrent
Running 1 CTests, suite name 'splinterdb_stress', test case 'test_random_inserts_concurrent'.
TEST 1/1 splinterdb_stress:test_random_inserts_concurrent Waiting for 4 worker threads ...
  Thread[0] ID=140504240404032
  Thread[1] ID=140504232011328
  Thread[2] ID=140504223618624
  Thread[3] ID=140504215225920
Writing lots of data from thread 140504232011328
[...]
Thread 140504240404032 has completed 3300000 inserts
Thread 140504215225920 has completed 3300000 inserts
Thread 140504223618624 has completed 3600000 inserts

And then it hangs, with this perf profile:

Samples: 93K of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 20538871086 lost: 0/0 drop: 0/0
Overhead  Shared Object         Symbol
  73.76%  libsplinterdb.so      [.] memtable_maybe_rotate_and_get_insert_lock
  11.95%  libsplinterdb.so      [.] clockcache_try_get_read.constprop.0
  10.88%  libsplinterdb.so      [.] clockcache_get
   1.59%  ld-linux-x86-64.so.2  [.] __tls_get_addr
   0.53%  libsplinterdb.so      [.] 0x000000000000d150
   0.38%  libsplinterdb.so      [.] clockcache_unget
   0.29%  libsplinterdb.so      [.] rc_allocator_get_config_virtual
   0.27%  libsplinterdb.so      [.] 0x000000000000cf60
   0.26%  libsplinterdb.so      [.] rc_allocator_get_config