vmware / splinterdb

High Performance Embedded Key-Value Store
https://splinterdb.org
Apache License 2.0

Perf: splinter_test --parallel-perf seems to run endlessly, without making any progress. Ran overnight. #314

Open gapisback opened 2 years ago

gapisback commented 2 years ago

This is on /main. The following invocation of the multi-threaded performance test seems to run endlessly, and made no progress when left to run overnight. Nearly 20 cores are running at 100% CPU on this 32-core VM.

sdb-oss-test-vm:[18] $ ./bin/driver_test splinter_test --parallel-perf --num-bg-threads 20 --max-async-inflight 10 --num-pthreads 20 --db-capacity-gib 60 --db-location pll-perf.async.db
./bin/driver_test: splinterdb_build_version 22b25032
Dispatch test splinter_test
fingerprint_size: 27
Bumped up IO queue size to 416
Bumped up IO queue size to 416
Running splinter_test with 1 caches
splinter_test: splinter performance test started with 1                tables
inserting/lookups   0% complete for table 0

Some interesting stacks from a few PIDs are shown below. (This is an OPT build.)

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
clockcache_get_ref_internal (entry_number=<optimized out>, cc=<optimized out>) at src/clockcache.c:851
851    return cc->refcount[counter_no * cc->cfg->page_capacity + rc_number];
(gdb) bt
#0  clockcache_get_ref_internal (entry_number=<optimized out>, cc=<optimized out>) at src/clockcache.c:851
#1  clockcache_get_ref (counter_no=0, entry_number=<optimized out>, cc=0x560abe47e100) at src/clockcache.c:849
#2  clockcache_get_write (cc=0x560abe47e100, entry_number=<optimized out>) at src/clockcache.c:1139
#3  0x00007f4d3fd745f4 in cache_lock (page=0x7f4d361fcc88, cc=0x560abe47e100) at src/cache.h:305
#4  memtable_uncontended_get_claim_lock_lookup_lock (ctxt=0x7f4d00bd8040) at src/memtable.c:186
#5  0x00007f4d3fd64c96 in trunk_memtable_incorporate (tid=<optimized out>, generation=1, spl=0x7f4d3408c040)
    at src/trunk.c:3109
#6  trunk_memtable_flush_internal (generation=1, spl=0x7f4d3408c040) at src/trunk.c:3253
#7  trunk_memtable_flush_internal_virtual (arg=<optimized out>, scratch=<optimized out>) at src/trunk.c:3264
#8  0x00007f4d3fd751a7 in task_worker_thread (arg=arg@entry=0x7f4d3f8e5280) at src/task.c:293
#9  0x00007f4d3fd738d4 in task_invoke_with_hooks (func_and_args=0x560abe47a400) at src/task.c:183
#10 0x00007f4d3fb61b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#11 0x00007f4d3fbf3b80 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) detach
Detaching from program: /home/agurajada/Code/splinterdb/bin/driver_test, process 8468

A few seconds later it changed to:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
__futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7f4d3f8e65b8) at ./nptl/futex-internal.c:57
57  ./nptl/futex-internal.c: No such file or directory.
(gdb) bt
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0,
    futex_word=0x7f4d3f8e65b8) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=0, abstime=0x0, clockid=0, expected=0,
    futex_word=0x7f4d3f8e65b8) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x7f4d3f8e65b8,
    expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0)
    at ./nptl/futex-internal.c:139
#3  0x00007f4d3fb60ac1 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f4d3f8e6568,
    cond=0x7f4d3f8e6590) at ./nptl/pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x7f4d3f8e6590, mutex=0x7f4d3f8e6568) at ./nptl/pthread_cond_wait.c:627
#5  0x00007f4d3fd5b014 in platform_condvar_wait (cv=<optimized out>) at src/platform_linux/platform.c:278
#6  0x00007f4d3fd75215 in task_worker_thread (arg=arg@entry=0x7f4d3f8e6540) at src/task.c:266
#7  0x00007f4d3fd738d4 in task_invoke_with_hooks (func_and_args=0x560abe47a5c0) at src/task.c:183
#8  0x00007f4d3fb61b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#9  0x00007f4d3fbf3b80 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Stack of another process:

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
platform_sleep (ns=<optimized out>) at src/platform_linux/platform_inline.h:91
91        for (uint64 i = 0; i < ns / 5 + 1; i++) {
(gdb) bt
#0  platform_sleep (ns=<optimized out>) at src/platform_linux/platform_inline.h:91
#1  clockcache_get_read (entry_number=131, cc=0x560abe47e100) at src/clockcache.c:1048
#2  clockcache_get_internal (page=<synthetic pointer>, type=PAGE_TYPE_LOCK_NO_DATA, blocking=1, addr=1839104,
    cc=0x560abe47e100) at src/clockcache.c:2139
#3  clockcache_get (cc=0x560abe47e100, addr=1839104, blocking=1, type=PAGE_TYPE_LOCK_NO_DATA)
    at src/clockcache.c:2279
#4  0x00007f4d3fd6bd47 in trunk_lookup_async (spl=spl@entry=0x7f4d3408c040,
    key=key@entry=0x560abe49faf0 "\026\064\003n\245꛲", data=data@entry=0x560abe49fb08,
    ctxt=ctxt@entry=0x560abe49fa00) at src/trunk.c:6243
#5  0x0000560abced0ef9 in async_ctxt_process_one.constprop.0 (spl=spl@entry=0x7f4d3408c040,
    async_lookup=async_lookup@entry=0x560abe49f640, ctxt=ctxt@entry=0x560abe49fa00,
    latency_max=latency_max@entry=0x560abe48ba58, process_arg=process_arg@entry=0x7f4c5e7fbbe0,
    process_cb=0x560abcecb9c0 <verify_tuple_callback>) at tests/functional/test_async.c:126
#6  0x0000560abcec1489 in do_operation (params=params@entry=0x560abe48b958, base=0x7f4c4c0ad440,
    num_ops=num_ops@entry=7, op_offset=<optimized out>, done=done@entry=0x7f4c5e7fbd8f "", is_insert=0)
    at tests/functional/splinter_test.c:575
#7  0x0000560abcec1d59 in test_trunk_insert_lookup_thread (arg=arg@entry=0x560abe48b958)
    at tests/functional/splinter_test.c:645
#8  0x00007f4d3fd738d4 in task_invoke_with_hooks (func_and_args=0x560abe4a06c0) at src/task.c:183
#9  0x00007f4d3fb61b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#10 0x00007f4d3fbf3b80 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
gapisback commented 2 years ago

Update: Curiously, this does not seem to be an out-of-disk-space error. See this:

sdb-oss-test-vm:[12] $ df -kh .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       128G   44G   85G  34% /

sdb-oss-test-vm:[13] $ du -sh *.db
128K    pll-perf.async.db

And the device file specified for the test, --db-location pll-perf.async.db, is only 128K, so almost no allocation has happened to grow this file.
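To confirm the run is not allocating at all (rather than just allocating slowly), one could sample the file size a few times during the run. A hypothetical monitoring loop; the path matches the run above and the interval/count are illustrative:

```shell
#!/bin/sh
# Sample the on-disk size of the test's database file a few times.
DB=pll-perf.async.db
for i in 1 2 3; do
    du -sh "$DB" 2>/dev/null || echo "no such file: $DB"
    sleep 1
done
```

If the reported size stays at 128K across samples while the CPUs stay pegged, that supports the no-progress diagnosis rather than slow I/O.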

ajhconway commented 2 years ago

I suspect this is some sort of deadlock. We haven't exercised the async lookup code that much recently, and I bet there's something there.

ajhconway commented 2 years ago

I think bugs like this one, involving async lookups and background threads, are non-critical. We should document these features as experimental until we can work out the bugs.

carlosgarciaalvarado commented 2 years ago

I will add it to our limitations doc.

gapisback commented 1 year ago

Update: Attempted to repro this on /main @ SHA 4b6e0b1. Some of the command-line args used in the initial test case are no longer supported: --num-bg-threads 20 --max-async-inflight 10 --num-pthreads 20. These seem to have been reworked or removed in past change sets.

There still seems to be a problem. I tried re-running the following with an increased --cache-capacity-gib, but all 10 cores on this VM are pegged at 100%. Here are the runs I tried:

  545  ./build/release/bin/driver_test splinter_test --parallel-perf --num-normal-bg-threads 10 --num-memtable-bg-threads 10  --max-async-inflight 10 --num-pthreads 20 --db-capacity-gib 60 --db-location pll-perf.async.db
  546  ./build/release/bin/driver_test splinter_test --parallel-perf --num-normal-bg-threads 10 --num-memtable-bg-threads 10  --num-pthreads 20 --db-capacity-gib 60 --db-location pll-perf.async.db
  547  ./build/release/bin/driver_test splinter_test --parallel-perf --num-normal-bg-threads 10 --num-memtable-bg-threads 10  --db-capacity-gib 60 --db-location pll-perf.async.db
  548  ./build/release/bin/driver_test splinter_test --parallel-perf --num-normal-bg-threads 10 --num-memtable-bg-threads 10  --db-capacity-gib 60 --db-location pll-perf.async.db --cache-capacity-gib 2
  549  ./build/release/bin/driver_test splinter_test --parallel-perf --num-normal-bg-threads 10 --num-memtable-bg-threads 10  --db-capacity-gib 60 --db-location pll-perf.async.db --cache-capacity-gib 4

The issue may be one of incorrect command-line args configuring threads and so on. Still, it's odd that the cores are pegged and the test seems to make no progress. (I only tried for a few minutes.)

Here's a perf top profile from when this run was happening:

Samples: 5M of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 1845764265 lost: 0/0 drop: 0/0
Overhead  Shared Object         Symbol
  90.38%  libsplinterdb.so      [.] clockcache_get
   9.08%  libsplinterdb.so      [.] clockcache_get_write
   0.32%  libsplinterdb.so      [.] clockcache_try_get_read.constprop.0
   0.15%  ld-linux-x86-64.so.2  [.] __tls_get_addr
   0.04%  libsplinterdb.so      [.] 0x000000000000d140
   0.01%  [kernel]              [k] zap_pte_range
   0.00%  [kernel]              [k] free_unref_page_list
   0.00%  [kernel]              [k] __lock_text_start