clockworklabs / SpacetimeDB

Multiplayer at the speed of light
https://spacetimedb.com

core: Integrate new commitlog + durability #926

Closed kim closed 1 month ago

kim commented 2 months ago

Description of Changes

This patch attempts to integrate the new commitlog with the minimum changes.

Most of the diff comes from deleting the legacy log and from adjusting tests, because tests that use a durable database require a tokio runtime.

The "meat" of the patch is the RelationalDB constructors, RelationalDB::commit_tx, and the replay logic in locking_tx_datastore.

While DataKey is gone, there is still some redundant data being passed around, which will be addressed in the follow-up patch.

API and ABI breaking changes

Not compatible with logs written using the previous implementation.

Expected complexity level and risk

5

kazimuth commented 2 months ago

benchmarks please

github-actions[bot] commented 2 months ago
Criterion benchmark results

# Criterion benchmark report

**YOU SHOULD PROBABLY IGNORE THESE RESULTS.**

Criterion is a wall time based benchmarking system that is extremely noisy when run on CI. We collect these results for longitudinal analysis, but they are not reliable for comparing individual PRs.

Go look at the callgrind report instead.

## empty

| db | on disk | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|
| sqlite | 💿 | - | 428.0±2.88ns | - | - |
| sqlite | 🧠 | - | 422.6±2.30ns | - | - |
| stdb_raw | 💿 | 818.6±3.81ns | 711.8±0.52ns | - | - |
| stdb_raw | 🧠 | 704.0±1.67ns | 682.7±0.95ns | - | - |

## insert_1

| db | on disk | schema | indices | preload | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|---|---|---|

## insert_bulk

| db | on disk | schema | indices | preload | count | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|---|---|---|---|
| sqlite | 💿 | u32_u64_str | btree_each_column | 2048 | 256 | - | 514.1±0.59µs | - | 1945 tx/sec |
| sqlite | 💿 | u32_u64_str | unique_0 | 2048 | 256 | - | 138.5±1.13µs | - | 7.0 Ktx/sec |
| sqlite | 💿 | u32_u64_u64 | btree_each_column | 2048 | 256 | - | 431.1±30.20µs | - | 2.3 Ktx/sec |
| sqlite | 💿 | u32_u64_u64 | unique_0 | 2048 | 256 | - | 125.3±0.73µs | - | 7.8 Ktx/sec |
| sqlite | 🧠 | u32_u64_str | btree_each_column | 2048 | 256 | - | 446.5±0.68µs | - | 2.2 Ktx/sec |
| sqlite | 🧠 | u32_u64_str | unique_0 | 2048 | 256 | - | 122.5±0.18µs | - | 8.0 Ktx/sec |
| sqlite | 🧠 | u32_u64_u64 | btree_each_column | 2048 | 256 | - | 367.6±1.29µs | - | 2.7 Ktx/sec |
| sqlite | 🧠 | u32_u64_u64 | unique_0 | 2048 | 256 | - | 106.0±1.09µs | - | 9.2 Ktx/sec |
| stdb_raw | 💿 | u32_u64_str | btree_each_column | 2048 | 256 | 540.2±1.46µs | 721.6±1.06µs | 1851 tx/sec | 1385 tx/sec |
| stdb_raw | 💿 | u32_u64_str | unique_0 | 2048 | 256 | 441.4±0.64µs | 622.3±0.91µs | 2.2 Ktx/sec | 1607 tx/sec |
| stdb_raw | 💿 | u32_u64_u64 | btree_each_column | 2048 | 256 | 470.1±2.57µs | 431.7±0.35µs | 2.1 Ktx/sec | 2.3 Ktx/sec |
| stdb_raw | 💿 | u32_u64_u64 | unique_0 | 2048 | 256 | 422.2±1.95µs | 388.3±0.44µs | 2.3 Ktx/sec | 2.5 Ktx/sec |
| stdb_raw | 🧠 | u32_u64_str | btree_each_column | 2048 | 256 | 532.6±0.24µs | 495.8±0.47µs | 1877 tx/sec | 2016 tx/sec |
| stdb_raw | 🧠 | u32_u64_str | unique_0 | 2048 | 256 | 435.2±0.49µs | 404.2±0.36µs | 2.2 Ktx/sec | 2.4 Ktx/sec |
| stdb_raw | 🧠 | u32_u64_u64 | btree_each_column | 2048 | 256 | 458.8±0.52µs | 326.8±0.19µs | 2.1 Ktx/sec | 3.0 Ktx/sec |
| stdb_raw | 🧠 | u32_u64_u64 | unique_0 | 2048 | 256 | 417.0±0.36µs | 289.1±0.49µs | 2.3 Ktx/sec | 3.4 Ktx/sec |

## iterate

| db | on disk | schema | indices | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|---|---|
| sqlite | 💿 | u32_u64_str | unique_0 | - | 20.1±0.05µs | - | 48.5 Ktx/sec |
| sqlite | 💿 | u32_u64_u64 | unique_0 | - | 19.1±0.08µs | - | 51.1 Ktx/sec |
| sqlite | 🧠 | u32_u64_str | unique_0 | - | 19.1±0.20µs | - | 51.2 Ktx/sec |
| sqlite | 🧠 | u32_u64_u64 | unique_0 | - | 17.8±0.11µs | - | 54.7 Ktx/sec |
| stdb_raw | 💿 | u32_u64_str | unique_0 | 18.9±0.02µs | 18.7±0.00µs | 51.6 Ktx/sec | 52.3 Ktx/sec |
| stdb_raw | 💿 | u32_u64_u64 | unique_0 | 16.2±0.05µs | 15.8±0.00µs | 60.4 Ktx/sec | 61.6 Ktx/sec |
| stdb_raw | 🧠 | u32_u64_str | unique_0 | 18.7±0.01µs | 18.6±0.00µs | 52.3 Ktx/sec | 52.4 Ktx/sec |
| stdb_raw | 🧠 | u32_u64_u64 | unique_0 | 15.8±0.00µs | 15.8±0.00µs | 61.7 Ktx/sec | 61.8 Ktx/sec |

## find_unique

| db | on disk | key type | preload | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|---|---|

## filter

| db | on disk | key type | index strategy | load | count | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|---|---|---|---|
| sqlite | 💿 | string | index | 2048 | 256 | - | 66.7±0.15µs | - | 14.6 Ktx/sec |
| sqlite | 💿 | u64 | index | 2048 | 256 | - | 63.1±0.10µs | - | 15.5 Ktx/sec |
| sqlite | 🧠 | string | index | 2048 | 256 | - | 63.7±0.21µs | - | 15.3 Ktx/sec |
| sqlite | 🧠 | u64 | index | 2048 | 256 | - | 58.1±0.19µs | - | 16.8 Ktx/sec |
| stdb_raw | 💿 | string | index | 2048 | 256 | 5.9±0.14µs | 5.7±0.00µs | 166.3 Ktx/sec | 171.4 Ktx/sec |
| stdb_raw | 💿 | u64 | index | 2048 | 256 | 5.9±0.02µs | 5.5±0.00µs | 166.8 Ktx/sec | 177.7 Ktx/sec |
| stdb_raw | 🧠 | string | index | 2048 | 256 | 5.6±0.00µs | 5.7±0.00µs | 175.4 Ktx/sec | 172.8 Ktx/sec |
| stdb_raw | 🧠 | u64 | index | 2048 | 256 | 5.5±0.00µs | 5.5±0.00µs | 178.2 Ktx/sec | 178.6 Ktx/sec |

## serialize

| schema | format | count | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|---|---|
| u32_u64_str | bsatn | 100 | 2.5±0.03µs | 2.5±0.04µs | 38.3 Mtx/sec | 38.6 Mtx/sec |
| u32_u64_str | json | 100 | 4.9±0.10µs | 5.0±0.02µs | 19.3 Mtx/sec | 19.0 Mtx/sec |
| u32_u64_str | product_value | 100 | 678.8±0.32ns | 647.1±0.61ns | 140.5 Mtx/sec | 147.4 Mtx/sec |
| u32_u64_u64 | bsatn | 100 | 1797.3±42.28ns | 1718.8±34.50ns | 53.1 Mtx/sec | 55.5 Mtx/sec |
| u32_u64_u64 | json | 100 | 3.2±0.06µs | 3.2±0.04µs | 30.2 Mtx/sec | 29.7 Mtx/sec |
| u32_u64_u64 | product_value | 100 | 560.0±1.89ns | 604.8±1.23ns | 170.3 Mtx/sec | 157.7 Mtx/sec |

## stdb_module_large_arguments

| arg size | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|
| 64KiB | 84.2±11.13µs | 77.7±6.74µs | - | - |

## stdb_module_print_bulk

| line count | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|
| 1 | 41.5±5.46µs | 39.5±3.89µs | - | - |
| 100 | 355.6±30.23µs | 353.9±33.31µs | - | - |
| 1000 | 2.6±0.49ms | 2.4±0.48ms | - | - |

## remaining

| name | new latency | old latency | new throughput | old throughput |
|---|---|---|---|---|
| sqlite/💿/update_bulk/u32_u64_str/unique_0/load=2048/count=256 | - | 46.7±0.16µs | - | 20.9 Ktx/sec |
| sqlite/💿/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 | - | 41.7±0.19µs | - | 23.4 Ktx/sec |
| sqlite/🧠/update_bulk/u32_u64_str/unique_0/load=2048/count=256 | - | 39.6±0.36µs | - | 24.6 Ktx/sec |
| sqlite/🧠/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 | - | 36.4±0.20µs | - | 26.9 Ktx/sec |
| stdb_module/💿/update_bulk/u32_u64_str/unique_0/load=2048/count=256 | 2.6±0.02ms | 3.2±0.00ms | 382 tx/sec | 316 tx/sec |
| stdb_module/💿/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 | 2.3±0.01ms | 2.2±0.00ms | 426 tx/sec | 451 tx/sec |
| stdb_raw/💿/update_bulk/u32_u64_str/unique_0/load=2048/count=256 | 835.9±7.06µs | 1120.4±1.97µs | 1196 tx/sec | 892 tx/sec |
| stdb_raw/💿/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 | 800.6±1.73µs | 760.8±1.09µs | 1249 tx/sec | 1314 tx/sec |
| stdb_raw/🧠/update_bulk/u32_u64_str/unique_0/load=2048/count=256 | 816.5±0.47µs | 792.8±0.40µs | 1224 tx/sec | 1261 tx/sec |
| stdb_raw/🧠/update_bulk/u32_u64_u64/unique_0/load=2048/count=256 | 787.4±0.34µs | 566.1±0.11µs | 1269 tx/sec | 1766 tx/sec |

github-actions[bot] commented 2 months ago
Callgrind benchmark results

# Callgrind Benchmark Report

These benchmarks were run using [callgrind](https://valgrind.org/docs/manual/cg-manual.html), an instruction-level profiler. They allow comparisons between sqlite (`sqlite`), SpacetimeDB running through a module (`stdb_module`), and the underlying SpacetimeDB data storage engine (`stdb_raw`). Callgrind emulates a CPU to collect the below estimates.

Measurement changes larger than five percent are in bold.

In-memory benchmarks

### callgrind: empty transaction

| db | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|
| stdb_raw | 6380 | 6148 | 3.77% | 7146 | 6970 | 2.53% |
| sqlite | 5402 | 5558 | -2.81% | 5802 | 6000 | -3.30% |

### callgrind: filter

| db | schema | indices | count | preload | _column | data_type | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 89870 | 89657 | 0.24% | 90760 | 90521 | 0.26% |
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 2 | string | 131154 | 130941 | 0.16% | 132326 | 132207 | 0.09% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 29652 | 29439 | 0.72% | 30744 | 30713 | 0.10% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 28610 | 28397 | 0.75% | 29512 | 29529 | -0.06% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 122944 | 122960 | -0.01% | 124248 | 124536 | -0.23% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 2 | string | 143597 | 143613 | -0.01% | 145191 | 145401 | -0.14% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 133460 | 133470 | -0.01% | 135280 | 135388 | -0.08% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 130261 | 130261 | 0.00% | 131759 | 131963 | -0.15% |

### callgrind: insert bulk

| db | schema | indices | count | preload | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | unique_0 | 64 | 128 | **1369221** | **1478440** | -7.39% | **1405007** | **1512342** | -7.10% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | **1529509** | **1627578** | -6.03% | **1583605** | **1672954** | -5.34% |
| sqlite | u32_u64_str | unique_0 | 64 | 128 | 396346 | 396352 | -0.00% | 411598 | 417236 | -1.35% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 981664 | 981680 | -0.00% | 1018004 | 1024926 | -0.68% |

### callgrind: iterate

| db | schema | indices | count | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | unique_0 | 1024 | 179413 | 179181 | 0.13% | 179821 | 179577 | 0.14% |
| stdb_raw | u32_u64_str | unique_0 | 64 | 19023 | 18791 | 1.23% | 19343 | 19071 | 1.43% |
| sqlite | u32_u64_str | unique_0 | 1024 | 1044435 | 1044684 | -0.02% | 1047765 | 1048016 | -0.02% |
| sqlite | u32_u64_str | unique_0 | 64 | 74507 | 74735 | -0.31% | 75723 | 75959 | -0.31% |

### callgrind: serialize_product_value

| count | format | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|
| 64 | json | 49521 | 49521 | 0.00% | 52309 | 52075 | 0.45% |
| 64 | bsatn | 26627 | 26625 | 0.01% | 28871 | 28835 | 0.12% |
| 16 | json | 12643 | 12643 | 0.00% | 14683 | 14483 | 1.38% |
| 16 | bsatn | 8357 | 8355 | 0.02% | 9683 | 9613 | 0.73% |

### callgrind: update bulk

| db | schema | indices | count | preload | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | unique_0 | 1024 | 1024 | **39772601** | **46712645** | -14.86% | **41304245** | **48124325** | -14.17% |
| stdb_raw | u32_u64_str | unique_0 | 64 | 128 | **2408124** | **2738032** | -12.05% | **2519274** | **2838136** | -11.23% |
| sqlite | u32_u64_str | unique_0 | 1024 | 1024 | 1801965 | 1801970 | -0.00% | 1811567 | 1811624 | -0.00% |
| sqlite | u32_u64_str | unique_0 | 64 | 128 | 128494 | 128484 | 0.01% | 131396 | 131542 | -0.11% |

On-disk benchmarks

### callgrind: empty transaction

| db | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|
| stdb_raw | **6897** | **6390** | 7.93% | **7749** | **7208** | 7.51% |
| sqlite | 5444 | 5600 | -2.79% | 5874 | 6072 | -3.26% |

### callgrind: filter

| db | schema | indices | count | preload | _column | data_type | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 90395 | 89899 | 0.55% | 91415 | 90843 | 0.63% |
| stdb_raw | u32_u64_str | no_index | 64 | 128 | 2 | string | 131679 | 131183 | 0.38% | 132961 | 132593 | 0.28% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 29135 | 28639 | 1.73% | 30203 | 29767 | 1.46% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 30177 | 29681 | 1.67% | 31451 | 30999 | 1.46% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 1 | u64 | 124880 | 124896 | -0.01% | 126606 | 126824 | -0.17% |
| sqlite | u32_u64_str | no_index | 64 | 128 | 2 | string | 145518 | 145534 | -0.01% | 147494 | 147708 | -0.14% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 2 | string | 135510 | 135520 | -0.01% | 137724 | 137696 | 0.02% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 1 | u64 | 132357 | 132367 | -0.01% | 134201 | 134355 | -0.11% |

### callgrind: insert bulk

| db | schema | indices | count | preload | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | unique_0 | 64 | 128 | **1330333** | **2289465** | -41.89% | **1367341** | **2325451** | -41.20% |
| stdb_raw | u32_u64_str | btree_each_column | 64 | 128 | **1482902** | **2442453** | -39.29% | **1536640** | **2484943** | -38.16% |
| sqlite | u32_u64_str | unique_0 | 64 | 128 | 413848 | 413864 | -0.00% | 428584 | 434394 | -1.34% |
| sqlite | u32_u64_str | btree_each_column | 64 | 128 | 1019854 | 1019864 | -0.00% | 1058096 | 1063322 | -0.49% |

### callgrind: iterate

| db | schema | indices | count | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | unique_0 | 1024 | 179938 | 179423 | 0.29% | 180464 | 179867 | 0.33% |
| stdb_raw | u32_u64_str | unique_0 | 64 | 19548 | 19033 | 2.71% | 19978 | 19353 | 3.23% |
| sqlite | u32_u64_str | unique_0 | 1024 | 1047503 | 1047737 | -0.02% | 1051285 | 1051565 | -0.03% |
| sqlite | u32_u64_str | unique_0 | 64 | 76273 | 76507 | -0.31% | 77559 | 77815 | -0.33% |

### callgrind: serialize_product_value

| count | format | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|
| 64 | json | 49521 | 49521 | 0.00% | 52309 | 52075 | 0.45% |
| 64 | bsatn | 26627 | 26625 | 0.01% | 28871 | 28835 | 0.12% |
| 16 | json | 12643 | 12643 | 0.00% | 14683 | 14483 | 1.38% |
| 16 | bsatn | 8357 | 8355 | 0.02% | 9683 | 9613 | 0.73% |

### callgrind: update bulk

| db | schema | indices | count | preload | total reads + writes | old total reads + writes | Δrw | estimated cycles | old estimated cycles | Δcycles |
|---|---|---|---|---|---|---|---|---|---|---|
| stdb_raw | u32_u64_str | unique_0 | 1024 | 1024 | **39425712** | **71962566** | -45.21% | **41249068** | **73721300** | -44.05% |
| stdb_raw | u32_u64_str | unique_0 | 64 | 128 | **2335991** | **4175749** | -44.06% | **2456289** | **4281785** | -42.63% |
| sqlite | u32_u64_str | unique_0 | 1024 | 1024 | 1809726 | 1809735 | -0.00% | 1818624 | 1818743 | -0.01% |
| sqlite | u32_u64_str | unique_0 | 64 | 128 | 132631 | 132632 | -0.00% | 135617 | 135678 | -0.04% |

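
For readers checking the Δ columns by hand, the figures in the Callgrind report are consistent with a plain relative change, (new - old) / old. The sketch below is not taken from the benchmark tooling; `percent_change` and `is_highlighted` are hypothetical names, and the five-percent threshold is assumed to apply in both directions.

```rust
/// Relative change of `new` versus `old`, in percent.
/// Assumption: this matches how the report's Δrw and Δcycles columns are derived.
fn percent_change(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Assumed reading of the report's rule that measurement changes
/// larger than five percent are shown in bold, regardless of sign.
fn is_highlighted(old: u64, new: u64) -> bool {
    percent_change(old, new).abs() > 5.0
}
```

For example, the in-memory empty transaction for stdb_raw goes from 6148 to 6380 reads and writes, a change of about 3.77%, which stays below the threshold. The on-disk run goes from 6390 to 6897, about 7.93%, which is why that row is bold.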
kim commented 2 months ago

@kazimuth It is quite possible that this skews some benchmarks, because appending to the log is just appending to a channel. Maybe draining that channel and flushing its contents to disk needs to be included in the measurement?
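
To make the concern concrete, here is a minimal sketch of why timing only the append can flatter the numbers. It does not use the actual commitlog or durability API: `ChannelLog`, `measure`, and the simulated per-record flush cost are all hypothetical, built on `std::sync::mpsc` and a background thread that stands in for the disk writer.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a durability layer: `append` only enqueues a
/// record, while a background thread drains the queue and "flushes" it.
struct ChannelLog {
    tx: Option<mpsc::Sender<Vec<u8>>>,
    worker: Option<thread::JoinHandle<usize>>,
}

impl ChannelLog {
    fn new(flush_cost: Duration) -> Self {
        let (tx, rx) = mpsc::channel::<Vec<u8>>();
        let worker = thread::spawn(move || {
            let mut flushed = 0;
            // The loop ends once every sender has been dropped.
            for record in rx {
                // Simulated cost of writing one record to disk (an assumption).
                thread::sleep(flush_cost);
                flushed += record.len();
            }
            flushed
        });
        Self { tx: Some(tx), worker: Some(worker) }
    }

    /// Returns as soon as the record is queued; nothing is on "disk" yet.
    fn append(&self, record: Vec<u8>) {
        self.tx
            .as_ref()
            .expect("log is open")
            .send(record)
            .expect("worker is alive");
    }

    /// Closes the channel and waits for the drain; returns the bytes flushed.
    fn close(mut self) -> usize {
        drop(self.tx.take());
        self.worker
            .take()
            .expect("worker is present")
            .join()
            .expect("worker panicked")
    }
}

/// Returns (time spent in `append` calls only, time including the drain,
/// total bytes flushed).
fn measure(records: usize, flush_cost: Duration) -> (Duration, Duration, usize) {
    let log = ChannelLog::new(flush_cost);
    let start = Instant::now();
    for _ in 0..records {
        log.append(vec![0u8; 16]);
    }
    let append_only = start.elapsed();
    let flushed = log.close();
    let with_drain = start.elapsed();
    (append_only, with_drain, flushed)
}
```

Calling `measure(50, Duration::from_millis(1))` should report an append-only time far below the time that includes the drain, since the second figure cannot be shorter than the simulated flush work. A benchmark that stops its timer after the last `append` would only see the first figure.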