ssbc / jitdb

A database on top of a log with automatic index generation and maintenance

ensure only one log.scan at a time #220

Closed staltz closed 2 years ago

staltz commented 2 years ago

Context

While experimenting with a deleteFeed-then-compact-then-reindex flow in Manyverse, I noticed something odd: post-compaction reindexing took 6.5 s, while initial indexing took only 4.0 s.

Problem

I investigated further and found the cause: initial indexing triggers jitdb's createIndexes, which does a single log scan for multiple indexes, whereas reindexing triggers jitdb's updateIndex, which does one log scan per index.

Solution

Unify updateIndex and createIndexes so that both creation and update are performed by the same log.stream. This also ensures that jitdb never runs concurrent log.streams: at any given point in time, there is at most one jitdb log.stream in progress.

PS: The git diff is pretty nasty to read, but in reality the new implementation of updateIndexes() follows closely the structure of the two old functions.
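
For readers skimming the diff, here is a rough sketch of the general idea behind the unification: a single pass over the log serves every index that needs work, whether it is brand new or just behind. This is not the jitdb implementation. The names `Msg`, `Index`, `nextSeq`, `makeIndex`, and the `scans` counter are all hypothetical, the log is a plain in-memory array rather than a real log.stream, and the guard against concurrent streams is not modeled here.

```typescript
// Illustrative sketch only: none of these names are jitdb's actual API.
type Msg = { author: string; type: string };

interface Index {
  name: string;
  matches: (msg: Msg) => boolean; // decides whether a record belongs in this index
  seqs: number[]; // log positions of matching records
  nextSeq: number; // first log position not yet indexed (0 for a new index)
}

let scans = 0; // counts passes over the log, for demonstration only

// Create or update every given index in ONE pass over the log.
function updateIndexes(log: Msg[], indexes: Index[]): void {
  if (indexes.length === 0) return;
  // Start where the least up-to-date index left off.
  const start = Math.min(...indexes.map((idx) => idx.nextSeq));
  scans += 1;
  for (let seq = start; seq < log.length; seq++) {
    const msg = log[seq];
    for (const idx of indexes) {
      if (seq < idx.nextSeq) continue; // this index has already seen the record
      if (idx.matches(msg)) idx.seqs.push(seq);
      idx.nextSeq = seq + 1;
    }
  }
}

function makeIndex(name: string, matches: (msg: Msg) => boolean): Index {
  return { name, matches, seqs: [], nextSeq: 0 };
}

const log: Msg[] = [
  { author: "alice", type: "post" },
  { author: "bob", type: "vote" },
  { author: "alice", type: "vote" },
];

const posts = makeIndex("type_post", (m) => m.type === "post");
const byAlice = makeIndex("author_alice", (m) => m.author === "alice");

updateIndexes(log, [posts]); // creates one index: first pass
log.push({ author: "alice", type: "post" });
updateIndexes(log, [posts, byAlice]); // updates `posts` and creates `byAlice`: one more pass
```

In this toy version the second call costs one pass even though it mixes an update with a creation, which is the property the unified function is meant to provide. Handling each index in its own call would cost one pass per index.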

github-actions[bot] commented 2 years ago

Benchmark results

| Part | Speed | Heap Change | Samples |
| --- | --- | --- | --- |
| Count 1 big index (3rd run) | 0.61ms ± 0.44ms | 44.32 B ± 25314.7 B | 50 |
| Create an index twice concurrently | 607.43ms ± 3.64ms | 16.86 kB ± 23.26 kB | 90 |
| Load core indexes | 1.74ms ± 0.04ms | 141.44 B ± 464.19 B | 4985 |
| Load two indexes concurrently | 572.1ms ± 6ms | -84.11 kB ± 165.12 kB | 20 |
| Paginate 10 results | 24.61ms ± 0.52ms | -455.04 B ± 46796.6 B | 25 |
| Paginate 20000 msgs with pageSize=5 | 7520.22ms ± 470.71ms | 2.8 MB ± 3.77 MB | 5 |
| Paginate 20000 msgs with pageSize=500 | 697.21ms ± 9.07ms | 37.76 kB ± 257.57 kB | 16 |
| Query 1 big index (1st run) | 981.8ms ± 6.34ms | -110.29 kB ± 89.78 kB | 54 |
| Query 1 big index (2nd run) | 329.73ms ± 2.02ms | 17.11 kB ± 42.97 kB | 44 |
| Query 3 indexes (1st run) | 834.48ms ± 3.81ms | -44.84 kB ± 59.79 kB | 65 |
| Query 3 indexes (2nd run) | 272.87ms ± 1.01ms | -3.09 kB ± 32.81 kB | 48 |
| Query a prefix map (1st run) | 263.44ms ± 4.89ms | 12.79 kB ± 150.62 kB | 24 |
| Query a prefix map (2nd run) | 10.35ms ± 0.58ms | 89.58 kB ± 255.12 kB | 23 |

staltz commented 2 years ago

Previous benchmark results, for comparison: https://github.com/ssbc/jitdb/pull/219

Spoiler: looks okay

arj03 commented 2 years ago

Had a few comments, mostly to understand the changes. It is really nice that you unified these functions, not only because of the performance, which is great, but also from a maintenance perspective. Net -100 lines is always nice :)

github-actions[bot] commented 2 years ago

Benchmark results

| Part | Speed | Heap Change | Samples |
| --- | --- | --- | --- |
| Count 1 big index (3rd run) | 0.75ms ± 0.53ms | -6.94 kB ± 24.49 kB | 49 |
| Create an index twice concurrently | 877.73ms ± 9.14ms | -1.78 kB ± 66.21 kB | 61 |
| Load core indexes | 1.17ms ± 0.02ms | 121.28 B ± 296.83 B | 7067 |
| Load two indexes concurrently | 701.67ms ± 11.2ms | 216.68 kB ± 267.44 kB | 17 |
| Paginate 10 results | 24.01ms ± 0.44ms | 12.69 kB ± 38.03 kB | 25 |
| Paginate 20000 msgs with pageSize=5 | 9445.31ms ± 90.38ms | -2.04 MB ± 4.33 MB | 5 |
| Paginate 20000 msgs with pageSize=500 | 531.92ms ± 4.69ms | 66.34 kB ± 713.87 kB | 21 |
| Query 1 big index (1st run) | 796.79ms ± 5.7ms | -23.59 kB ± 63.9 kB | 68 |
| Query 1 big index (2nd run) | 323.95ms ± 1.06ms | 13.98 kB ± 27.4 kB | 47 |
| Query 3 indexes (1st run) | 1098.91ms ± 10.43ms | -75.39 kB ± 82.17 kB | 48 |
| Query 3 indexes (2nd run) | 262.13ms ± 1.21ms | 44.23 kB ± 141.72 kB | 51 |
| Query a prefix map (1st run) | 335.4ms ± 4.57ms | -562.97 kB ± 870.77 kB | 19 |
| Query a prefix map (2nd run) | 11.2ms ± 0.86ms | -36.4 kB ± 200.09 kB | 23 |

github-actions[bot] commented 2 years ago

Benchmark results

| Part | Speed | Heap Change | Samples |
| --- | --- | --- | --- |
| Count 1 big index (3rd run) | 0.52ms ± 0.33ms | 17.87 kB ± 46.83 kB | 47 |
| Create an index twice concurrently | 774.33ms ± 9.26ms | -4.77 kB ± 30.54 kB | 69 |
| Load core indexes | 1.2ms ± 0.02ms | 112.16 B ± 300.01 B | 7014 |
| Load two indexes concurrently | 582.43ms ± 5.01ms | 46.86 kB ± 172.42 kB | 20 |
| Paginate 10 results | 21.97ms ± 0.47ms | 12.67 kB ± 41.41 kB | 27 |
| Paginate 20000 msgs with pageSize=5 | 7854.52ms ± 308.76ms | -2.57 MB ± 1.2 MB | 5 |
| Paginate 20000 msgs with pageSize=500 | 554.98ms ± 11.75ms | 796.6 kB ± 915.98 kB | 20 |
| Query 1 big index (1st run) | 856.47ms ± 14.2ms | 5.86 kB ± 77.02 kB | 63 |
| Query 1 big index (2nd run) | 324.25ms ± 1.49ms | 12.18 kB ± 27.07 kB | 47 |
| Query 3 indexes (1st run) | 977.51ms ± 30.77ms | 39.88 kB ± 69.8 kB | 54 |
| Query 3 indexes (2nd run) | 263.37ms ± 1.1ms | 10.18 kB ± 183.93 kB | 51 |
| Query a prefix map (1st run) | 344.04ms ± 2.99ms | 65.66 kB ± 202.8 kB | 19 |
| Query a prefix map (2nd run) | 10.28ms ± 0.53ms | 82.86 kB ± 167.83 kB | 23 |