ensure only one log.scan at a time

staltz commented 2 years ago

Context

I was experimenting with Manyverse doing deleteFeed-then-compact-then-reindex, and it was odd that post-compaction reindexing lasted 6.5sec while initial indexing lasted 4.0sec.

Problem

I dug deeper and found the reason: initial indexing triggers jitdb's createIndexes, which does one log scan for multiple indexes, whereas reindexing triggers jitdb's updateIndex, which does one log scan for each index.
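To make the difference in scan counts concrete, here is a minimal sketch. It is not jitdb's code: `makeLog`, `updateOneByOne`, `updateTogether`, and the predicate-style `indexes` are hypothetical names chosen for illustration, and the log is just an in-memory array that counts how many times it is scanned.

```javascript
// Hypothetical in-memory stand-in for the log; NOT jitdb's real API.
// It only counts how many full passes were made over the records.
function makeLog(records) {
  let passes = 0
  return {
    scan(onRecord) {
      passes += 1
      records.forEach((record, seq) => onRecord(record, seq))
    },
    get passes() {
      return passes
    },
  }
}

// Assumption for illustration: an "index" is a predicate, and the index
// content is the list of sequence numbers whose record matches it.
const indexes = {
  isPost: (msg) => msg.type === 'post',
  isVote: (msg) => msg.type === 'vote',
  isContact: (msg) => msg.type === 'contact',
}

// One log scan per index (the shape described for updateIndex).
function updateOneByOne(log, indexes) {
  const result = {}
  for (const [name, matches] of Object.entries(indexes)) {
    result[name] = []
    log.scan((msg, seq) => {
      if (matches(msg)) result[name].push(seq)
    })
  }
  return result
}

// One log scan shared by all indexes (the shape described for createIndexes).
function updateTogether(log, indexes) {
  const entries = Object.entries(indexes)
  const result = {}
  for (const [name] of entries) result[name] = []
  log.scan((msg, seq) => {
    for (const [name, matches] of entries) {
      if (matches(msg)) result[name].push(seq)
    }
  })
  return result
}
```

Both functions produce the same index contents, but with three indexes the first makes three passes over the log and the second makes one, so the gap should grow with the number of indexes being rebuilt.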

Solution

Unify updateIndex and createIndexes such that both creation and update are performed by the same log.stream. This additionally ensures that jitdb never runs concurrent log.streams: at any given point in time, there is at most one jitdb log.stream ongoing.
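One way to get the "only one scan at a time" guarantee is to queue requests that arrive while a scan is running and serve them all with the next scan. The sketch below shows that idea only; it is not the implementation in this PR. `makeIndexUpdater` and `scanLog` are hypothetical names, and `scanLog(names)` is assumed to return a promise that resolves once a single pass over the log has updated every named index.

```javascript
// Hypothetical sketch, not jitdb's implementation.
// `scanLog(names)` is an assumed function: it performs ONE pass over the
// log that updates all indexes in `names`, and returns a promise.
function makeIndexUpdater(scanLog) {
  let scanning = false
  let queued = [] // requests waiting for the next scan: { names, resolve, reject }

  async function drain() {
    scanning = true
    while (queued.length > 0) {
      // Take everything queued so far and serve it with a single scan.
      const batch = queued
      queued = []
      const names = [...new Set(batch.flatMap((req) => req.names))]
      try {
        await scanLog(names)
        batch.forEach((req) => req.resolve())
      } catch (err) {
        batch.forEach((req) => req.reject(err))
      }
    }
    scanning = false
  }

  // Request that the given indexes be brought up to date.
  return function updateIndexes(names) {
    return new Promise((resolve, reject) => {
      queued.push({ names, resolve, reject })
      if (!scanning) drain()
    })
  }
}
```

With this shape, a request made while idle starts a scan immediately, and any requests made during that scan are merged (with duplicate index names removed) into one follow-up scan, so two scans never overlap.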

PS: The git diff is pretty nasty to read, but in reality the new implementation of updateIndexes() follows closely the structure of the two old functions.

github-actions[bot] commented 2 years ago

Benchmark results

Part	Speed	Heap Change	Samples
Count 1 big index (3rd run)	0.61ms ± 0.44ms	44.32 B ± 25314.7 B	50
Create an index twice concurrently	607.43ms ± 3.64ms	16.86 kB ± 23.26 kB	90
Load core indexes	1.74ms ± 0.04ms	141.44 B ± 464.19 B	4985
Load two indexes concurrently	572.1ms ± 6ms	-84.11 kB ± 165.12 kB	20
Paginate 10 results	24.61ms ± 0.52ms	-455.04 B ± 46796.6 B	25
Paginate 20000 msgs with pageSize=5	7520.22ms ± 470.71ms	2.8 MB ± 3.77 MB	5
Paginate 20000 msgs with pageSize=500	697.21ms ± 9.07ms	37.76 kB ± 257.57 kB	16
Query 1 big index (1st run)	981.8ms ± 6.34ms	-110.29 kB ± 89.78 kB	54
Query 1 big index (2nd run)	329.73ms ± 2.02ms	17.11 kB ± 42.97 kB	44
Query 3 indexes (1st run)	834.48ms ± 3.81ms	-44.84 kB ± 59.79 kB	65
Query 3 indexes (2nd run)	272.87ms ± 1.01ms	-3.09 kB ± 32.81 kB	48
Query a prefix map (1st run)	263.44ms ± 4.89ms	12.79 kB ± 150.62 kB	24
Query a prefix map (2nd run)	10.35ms ± 0.58ms	89.58 kB ± 255.12 kB	23

staltz commented 2 years ago

Previous benchmark results, for comparison: https://github.com/ssbc/jitdb/pull/219

Spoiler: looks okay

arj03 commented 2 years ago

Had a few comments, mostly to understand the changes. It is really nice that you unified these functions, not only because of the performance, which is great, but also from a maintenance perspective. Net -100 lines is always nice :)

github-actions[bot] commented 2 years ago

Benchmark results

Part	Speed	Heap Change	Samples
Count 1 big index (3rd run)	0.75ms ± 0.53ms	-6.94 kB ± 24.49 kB	49
Create an index twice concurrently	877.73ms ± 9.14ms	-1.78 kB ± 66.21 kB	61
Load core indexes	1.17ms ± 0.02ms	121.28 B ± 296.83 B	7067
Load two indexes concurrently	701.67ms ± 11.2ms	216.68 kB ± 267.44 kB	17
Paginate 10 results	24.01ms ± 0.44ms	12.69 kB ± 38.03 kB	25
Paginate 20000 msgs with pageSize=5	9445.31ms ± 90.38ms	-2.04 MB ± 4.33 MB	5
Paginate 20000 msgs with pageSize=500	531.92ms ± 4.69ms	66.34 kB ± 713.87 kB	21
Query 1 big index (1st run)	796.79ms ± 5.7ms	-23.59 kB ± 63.9 kB	68
Query 1 big index (2nd run)	323.95ms ± 1.06ms	13.98 kB ± 27.4 kB	47
Query 3 indexes (1st run)	1098.91ms ± 10.43ms	-75.39 kB ± 82.17 kB	48
Query 3 indexes (2nd run)	262.13ms ± 1.21ms	44.23 kB ± 141.72 kB	51
Query a prefix map (1st run)	335.4ms ± 4.57ms	-562.97 kB ± 870.77 kB	19
Query a prefix map (2nd run)	11.2ms ± 0.86ms	-36.4 kB ± 200.09 kB	23

github-actions[bot] commented 2 years ago

Benchmark results

Part	Speed	Heap Change	Samples
Count 1 big index (3rd run)	0.52ms ± 0.33ms	17.87 kB ± 46.83 kB	47
Create an index twice concurrently	774.33ms ± 9.26ms	-4.77 kB ± 30.54 kB	69
Load core indexes	1.2ms ± 0.02ms	112.16 B ± 300.01 B	7014
Load two indexes concurrently	582.43ms ± 5.01ms	46.86 kB ± 172.42 kB	20
Paginate 10 results	21.97ms ± 0.47ms	12.67 kB ± 41.41 kB	27
Paginate 20000 msgs with pageSize=5	7854.52ms ± 308.76ms	-2.57 MB ± 1.2 MB	5
Paginate 20000 msgs with pageSize=500	554.98ms ± 11.75ms	796.6 kB ± 915.98 kB	20
Query 1 big index (1st run)	856.47ms ± 14.2ms	5.86 kB ± 77.02 kB	63
Query 1 big index (2nd run)	324.25ms ± 1.49ms	12.18 kB ± 27.07 kB	47
Query 3 indexes (1st run)	977.51ms ± 30.77ms	39.88 kB ± 69.8 kB	54
Query 3 indexes (2nd run)	263.37ms ± 1.1ms	10.18 kB ± 183.93 kB	51
Query a prefix map (1st run)	344.04ms ± 2.99ms	65.66 kB ± 202.8 kB	19
Query a prefix map (2nd run)	10.28ms ± 0.53ms	82.86 kB ± 167.83 kB	23
