staltz closed 2 years ago
Part | Speed | Heap Change | Samples |
---|---|---|---|
Count 1 big index (3rd run) | 0.61ms ± 0.44ms | 44.32 B ± 25314.7 B | 50 |
Create an index twice concurrently | 607.43ms ± 3.64ms | 16.86 kB ± 23.26 kB | 90 |
Load core indexes | 1.74ms ± 0.04ms | 141.44 B ± 464.19 B | 4985 |
Load two indexes concurrently | 572.1ms ± 6ms | -84.11 kB ± 165.12 kB | 20 |
Paginate 10 results | 24.61ms ± 0.52ms | -455.04 B ± 46796.6 B | 25 |
Paginate 20000 msgs with pageSize=5 | 7520.22ms ± 470.71ms | 2.8 MB ± 3.77 MB | 5 |
Paginate 20000 msgs with pageSize=500 | 697.21ms ± 9.07ms | 37.76 kB ± 257.57 kB | 16 |
Query 1 big index (1st run) | 981.8ms ± 6.34ms | -110.29 kB ± 89.78 kB | 54 |
Query 1 big index (2nd run) | 329.73ms ± 2.02ms | 17.11 kB ± 42.97 kB | 44 |
Query 3 indexes (1st run) | 834.48ms ± 3.81ms | -44.84 kB ± 59.79 kB | 65 |
Query 3 indexes (2nd run) | 272.87ms ± 1.01ms | -3.09 kB ± 32.81 kB | 48 |
Query a prefix map (1st run) | 263.44ms ± 4.89ms | 12.79 kB ± 150.62 kB | 24 |
Query a prefix map (2nd run) | 10.35ms ± 0.58ms | 89.58 kB ± 255.12 kB | 23 |
Previous benchmark results, for comparison: https://github.com/ssbc/jitdb/pull/219
Spoiler: looks okay
Had a few comments, mostly to understand the changes. It is really nice that you unified these functions, not only because of the performance, which is great, but also from a maintenance perspective. Net -100 lines is always nice :)
Part | Speed | Heap Change | Samples |
---|---|---|---|
Count 1 big index (3rd run) | 0.75ms ± 0.53ms | -6.94 kB ± 24.49 kB | 49 |
Create an index twice concurrently | 877.73ms ± 9.14ms | -1.78 kB ± 66.21 kB | 61 |
Load core indexes | 1.17ms ± 0.02ms | 121.28 B ± 296.83 B | 7067 |
Load two indexes concurrently | 701.67ms ± 11.2ms | 216.68 kB ± 267.44 kB | 17 |
Paginate 10 results | 24.01ms ± 0.44ms | 12.69 kB ± 38.03 kB | 25 |
Paginate 20000 msgs with pageSize=5 | 9445.31ms ± 90.38ms | -2.04 MB ± 4.33 MB | 5 |
Paginate 20000 msgs with pageSize=500 | 531.92ms ± 4.69ms | 66.34 kB ± 713.87 kB | 21 |
Query 1 big index (1st run) | 796.79ms ± 5.7ms | -23.59 kB ± 63.9 kB | 68 |
Query 1 big index (2nd run) | 323.95ms ± 1.06ms | 13.98 kB ± 27.4 kB | 47 |
Query 3 indexes (1st run) | 1098.91ms ± 10.43ms | -75.39 kB ± 82.17 kB | 48 |
Query 3 indexes (2nd run) | 262.13ms ± 1.21ms | 44.23 kB ± 141.72 kB | 51 |
Query a prefix map (1st run) | 335.4ms ± 4.57ms | -562.97 kB ± 870.77 kB | 19 |
Query a prefix map (2nd run) | 11.2ms ± 0.86ms | -36.4 kB ± 200.09 kB | 23 |
Part | Speed | Heap Change | Samples |
---|---|---|---|
Count 1 big index (3rd run) | 0.52ms ± 0.33ms | 17.87 kB ± 46.83 kB | 47 |
Create an index twice concurrently | 774.33ms ± 9.26ms | -4.77 kB ± 30.54 kB | 69 |
Load core indexes | 1.2ms ± 0.02ms | 112.16 B ± 300.01 B | 7014 |
Load two indexes concurrently | 582.43ms ± 5.01ms | 46.86 kB ± 172.42 kB | 20 |
Paginate 10 results | 21.97ms ± 0.47ms | 12.67 kB ± 41.41 kB | 27 |
Paginate 20000 msgs with pageSize=5 | 7854.52ms ± 308.76ms | -2.57 MB ± 1.2 MB | 5 |
Paginate 20000 msgs with pageSize=500 | 554.98ms ± 11.75ms | 796.6 kB ± 915.98 kB | 20 |
Query 1 big index (1st run) | 856.47ms ± 14.2ms | 5.86 kB ± 77.02 kB | 63 |
Query 1 big index (2nd run) | 324.25ms ± 1.49ms | 12.18 kB ± 27.07 kB | 47 |
Query 3 indexes (1st run) | 977.51ms ± 30.77ms | 39.88 kB ± 69.8 kB | 54 |
Query 3 indexes (2nd run) | 263.37ms ± 1.1ms | 10.18 kB ± 183.93 kB | 51 |
Query a prefix map (1st run) | 344.04ms ± 2.99ms | 65.66 kB ± 202.8 kB | 19 |
Query a prefix map (2nd run) | 10.28ms ± 0.53ms | 82.86 kB ± 167.83 kB | 23 |
Context
I was experimenting with Manyverse doing deleteFeed-then-compact-then-reindex, and it was odd that post-compaction reindexing lasted 6.5sec while initial indexing lasted 4.0sec.
Problem
I investigated deeper and the reason was that initial indexing triggers jitdb `createIndexes`, which does one log scan for multiple indexes, but reindexing triggers jitdb `updateIndex`, which does one log scan for each index.
Solution
Unify `updateIndex` and `createIndexes` such that both creation and update are performed by the same log.stream. This additionally ensures that jitdb never does concurrent log.streams: at any given point in time there is only one jitdb log.stream ongoing.
PS: The git diff is pretty nasty to read, but in reality the new implementation of `updateIndexes()` follows closely the structure of the two old functions.
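To illustrate the difference between one scan per index and one scan for all indexes, here is a minimal sketch. It is not jitdb's actual code: `makeLog`, `makeIndex`, `updateEachIndexSeparately` and `updateIndexesInOneScan` are hypothetical names, and the log is a plain in-memory array standing in for the real log.stream.

```javascript
// Hypothetical in-memory stand-in for the log (assumption, not jitdb's API).
// It counts how many scans were started so the two strategies can be compared.
function makeLog(records) {
  let scans = 0
  return {
    // Calls onRecord(offset, record) for every record at or after fromOffset.
    scan(fromOffset, onRecord) {
      scans += 1
      for (let offset = fromOffset; offset < records.length; offset += 1) {
        onRecord(offset, records[offset])
      }
    },
    scanCount: () => scans,
  }
}

// Hypothetical index shape: a predicate plus the offsets that matched so far.
// nextOffset is the first log offset this index has not seen yet.
function makeIndex(name, matches, nextOffset = 0) {
  return { name, matches, offsets: [], nextOffset }
}

// Shape of the old reindexing path: one pass over the log for each index.
function updateEachIndexSeparately(log, indexes) {
  for (const index of indexes) {
    log.scan(index.nextOffset, (offset, record) => {
      if (index.matches(record)) index.offsets.push(offset)
      index.nextOffset = offset + 1
    })
  }
}

// Shape of the unified path: a single pass feeds every index,
// whether it is brand new (nextOffset 0) or only partially out of date.
function updateIndexesInOneScan(log, indexes) {
  if (indexes.length === 0) return
  const start = Math.min(...indexes.map((index) => index.nextOffset))
  log.scan(start, (offset, record) => {
    for (const index of indexes) {
      if (offset < index.nextOffset) continue // this index already has it
      if (index.matches(record)) index.offsets.push(offset)
      index.nextOffset = offset + 1
    }
  })
}
```

With N indexes, the first function starts N scans while the second starts one, and both end with the same offsets in each index. Starting the single scan at the smallest `nextOffset` is one way to let creation and update share the same pass in this sketch.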