Closed staltz closed 2 years ago
Oh my 😱 It has been hiding in plain sight. Extra sneaky because of this:
const count = seq // incremented at the end of write()
save(count,....
Part | Speed | Heap Change | Samples |
---|---|---|---|
Count 1 big index (3rd run) | 0.46ms ± 0.24ms | 10.94 kB ± 30.1 kB | 47 |
Create an index twice concurrently | 711.42ms ± 4.55ms | 11.86 kB ± 27.74 kB | 75 |
Load core indexes | 1.65ms ± 0.03ms | 90.38 B ± 410.46 B | 5382 |
Load two indexes concurrently | 555.25ms ± 3.78ms | 117.33 kB ± 233.06 kB | 20 |
Paginate 10 results | 22.5ms ± 0.97ms | 8.69 kB ± 75.13 kB | 24 |
Paginate 20000 msgs with pageSize=5 | 6831.19ms ± 219.93ms | 2.59 MB ± 3.75 MB | 5 |
Paginate 20000 msgs with pageSize=500 | 645.46ms ± 8.67ms | 32.31 kB ± 940.93 kB | 18 |
Query 1 big index (1st run) | 806.64ms ± 4.76ms | 53.3 kB ± 71.45 kB | 67 |
Query 1 big index (2nd run) | 323.84ms ± 1.52ms | 10.59 kB ± 38.84 kB | 45 |
Query 3 indexes (1st run) | 809.17ms ± 3.51ms | 40.08 kB ± 61.2 kB | 67 |
Query 3 indexes (2nd run) | 308.8ms ± 7.05ms | -16.43 kB ± 37.65 kB | 39 |
Query a prefix map (1st run) | 337.61ms ± 2.13ms | 253.71 kB ± 428.17 kB | 20 |
Query a prefix map (2nd run) | 14.46ms ± 0.98ms | 12.86 kB ± 282.76 kB | 18 |
Yeah this one was very sneaky. It's sad that the solution to this bug is literally two keyboard strokes. :smiling_face_with_tear:
Part | Speed | Heap Change | Samples |
---|---|---|---|
Count 1 big index (3rd run) | 0.64ms ± 0.44ms | -12.27 kB ± 42.99 kB | 37 |
Create an index twice concurrently | 813.49ms ± 5.5ms | -5.89 kB ± 35.12 kB | 66 |
Load core indexes | 1.73ms ± 0.03ms | 94.2 B ± 469.71 B | 5050 |
Load two indexes concurrently | 707.38ms ± 16.67ms | 248.49 kB ± 302.46 kB | 16 |
Paginate 10 results | 20.2ms ± 0.18ms | 52.4 kB ± 67.7 kB | 28 |
Paginate 20000 msgs with pageSize=5 | 8054.54ms ± 176.29ms | -145.32 kB ± 1419.16 kB | 5 |
Paginate 20000 msgs with pageSize=500 | 556.7ms ± 12.84ms | 259.1 kB ± 555.41 kB | 20 |
Query 1 big index (1st run) | 752.23ms ± 5.21ms | -5.01 kB ± 68.58 kB | 72 |
Query 1 big index (2nd run) | 319.28ms ± 2.24ms | -4.92 kB ± 36.04 kB | 45 |
Query 3 indexes (1st run) | 867.16ms ± 5.5ms | 22.55 kB ± 77.74 kB | 62 |
Query 3 indexes (2nd run) | 332.94ms ± 3.95ms | -10.46 kB ± 39.8 kB | 37 |
Query a prefix map (1st run) | 310.24ms ± 4.59ms | 115.64 kB ± 119.92 kB | 21 |
Query a prefix map (2nd run) | 12.77ms ± 1.65ms | 75.81 kB ± 219.88 kB | 20 |
Context
https://github.com/ssbc/jitdb/issues/234
Problem
We have an off-by-one mistake in the usage of the
save(count, offset, done)
function inupdateIndexes()
.The first argument is
count
but we passedseq
to it, while the intention was to passseq + 1
, because seq is zero-based, it's used to pick items from thetarr
array.When the log stream ends, we call
save()
correctly, but when we callsave()
every 1min during the log stream, we passed the wrong value. In Manyverse, this meant that while indexing was in-progress, the intermediate saved states were not correct, they only got correct when the log stream ended. So if the app crashed in the middle, it would start from the wrong point on the next app startup.Before this PR, the
getSeqFromOffsets
would give the wrong value, but it was doing a binary search from positions0
tocount - 1
in thetarr
, while the correct value was actually at positioncount
. So this meant thatgetSeqFromOffsets
returned seq0
, and this would then create a disalignment between "seqs" and "offsets" on other indexes.I have a strong suspicion that this is THE indexing bug that we've been trying to hunt down for months/years now, because this bug has been here since jitdb@2.3.3, which is the version of jitdb we used in Manyverse the first time we put ssb-db2 in.
Solution
Make sure that
save()
's first argument is always semantically the "count".