use bipf 1.6.0 seekKeyCached

staltz commented 2 years ago

Context: https://github.com/ssb-ngi-pointer/jitdb/pull/208

Uses BIPF 1.6.0's seekKeyCached API instead of passing pValue around.

@arj03 How did you run those benchmarks in your PR? Could you try running them with this code? I know you did a good job in #208, but I think seekKeyCached is going to help us with pValueContent and potentially others.

arj03 commented 2 years ago

Yeah this is great. Can you create a db2 branch that builds on top of this? That makes it easier for me to test.

staltz commented 2 years ago

Working on it!

staltz commented 2 years ago

@arj03 One problem is that we have to have global and unique buffers for B_VALUE and others, otherwise we can end up having two B_VALUEs which are in different memory addresses, and the WeakMap idea won't work on them.

E.g.

const a = Buffer.from('value')
const b = Buffer.from('value')

a === b // false

const weakMap = new WeakMap()
weakMap.set(a, 3)
weakMap.has(b) // false
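
A minimal sketch of the "global and unique buffers" requirement, assuming each key buffer is allocated once at module level and that same reference is reused everywhere. `B_VALUE` is the name used in this thread; `offsets` and `lookup` are hypothetical names for illustration only, not part of bipf or jitdb.

```javascript
// Allocate the key buffer exactly once and share this reference everywhere.
const B_VALUE = Buffer.from('value')

// Hypothetical cache keyed by buffer identity (not by buffer contents).
const offsets = new WeakMap()
offsets.set(B_VALUE, 3)

function lookup(keyBuffer) {
  return offsets.get(keyBuffer)
}

// Same reference: the cached entry is found.
const sameRef = lookup(B_VALUE)

// Equal bytes but a different object: the lookup misses.
const otherCopy = lookup(Buffer.from('value'))

console.log(sameRef, otherCopy)
```

The second lookup misses because a WeakMap compares keys by object identity, so any module that builds its own `Buffer.from('value')` silently bypasses the cache.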

arj03 commented 2 years ago

Ugh, would it be possible to pass in value as a string? I think equal strings match the same address, or am I confusing JavaScript with .NET ;-)?

staltz commented 2 years ago

Yeah, that's what I'm thinking about. bipf.seekKey(buffer, start, target) supports both a string target and a buffer target, which means we need to support both in bipf.seekKeyCached as well. So I could use strings to do the cache lookups. If you pass a buffer target to bipf.seekKeyCached(buffer, start, target) you would lose some of the performance benefit, but it would still work correctly.
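
A rough sketch of how a string-keyed cache like this could look. It is not bipf's actual implementation. Because a WeakMap only accepts objects as keys, the sketch assumes the message buffer keys the WeakMap and the string target keys a plain Map nested under it. `makeCachedSeek` and `fakeSeek` are hypothetical names, and `fakeSeek` is only a stand-in for an expensive seek.

```javascript
// Hypothetical layout: message buffer -> Map(start -> Map(target string -> result))
const cache = new WeakMap()

function makeCachedSeek(seek) {
  return function seekCached(buffer, start, target) {
    // Buffer targets have no stable identity, so fall back to the uncached
    // seek: slower, but still correct.
    if (typeof target !== 'string') return seek(buffer, start, target)

    let byStart = cache.get(buffer)
    if (!byStart) cache.set(buffer, (byStart = new Map()))

    let byTarget = byStart.get(start)
    if (!byTarget) byStart.set(start, (byTarget = new Map()))

    // Strings are compared by value, so equal strings hit the same entry.
    if (byTarget.has(target)) return byTarget.get(target)

    const result = seek(buffer, start, target)
    byTarget.set(target, result)
    return result
  }
}

// Stand-in for an expensive seek; counts how often it really runs.
let calls = 0
function fakeSeek(buffer, start, target) {
  calls++
  return buffer.indexOf(target, start)
}

const seekCached = makeCachedSeek(fakeSeek)
const msg = Buffer.from('{"value":{"author":"@abc"}}')

const first = seekCached(msg, 0, 'value') // runs fakeSeek
const second = seekCached(msg, 0, 'value') // served from the cache
const viaBuffer = seekCached(msg, 0, Buffer.from('value')) // uncached path

console.log(first, second, viaBuffer, calls)
```

Keying the outer WeakMap on the message buffer means cached entries can be garbage-collected together with the message, while string targets avoid the identity problem that separately allocated key buffers have.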

arj03 commented 2 years ago

My benchmark script, it's a bit hacky but gets the job done :)

const SecretStack = require('secret-stack')
const caps = require('ssb-caps')
const path = require('path')
const ssbKeys = require('ssb-keys')
const { author } = require('../operators')

// Scratch copy of the database; keys come from my own ~/.ssb
const dir = './perf-testing/ssb'
const keys = ssbKeys.loadOrCreateSync(path.join('/home/arj/.ssb', 'secret'))

const ssb = SecretStack({ appKey: caps.shs })
  .use(require('../'))
  .use(require('../compat/ebt')) // ebt db helpers
  .use(require('../full-mentions')) // include index
  .call(null, {
    keys,
    path: dir,
    db2: {
      startDecryptBox1: "2022-03-25",
    }
  })

console.log('Doing the query')
console.time('query')

// Time a single author query run directly against jitdb
ssb.db.getJITDB().all(
  author('@QlCTpvY7p9ty2yOFrv1WU1AE88aoQc4Y7wYal7PFc+w=.ed25519'),
  0, false, false, null, (err, results) => {
    console.timeEnd('query')
    if (err) throw err // don't report a result count for a failed query
    console.log(results.length)
  }
)

Then I delete all indexes to test both level + jitdb, or run only:

rm perf-testing/ssb/db2/indexes/seq* perf-testing/ssb/db2/indexes/timestamp.index perf-testing/ssb/db2/indexes/value_author.32prefix

to test only jitdb :)

And ./perf-testing/ssb/log.bipf is just a copy from ~/.ssb

github-actions[bot] commented 2 years ago

Benchmark results

Part	Speed	Heap Change	Samples
Count 1 big index (3rd run)	0.34ms ± 0.03ms	21.79 kB ± 17.75 kB	43
Create an index twice concurrently	967.29ms ± 10.73ms	-55.81 kB ± 58.37 kB	56
Load core indexes	1.44ms ± 0.02ms	94.29 B ± 254.99 B	6645
Load two indexes concurrently	582.47ms ± 28.99ms	73.17 kB ± 688.58 kB	15
Paginate 10 results	25.94ms ± 1.1ms	-8.74 kB ± 15.72 kB	23
Paginate 20000 msgs with pageSize=5	7278.83ms ± 111.2ms	-2.38 MB ± 3.17 MB	5
Paginate 20000 msgs with pageSize=500	722.68ms ± 5.93ms	-50.7 kB ± 451.56 kB	17
Query 1 big index (1st run)	1118.65ms ± 11.94ms	-11.06 kB ± 86.8 kB	48
Query 1 big index (2nd run)	280.89ms ± 2.12ms	-1.05 kB ± 98.96 kB	41
Query 3 indexes (1st run)	1078.47ms ± 10.05ms	-49.08 kB ± 147.74 kB	50
Query 3 indexes (2nd run)	279.6ms ± 6.79ms	-33.63 kB ± 814.82 kB	33
Query a prefix map (1st run)	572.56ms ± 8.9ms	-17.4 kB ± 742.15 kB	16
Query a prefix map (2nd run)	14.46ms ± 0.82ms	27.63 kB ± 28.13 kB	19

staltz commented 2 years ago

Parking this as a draft for now, because this PR should only be merged if it actually makes performance better.

ssbc / jitdb

use bipf 1.6.0 seekKeyCached #209
