ssbc / jitdb

A database on top of a log with automatic index generation and maintenance
50 stars 7 forks source link

use bipf 1.6.0 seekKeyCached #209

Closed staltz closed 2 years ago

staltz commented 2 years ago

Context: https://github.com/ssb-ngi-pointer/jitdb/pull/208

Uses BIPF 1.6.0's seekKeyCached API instead of passing pValue around.

@arj03 How did you run those benchmarks in your PR? Could you try running them with this code? I know you did a good job in #208, but I think seekKeyCached is going to help us with pValueContent and potentially others.

arj03 commented 2 years ago

Yeah this is great. Can you create a db2 branch that builds on top of this? That makes it easier for me to test.

staltz commented 2 years ago

Working on it!

staltz commented 2 years ago

@arj03 One problem is that we have to have global and unique buffers for B_VALUE and others, otherwise we can end up having two B_VALUEs which are in different memory addresses, and the WeakMap idea won't work on them.

E.g.

const a = Buffer.from('value')
const b = Buffer.from('value')

a === b // false

const weakMap = new WeakMap()
weakMap.set(a, 3)
weakMap.has(b) // false
arj03 commented 2 years ago

Ugh, would it be possible to pass in value as a string. I think they match the same address or am I confusing javascript with .net ;-)?

staltz commented 2 years ago

Yeah, that's what I'm thinking about. bipf.seekKey(buffer, start, target) supports both string target and buffer target, which means we need to support both also for bipf.seekKeyCached. And thus I could use strings to do the WeakMap look ups. Which means that if you pass a buffer target in bipf.seekKeyCached(buffer,start,target) you would lose some performance benefits, but at least it would remain functioning correctly.

arj03 commented 2 years ago

My benchmark script, it's a bit hacky but gets the job done :)

const SecretStack = require('secret-stack')
const caps = require('ssb-caps')
const path = require('path')
const ssbKeys = require('ssb-keys')
const { where, and, type, isRoot, hasRoot, author, paginate, descending,
        toPullStream, toCallback } = require('../operators')
const bipf = require('bipf')
const pull = require('pull-stream')

const dir = './perf-testing/ssb'
const keys = ssbKeys.loadOrCreateSync(path.join('/home/arj/.ssb', 'secret'))

const ssb = SecretStack({ appKey: caps.shs })
  .use(require('../'))
  .use(require('../compat/ebt')) // ebt db helpers
  .use(require('../full-mentions')) // include index
  .call(null, {
    keys,
    path: dir,
    db2: {
      startDecryptBox1: "2022-03-25",
    }
  })

console.log('Doing the query')
console.time('query')

ssb.db.getJITDB().all(
  author('@QlCTpvY7p9ty2yOFrv1WU1AE88aoQc4Y7wYal7PFc+w=.ed25519'),
  0, false, false, null, (err, results) => {
    console.timeEnd('query')
    console.log(results.length)
  }
)

and then I delete all indexes to test both level + jitdb and only:

rm perf-testing/ssb/db2/indexes/seq* perf-testing/ssb/db2/indexes/timestamp.index perf-testing/ssb/db2/indexes/value_author.32prefix

to test only jitdb :)

And ./perf-testing/ssb/log.bipf is just a copy from ~/.ssb

github-actions[bot] commented 2 years ago

Benchmark results

Part Speed Heap Change Samples
Count 1 big index (3rd run) 0.34ms ± 0.03ms 21.79 kB ± 17.75 kB 43
Create an index twice concurrently 967.29ms ± 10.73ms -55.81 kB ± 58.37 kB 56
Load core indexes 1.44ms ± 0.02ms 94.29 B ± 254.99 B 6645
Load two indexes concurrently 582.47ms ± 28.99ms 73.17 kB ± 688.58 kB 15
Paginate 10 results 25.94ms ± 1.1ms -8.74 kB ± 15.72 kB 23
Paginate 20000 msgs with pageSize=5 7278.83ms ± 111.2ms -2.38 MB ± 3.17 MB 5
Paginate 20000 msgs with pageSize=500 722.68ms ± 5.93ms -50.7 kB ± 451.56 kB 17
Query 1 big index (1st run) 1118.65ms ± 11.94ms -11.06 kB ± 86.8 kB 48
Query 1 big index (2nd run) 280.89ms ± 2.12ms -1.05 kB ± 98.96 kB 41
Query 3 indexes (1st run) 1078.47ms ± 10.05ms -49.08 kB ± 147.74 kB 50
Query 3 indexes (2nd run) 279.6ms ± 6.79ms -33.63 kB ± 814.82 kB 33
Query a prefix map (1st run) 572.56ms ± 8.9ms -17.4 kB ± 742.15 kB 16
Query a prefix map (2nd run) 14.46ms ± 0.82ms 27.63 kB ± 28.13 kB 19
staltz commented 2 years ago

Parking this as draft for now because this PR can only be merged if it necessarily makes performance better.