Closed gingerwizard closed 2 years ago
This also uses a SIMD-optimised JSON parser, getting our mean performance close to 1s.
Max: 7.019331
Min: 0.010152
Median: 1.0523820000000002
Mean: 1.3045525846613546
Harmonic Mean: 0.3188781817195705
95% Percentile: 3.500890149999998
Max: 6.804227
Min: 0.009796
Median: 1.0929375000000001
Mean: 1.354577529880478
Harmonic Mean: 0.324104079055473
95% Percentile: 3.7586722999999975
99% Percentile: 4.9927768000000015
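For anyone reading the numbers above, here is a rough sketch of how summary figures like these could be computed from a list of query latencies in seconds. The `summarise` helper and the linear-interpolated percentile method are assumptions for illustration, not necessarily what the benchmark script does.

```python
import statistics


def summarise(latencies):
    """Summarise a list of latencies (needs at least two values)."""
    data = sorted(latencies)
    # 99 cut points; "inclusive" interpolates linearly between data points.
    cuts = statistics.quantiles(data, n=100, method="inclusive")
    return {
        "Max": data[-1],
        "Min": data[0],
        "Median": statistics.median(data),
        "Mean": statistics.fmean(data),
        # The harmonic mean is pulled towards the fastest queries,
        # so it sits well below the arithmetic mean on skewed data.
        "Harmonic Mean": statistics.harmonic_mean(data),
        "95% Percentile": cuts[94],
        "99% Percentile": cuts[98],
    }
```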
Using skip pointers. Can't use the SIMD parser as efficiently as I'd like, as it's not thread safe.
Moving to a pure text encoding of positions and pointers, parsed manually instead of as JSON, gives minor improvements:
Max: 7.672702
Min: 0.00736
Median: 1.0564274999999999
Mean: 1.3496451603585657
I think this is worth staying with @lollobaldo, since it's a little faster at indexing and doesn't rely on libs (simdjson) that are platform dependent.
I don't think these skip lists help enormously on AND queries: any early loop termination is offset by the increased cost of the read. They do lower the 95th and 99th percentiles though, and improve the worst case, so merging.
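To make the idea of a pure text encoding concrete, here is a minimal sketch. The layout is entirely hypothetical (it is not the format used in this PR): skip pointers are written as `doc_id@offset` before a `|`, followed by `doc_id:pos,pos` entries separated by `;`, and everything is parsed with plain string splits rather than a JSON library.

```python
def encode(postings, skip_every=2):
    """Encode {doc_id: [positions]} as text with skip pointers.

    Hypothetical layout: "<doc_id>@<offset> ...|<doc_id>:<pos>,<pos>;..."
    where <offset> is the character offset of that entry in the body.
    """
    entries, skips, offset = [], [], 0
    for n, (doc_id, positions) in enumerate(sorted(postings.items())):
        entry = f"{doc_id}:{','.join(map(str, positions))}"
        if n % skip_every == 0:
            skips.append(f"{doc_id}@{offset}")
        entries.append(entry)
        offset += len(entry) + 1  # +1 for the ';' separator
    return " ".join(skips) + "|" + ";".join(entries)


def decode(text):
    """Parse the text back into (skips, postings) without any JSON parser."""
    skip_part, body = text.split("|", 1)
    skips = []
    for token in skip_part.split():
        doc_id, offset = token.split("@")
        skips.append((int(doc_id), int(offset)))
    postings = {}
    for entry in (body.split(";") if body else []):
        doc_id, positions = entry.split(":")
        postings[int(doc_id)] = [int(p) for p in positions.split(",")] if positions else []
    return skips, postings
```

A reader can jump straight to `body[offset:]` for a skip target and only parse the entries it actually needs, which is where a hand-rolled format can beat a general-purpose parser.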
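For context, this is roughly what a skip-pointer intersection for an AND query looks like. It is a textbook-style sketch over in-memory sorted doc id lists, assuming skips spaced about sqrt(n) apart; the function names and the spacing are illustrative assumptions, not the code in this PR.

```python
import math


def build_skips(postings):
    """Return {index: target_index} skip pointers spaced ~sqrt(n) apart."""
    n = len(postings)
    step = int(math.sqrt(n))
    if step < 2:
        return {}
    return {i: i + step for i in range(0, n - step, step)}


def intersect_with_skips(a, b):
    """Intersect two sorted doc id lists, following skips where safe."""
    skips_a, skips_b = build_skips(a), build_skips(b)
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Only take the skip if it does not overshoot b[j].
            if i in skips_a and a[skips_a[i]] <= b[j]:
                i = skips_a[i]
            else:
                i += 1
        else:
            if j in skips_b and b[skips_b[j]] <= a[i]:
                j = skips_b[j]
            else:
                j += 1
    return result
```

The skip only pays off when one list is much sparser than the other. On lists of similar density most skip checks fail, which fits the observation that the gain shows up mainly in the tail percentiles.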
Index side of skip pointers. Only the index side is done.
Also disabled the current BERT model, as we aren't using it (yet) and it slows down startup.