Closed: jmr closed this 4 years ago.
The following table shows build speed [1,000 keys/second].
Run | s-yata:master | jmr:build-index
---:|--------------:|---------------:
  1 |      1,054.64 |        1,087.85
  2 |        915.17 |          937.07
  3 |        901.00 |          920.59
  4 |        896.08 |          914.57
  5 |        894.30 |          912.47
jmr:build-index is 2-3% faster than s-yata:master.
Did you configure with `--enable-native-code`? `popcnt` and `select_bit` are going to be important.
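For reference, a typical configure invocation with that flag might look like the following. This is a sketch assuming an autotools checkout of the repository; the exact steps for your tree may differ.

```shell
# Enable CPU-native code paths (popcnt, bmi2) when building marisa-trie.
./configure --enable-native-code
make
```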
My benchmark was just on `BitVector::build_index`. I don't know what fraction of `marisa-benchmark` is spent in `build_index`, so I can't say whether more than 2-3% is expected.
I will have time to run/profile the benchmarks myself later in the week.
The table shows the speed of dictionary construction, and `BitVector::build_index` is not a major part of it. However, I think the improvement is enough to accept this pull request. It looks good to me. Thank you!
Process the `BitVector` unit-by-unit instead of bit-by-bit. Use `PopCount::count()` to update `num_1s`, and use `select_bit` to find the bit positions for the `select0s_` and `select1s_` indexes.

According to my benchmarks, the old bit-by-bit version processed a 256-kbit vector at about 20 MB/s, independent of `enables_select0` and `enables_select1`. The new version is 50x-150x faster, depending on the compiler and `build_index` options.
`enables_select0 = enables_select1 = false`:
- no popcnt, no bmi2: 1600 MB/s
- popcnt, no bmi2: 2500 MB/s
- popcnt and bmi2: 2900 MB/s

`enables_select0 = enables_select1 = true`:
- no popcnt, no bmi2: 1100 MB/s
- popcnt, no bmi2: 1600 MB/s
- popcnt and bmi2: 1800 MB/s