google / btree

BTree provides a simple, ordered, in-memory data structure for Go programs.
Apache License 2.0
3.96k stars 420 forks source link

Generic approach #41

Closed Michal-Leszczynski closed 2 years ago

Michal-Leszczynski commented 2 years ago

With support for generics, we can replace interfaces with generic types. The use of generics avoids variables escaping to heap when calling the B-tree functions yielding 20-30% improvement.

Below are some comparisons between benchmarks (taken from original repository) done with benchstat:

name                                 old time/op    new time/op    delta
Insert-8                                121ns ± 1%      89ns ± 1%   -27.04%  (p=0.008 n=5+5)
Seek-8                                  115ns ± 1%      78ns ± 0%   -31.56%  (p=0.008 n=5+5)
DeleteInsert-8                          248ns ± 0%     185ns ± 1%   -25.51%  (p=0.008 n=5+5)
DeleteInsertCloneEachTime-8            1.10µs ± 5%    0.61µs ± 1%   -44.45%  (p=0.008 n=5+5)
Delete-8                                138ns ± 1%     101ns ± 1%   -26.62%  (p=0.008 n=5+5)
Get-8                                   102ns ± 1%      71ns ± 0%   -30.46%  (p=0.008 n=5+5)
Ascend-8                               40.2µs ± 1%    31.7µs ± 0%   -21.18%  (p=0.008 n=5+5)
Descend-8                              39.3µs ± 1%    30.7µs ± 1%   -21.91%  (p=0.008 n=5+5)
AscendRange-8                          72.3µs ± 1%    57.6µs ± 1%   -20.39%  (p=0.008 n=5+5)
DescendRange-8                         92.9µs ± 1%    77.6µs ± 1%   -16.45%  (p=0.008 n=5+5)

name                                 old alloc/op   new alloc/op   delta
Insert-8                                35.6B ± 4%     18.4B ± 3%   -48.31%  (p=0.008 n=5+5)
Seek-8                                  7.00B ± 0%     0.00B       -100.00%  (p=0.008 n=5+5)
DeleteInsert-8                          0.00B          0.00B           ~     (all equal)
DeleteInsertCloneEachTime-8            2.98kB ± 5%    1.91kB ± 1%   -36.16%  (p=0.008 n=5+5)
Delete-8                                0.00B          0.00B           ~     (all equal)
Get-8                                   0.00B          0.00B           ~     (all equal)
Ascend-8                                0.00B          0.00B           ~     (all equal)
Descend-8                               0.00B          0.00B           ~     (all equal)
AscendRange-8                           0.00B          0.00B           ~     (all equal)
DescendRange-8                          0.00B          0.00B           ~     (all equal)

name                                 old allocs/op  new allocs/op  delta
Insert-8                                 0.00           0.00           ~     (all equal)
Seek-8                                   0.00           0.00           ~     (all equal)
DeleteInsert-8                           0.00           0.00           ~     (all equal)
DeleteInsertCloneEachTime-8              11.0 ± 0%      11.0 ± 0%      ~     (all equal)
Delete-8                                 0.00           0.00           ~     (all equal)
Get-8                                    0.00           0.00           ~     (all equal)
Ascend-8                                 0.00           0.00           ~     (all equal)
Descend-8                                0.00           0.00           ~     (all equal)
AscendRange-8                            0.00           0.00           ~     (all equal)
DescendRange-8                           0.00           0.00           ~     (all equal)

In case of plain Insert Benchmark, the results are even better:

name      old time/op    new time/op    delta
Insert-8     200ns ± 2%     137ns ± 1%   -31.20%  (p=0.008 n=5+5)

name      old alloc/op   new alloc/op   delta
Insert-8     60.0B ± 0%     27.0B ± 0%   -55.00%  (p=0.008 n=5+5)

name      old allocs/op  new allocs/op  delta
Insert-8      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
func BenchmarkInsert(b *testing.B) {
    b.StopTimer()
    tr := btree.New(32)
    b.StartTimer()
    for i := 0; i < b.N; i++ {
        tr.ReplaceOrInsert(btree.Int(i))
    }
}

Unfortunately the functions that return nil do not work ex. func (t *BTree) Delete(item Item) Item. We can't simply return zero value for generic type, because it's still a valid value, so I decided to change the API, so that:

func (t *BTree[T]) Delete(item T) (T, bool) { ... }

we also return bool which indicates whether anything was really deleted. Of course, we can implement B-tree on pointers to generic types, but it harms performance in many cases (like having B-tree for numeric types).

Changes of API destroy backward compatibility, but because of reasons mentioned above, I believe that this approach is justified.

Here is link to forked repository with generic B-tree: https://github.com/Michal-Leszczynski/btree

How do you suggest we further proceed with this effort.

SaveTheRbtz commented 2 years ago

Came here with exactly the same proposal (and similar benchmarks). Maybe it is time to create a v2 release?

mmatczuk commented 2 years ago

Perf analysis https://www.scylladb.com/2022/04/27/shaving-40-off-googles-b-tree-implementation-with-go-generics/

mmatczuk commented 2 years ago

@Michal-Leszczynski please send a PR. Merging this would require go mod v2. In the context of performance you can consider sending a followup PR that replaces slice with array in node items.

gconnell commented 2 years ago

Fixed by #45