deckarep / golang-set

A simple, battle-tested and generic set type for the Go language. Trusted by Docker, 1Password, Ethereum and Hashicorp.

Allow capacity to be specified when initializing a Set #138

Closed adambaratz closed 2 months ago

adambaratz commented 2 months ago

If you know you're going to be working with a very large set, passing that hint up front should help save allocs from having to resize the underlying map:

package main

import (
    "math/rand"
    "testing"

    mapset "github.com/deckarep/golang-set/v2"
    "github.com/samber/lo"
)

func BenchmarkMapset(b *testing.B) {
    // 207882710 ns/op | 55542169 B/op | 25727 allocs/op
    benchmark(b, func(s []string) {
        set := mapset.NewThreadUnsafeSet[string]()
        for _, e := range s {
            set.Add(e)
        }
        set.Each(func(_ string) bool { return false })
    })
}

func BenchmarkMap(b *testing.B) {
    // 138601953 ns/op | 40109258 B/op | 3 allocs/op
    benchmark(b, func(s []string) {
        m := make(map[string]struct{}, len(s))
        for _, e := range s {
            m[e] = struct{}{}
        }
        for e := range m {
            _ = e
        }
    })
}

func benchmark(b *testing.B, fn func(s []string)) {
    s := lo.Map(lo.Range(1000000), func(_, _ int) string {
        length := rand.Intn(12)
        if length == 0 {
            return ""
        }
        return lo.RandomString(length, lo.LettersCharset)
    })

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        fn(s)
    }
}

I don't have a strong need for this, or thoughts on what the API should look like. I just came across this when benchmarking something and wanted to log the opportunity in case it resonated with anyone.

deckarep commented 2 months ago

Public helper functions already exist for this. They were implemented a while back; see the set.go file.
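
For readers following along, here is a minimal sketch of what a capacity-hinted constructor for a map-backed set can look like. The `Set` type and `NewWithSize` below are hypothetical stand-ins written for illustration, not the golang-set implementation. To my knowledge the v2 helpers are named `NewSetWithSize` and `NewThreadUnsafeSetWithSize`, but confirm the exact names and signatures in set.go.

```go
package main

import "fmt"

// Set is a hypothetical, minimal map-backed set used only for illustration.
// It is not the golang-set type.
type Set[T comparable] map[T]struct{}

// NewWithSize returns an empty set whose backing map is created with a
// capacity hint of n, so adding up to about n elements should not force
// the map to grow.
func NewWithSize[T comparable](n int) Set[T] {
	return make(Set[T], n)
}

// Add inserts v and reports whether it was newly added.
func (s Set[T]) Add(v T) bool {
	if _, ok := s[v]; ok {
		return false
	}
	s[v] = struct{}{}
	return true
}

func main() {
	words := []string{"a", "b", "a", "c"}

	// The input length is an upper bound on the set size, so it serves
	// as a reasonable capacity hint.
	set := NewWithSize[string](len(words))
	for _, w := range words {
		set.Add(w)
	}
	fmt.Println(len(set)) // prints 3
}
```

The hint only affects the initial allocation; the set still grows past it if more elements are added. In the benchmark above, swapping the constructor for a sized one with `len(s)` as the hint would be the equivalent change.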

adambaratz commented 2 months ago

Cool, missed that, thanks!