deckarep / golang-set

A simple, battle-tested and generic set type for the Go language. Trusted by Docker, 1Password, Ethereum and Hashicorp.
Other
4k stars 272 forks source link

[Question] Contains() parameters escape to heap #118

Closed vladislav-valkov-omniverse closed 6 months ago

vladislav-valkov-omniverse commented 1 year ago

I have noticed Contains() method leads its parameters to be allocated on the heap.

package main

import mapset "github.com/deckarep/golang-set/v2"

func main() {
    a := mapset.NewSet[int]()
    a.Contains(1)
}

Building the code above with go run -gcflags="-m" main.go shows that Contains() parameter escapes to heap: ./main.go:7:12: ... argument escapes to heap

I forked the library and ran several benchmarks:

func BenchmarkContainsSlice(b *testing.B) {
    slice := make([]int, 10000)
    for i := 0; i < 10000; i++ {
        slice[i] = i
    }

    set := mapset.NewThreadUnsafeSet[int](slice...)
    for i := 0; i < b.N; i++ {
        set.Contains(slice...)
    }
}

func BenchmarkContainsVariadic(b *testing.B) {
    slice := make([]int, 10000)
    for i := 0; i < 10000; i++ {
        slice[i] = i
    }

    set := mapset.NewThreadUnsafeSet[int](slice...)
    for i := 0; i < b.N; i++ {
        for i := 0; i < len(slice); i++ {
            set.Contains(i)
        }
    }
}

And got the following results:

goos: darwin
goarch: arm64
pkg: variadic-escaping-test
BenchmarkContainsSlice
BenchmarkContainsSlice-10               6258        176586 ns/op          42 B/op          0 allocs/op
BenchmarkContainsVariadic
BenchmarkContainsVariadic-10            4209        277876 ns/op       80063 B/op      10000 allocs/op

I tried adding a new method which does not use variadic parameter:

type Set[T comparable] interface {
    ContainsOne(val T) bool
}

func BenchmarkContainsNonVariadic(b *testing.B) {
    slice := make([]int, 10000)
    for i := 0; i < 10000; i++ {
        slice[i] = i
    }

    a := mapset.NewThreadUnsafeSet[int](slice...)
    for i := 0; i < b.N; i++ {
        for i := 0; i < len(slice); i++ {
            a.ContainsOne(i)
        }
    }
}

and got the following result for the new benchmark:

BenchmarkContainsNonVariadic
BenchmarkContainsNonVariadic-10         6762        170547 ns/op          39 B/op          0 allocs/op
deckarep commented 1 year ago

@vladislav-valkov-omniverse - what you are likely noticing is that variadic functions need to allocate a slice (behind the scenes) to accomplish handling all the arguments.

Now I could be wrong but I think that’s why and I don’t yet know a way around this.

Having a ContainsOne could help for checking a single item…but first I’d like to figure out if such a method is necessary.

Also, just to be pragmatic heap allocations aren’t the end of the world if the code isn’t in some hotspot.

vladislav-valkov-omniverse commented 1 year ago

@deckarep I usually see variadic parameters being marked as non-escaped - as all slices under certain size should be. I have found only one way to make such value be allocated on heap yet - variadic argument of generic type inside an interface. When I call the same method of a struct (or just replace the generic with a concrete type), everything works fine. Perhaps I should try searching for similar issues in Go repository.

In general, I agree heap allocations are not a disaster, but my specific usecase has a lot of checks involving sets, and this discovery was quite frustrating.

deckarep commented 1 year ago

@vladislav-valkov-omniverse - I understand the concern which I suppose would be an issue where you are checking for elements in a hot part of the code with the Contains method. I do agree with you that it would be nice to not see any allocs for this.

We can move forward with a ContainsOne addition to the API if you'd like to contribute the changes.

Just to be clear however, if you are having lots of contention with heap escaped data and you need finer grain control you might want to consider using a language which gives you such fine grained control such as Rust or Zig.

In Go, I prefer to have a clean and simple API that works in most general cases versus trying to squeeze out the absolute most performant package ever which is not a goal of this project.

That said, this project has landed dozens of great and impactful optimizations where it makes sense to do so while keeping a healthy and balanced API design.

deckarep commented 6 months ago

ContainsOne has been added to Release: 2.6.0