hashicorp / go-set

The go-set package provides generic Set implementations for Go, including HashSet for types with a Hash() function and TreeSet for orderable data
Mozilla Public License 2.0

v2: improvements to the API including breaking changes #73

Closed shoenig closed 11 months ago

shoenig commented 1 year ago

Over the past week during Gophercon 2023 I hacked away on a v2 branch of go-set. Having used the library for about a year on a large project like Nomad, I think we have learned enough to make a new version that is worthy of a major version bump. In particular, the type parameter signatures of HashSet and TreeSet have changed to be less cumbersome and more flexible. It also swaps Common[T] for Collection[T], where the Collection interface requires the implementation of all non-type-specific methods.

Collection[T]

Originally the Common[T] interface was intended only to be used internally as a minimal interface for DRY-ing up shared serialization implementations. In retrospect it makes sense to expand this interface as much as possible so that we can share even more common implementation code. In doing so we realized methods like InsertSet, ContainsSet, RemoveSet could operate on a Collection[T] rather than being tied specifically to *Set, *HashSet, or *TreeSet respectively. Which is nice.

Set

Struct type parameter signature stays the same.

The changes are to InsertSet and RemoveSet, which now accept a Collection[T], so they work with HashSet and TreeSet (and anything else implementing the Collection interface).
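To make the idea concrete, here is a minimal sketch of a set whose InsertSet takes an interface rather than a concrete set type. Everything in it is hypothetical: the lowercase `collection` interface is a trimmed-down stand-in (the real Collection[T] has more methods), and `mapSet` and `listSet` are illustrative types, not part of go-set.

```go
package main

import "fmt"

// collection is a hypothetical, trimmed-down stand-in for Collection[T].
type collection[T any] interface {
	Contains(item T) bool
	Slice() []T
	Size() int
}

// mapSet is an illustrative map-backed set for comparable types.
type mapSet[T comparable] struct {
	items map[T]struct{}
}

func newMapSet[T comparable](items ...T) *mapSet[T] {
	s := &mapSet[T]{items: make(map[T]struct{})}
	for _, item := range items {
		s.items[item] = struct{}{}
	}
	return s
}

func (s *mapSet[T]) Contains(item T) bool {
	_, ok := s.items[item]
	return ok
}

func (s *mapSet[T]) Slice() []T {
	out := make([]T, 0, len(s.items))
	for item := range s.items {
		out = append(out, item)
	}
	return out
}

func (s *mapSet[T]) Size() int { return len(s.items) }

// InsertSet accepts any collection, not only another *mapSet.
// It reports whether s was modified.
func (s *mapSet[T]) InsertSet(col collection[T]) bool {
	modified := false
	for _, item := range col.Slice() {
		if _, exists := s.items[item]; !exists {
			s.items[item] = struct{}{}
			modified = true
		}
	}
	return modified
}

// listSet is a second illustrative implementation, backed by a slice.
type listSet[T comparable] struct {
	items []T
}

func (l *listSet[T]) Contains(item T) bool {
	for _, existing := range l.items {
		if existing == item {
			return true
		}
	}
	return false
}

func (l *listSet[T]) Slice() []T { return append([]T(nil), l.items...) }

func (l *listSet[T]) Size() int { return len(l.items) }

func main() {
	a := newMapSet(1, 2)
	b := &listSet[int]{items: []int{2, 3}}
	fmt.Println(a.InsertSet(b), a.Size()) // true 3
}
```

Because InsertSet depends only on the interface, the same method body serves every implementation, which is the code sharing described above.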

HashSet

Struct type parameter changes to enable support for storing types that do not define a .Hash method.

- type HashSet[T HashFunc[H], H Hash] struct {
+ type HashSet[T any, H Hash] struct {

We introduce

type HashFunc[T any, H Hash] func(T) H

and constructors for using a HashFunc for computing a hash on T, enabling callers to supply their own hash functions.
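As a rough sketch of how a caller-supplied hash function could drive such a set: the HashFunc and HashSet type parameter signatures below are the ones quoted above, but the `Hash` constraint, the struct fields, the `newHashSetFunc` constructor name, and the `employee` type are all assumptions made for illustration, not the library's definitions.

```go
package main

import "fmt"

// Hash is an assumed stand-in for the library's hash-key constraint.
type Hash interface {
	~string | ~int | ~uint64
}

// HashFunc has the signature quoted above.
type HashFunc[T any, H Hash] func(T) H

// HashSet stores items keyed by the value its hash function returns.
// The fields are an assumption for this sketch.
type HashSet[T any, H Hash] struct {
	fn    HashFunc[T, H]
	items map[H]T
}

// newHashSetFunc is a hypothetical constructor name.
func newHashSetFunc[T any, H Hash](fn HashFunc[T, H]) *HashSet[T, H] {
	return &HashSet[T, H]{fn: fn, items: make(map[H]T)}
}

// Insert adds item and reports whether the set was modified.
func (s *HashSet[T, H]) Insert(item T) bool {
	key := s.fn(item)
	if _, exists := s.items[key]; exists {
		return false
	}
	s.items[key] = item
	return true
}

// Contains reports whether an item with the same hash is present.
func (s *HashSet[T, H]) Contains(item T) bool {
	_, ok := s.items[s.fn(item)]
	return ok
}

func (s *HashSet[T, H]) Size() int { return len(s.items) }

// employee deliberately has no Hash method.
type employee struct {
	name string
	id   int
}

func main() {
	byID := func(e *employee) int { return e.id }
	s := newHashSetFunc[*employee, int](byID)

	first := s.Insert(&employee{name: "ada", id: 1})
	second := s.Insert(&employee{name: "ada (copy)", id: 1}) // same id, so same hash
	fmt.Println(first, second, s.Size())                     // true false 1
}
```

The element type no longer needs to carry a method, so types from other packages can be stored by supplying a function at construction time.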

TreeSet

Struct type parameter changes to no longer require the unnecessary Compare[T] constraint.

- type TreeSet[T any, C Compare[T]] struct {
+ type TreeSet[T any] struct {

Having the C as part of the type parameter for TreeSet did not actually do anything, and was annoying to write out every time you created a set. We can just remove the type parameter.

e.g.

- ts := NewTreeSet[*token, Compare[*token]](compareTokens)
+ ts := NewTreeSet[*token](compareTokens)
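The sketch below illustrates why the extra type parameter is not needed: the comparison function can simply be stored as a field. This is not the library's TreeSet; `orderedSet`, `compare`, and `newOrderedSet` are hypothetical names, the definitions of `token` and `compareTokens` are made up to match the example above, and a sorted slice stands in for a tree to keep the code short.

```go
package main

import (
	"fmt"
	"sort"
)

// compare returns a negative, zero, or positive value for a < b, a == b, a > b.
type compare[T any] func(a, b T) int

// orderedSet keeps unique items sorted according to cmp.
// Only T is a type parameter; the comparator lives in a field.
type orderedSet[T any] struct {
	cmp   compare[T]
	items []T
}

func newOrderedSet[T any](cmp compare[T]) *orderedSet[T] {
	return &orderedSet[T]{cmp: cmp}
}

// Insert adds item in sorted position and reports whether the set was modified.
func (s *orderedSet[T]) Insert(item T) bool {
	// Binary search for the first element that is not less than item.
	i := sort.Search(len(s.items), func(i int) bool {
		return s.cmp(s.items[i], item) >= 0
	})
	if i < len(s.items) && s.cmp(s.items[i], item) == 0 {
		return false // already present
	}
	var zero T
	s.items = append(s.items, zero)
	copy(s.items[i+1:], s.items[i:]) // shift the tail right by one
	s.items[i] = item
	return true
}

// Slice returns a copy of the items in ascending order.
func (s *orderedSet[T]) Slice() []T { return append([]T(nil), s.items...) }

type token struct{ id int }

func compareTokens(a, b *token) int {
	switch {
	case a.id < b.id:
		return -1
	case a.id > b.id:
		return 1
	default:
		return 0
	}
}

func main() {
	ts := newOrderedSet[*token](compareTokens)
	for _, id := range []int{3, 1, 2, 1} {
		ts.Insert(&token{id: id})
	}
	ids := make([]int, 0, len(ts.items))
	for _, t := range ts.Slice() {
		ids = append(ids, t.id)
	}
	fmt.Println(ids) // [1 2 3]
}
```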

Iterators

This v2 does not address the possible upcoming changes to the Go language which will introduce iterators over custom types. It's unclear when/if these proposals will be accepted, and we can always draft a v3 of this library if necessary.

https://github.com/golang/go/discussions/54245 https://github.com/golang/go/discussions/56413

Rollout

I'll create a v2.0.0-alpha.1 tag that we can use for a while to make sure the API changes are sufficient and "feel" right. Note that there may be breaking changes in between the alpha and a final v2.0.0 release. To use v2 of this library, fetch the module with

go get github.com/hashicorp/go-set/v2@latest

and update the import statement to

import "github.com/hashicorp/go-set/v2"

shoenig commented 1 year ago

More changes: the definitions of ContainsSlice and EqualSlice (and the new EqualSliceSet)

Previously these methods were not well named for their behavior. Since v2 is making breaking changes we should just fix these now.

EDIT: published as v2.0.0-alpha.2

in v1

ContainsSlice - detect if a set is equal to a slice, but the slice might contain duplicates

EqualSlice - detect if a set is equal to a slice, and that slice must not contain duplicates

in v2

ContainsSlice - detect if a slice is a subset of a set (and the slice may contain duplicates)

EqualSlice - detect if a set is equal to a slice, and that slice may contain duplicates

EqualSliceSet - detect if a set is equal to a slice, and that slice must not contain duplicates
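The three v2 behaviors can be sketched on a plain map-backed set. This follows the descriptions above only; the `set` type, the `from` helper, and the lowercase method names are hypothetical and are not the library's implementation.

```go
package main

import "fmt"

// set is an illustrative map-backed set for comparable types.
type set[T comparable] map[T]struct{}

func from[T comparable](items ...T) set[T] {
	s := make(set[T], len(items))
	for _, item := range items {
		s[item] = struct{}{}
	}
	return s
}

// containsSlice reports whether every element of items is in s.
// items may contain duplicates.
func (s set[T]) containsSlice(items []T) bool {
	for _, item := range items {
		if _, ok := s[item]; !ok {
			return false
		}
	}
	return true
}

// equalSlice reports whether s and items hold the same distinct elements.
// Duplicates in items are ignored.
func (s set[T]) equalSlice(items []T) bool {
	seen := make(map[T]struct{}, len(items))
	for _, item := range items {
		if _, ok := s[item]; !ok {
			return false
		}
		seen[item] = struct{}{}
	}
	return len(seen) == len(s)
}

// equalSliceSet is like equalSlice but also requires that items
// contains no duplicates.
func (s set[T]) equalSliceSet(items []T) bool {
	if len(items) != len(s) {
		return false
	}
	return s.equalSlice(items)
}

func main() {
	s := from(1, 2, 3)
	fmt.Println(s.containsSlice([]int{1, 1, 2}))    // true: subset, duplicates allowed
	fmt.Println(s.equalSlice([]int{1, 2, 2, 3}))    // true: same distinct elements
	fmt.Println(s.equalSliceSet([]int{1, 2, 2, 3})) // false: slice has a duplicate
	fmt.Println(s.equalSliceSet([]int{3, 2, 1}))    // true: same elements, no duplicates
}
```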