Open earthboundkid opened 2 months ago
The name gives me more maxsize than minsize vibes.
Sorting map keys more efficiently would look like keys := slices.CollectN(maps.Keys(m), len(m); slices.Sort(keys).
I guess we'd rather change the implementation of slices.Sorted
to use CollectN
internally? But maybe not.
That said, the function signature is not particularly nice. I guess (sorry, no evidence) that the majority of use cases is around the functions of maps.KeysSlice()
and maps.ValuesSlice()
. If that's the case, then remaining cases could be covered as above by:
slices.Append(make([]K,0, len(m)), seq)
Are there other wide-spread cases that need to be covered?
The name gives me more maxsize than minsize vibes.
ChatGPT suggests the name slices.CollectWithCapacity
. Seems little long. CollectWithCap?
I guess we'd rather change the implementation of
slices.Sorted
to useCollectN
internally? But maybe not.
I don't understand. Are you suggesting adding slices.SortedWithCap
? It wouldn't be possible to use CollectWithCap internally since Sorted couldn't know the minimum length.
Sorted is just Collect+Sort, so I don't see the point of making a bunch of variations on it. If you need better Collecting, then use CollectWithCap, if not, use plain Sorted.
Are there other wide-spread cases that need to be covered?
I think this would be a good general purpose tool. You could use it with a hypothetical container/set type, for example, or if we ever genericize container/list and container/heap. I have a generic deque type I would have used this with if it had existed.
I have a generic deque type I would have used this with if it had existed.
I still assume (sorry), that we'll have 90% maps keys/values.
I am rehashing arguments that have already been made, but this whole API which is more mental load on the programmer and more places to make mistake wouldn't need to exists if the iter
form was an interface instead of a function value then we could extend it conditionally:
package iter
type Seq[V any] interface {
Iter(yield func(V) bool)
}
type Seq2[K, V any] interface {
Iter(yield func(K, V) bool)
}
// Len can be implemented by [Seq] and [Seq2] as optional features.
type Len interface {
// Len returns how many calls to yield Iter will do if ok == true.
// To be a correct implementation it MUST do n many calls, no more no less.
// When ok == false the length is not known.
Len() (n int, ok bool)
}
// Cap can be implemented by [Seq] and [Seq2] as optional features.
type Cap interface {
// Cap returns how many calls to yield Iter could do if ok == true.
// To be a correct implementation it MUST do at most n many calls, no more.
// When ok == false the capacity is not known.
Cap() (n int, ok bool)
}
package slices
import "iter"
func Collect[V any](i iter.Seq[V]) []V {
var sizeHint int
var hasSizeHint bool
if l, ok := i.(iter.Len); ok {
sizeHint, hasSizeHint = l.Len()
}
if !hasSizeHint {
if l, ok := i.(iter.Cap); ok {
sizeHint, hasSizeHint = l.Cap()
}
}
if !hasSizeHint {
sizeHint = 0
}
r := make([]V, 0, sizeHint)
for v := range i {
r = append(r, v) // if you actually submitted this code to std you probably would want hardening against buggy iterators that return wrong .Len and .Cap.
}
return r
}
Having it be optionally implemented would make this so much better, and more generally anything where you sometime can do some extra optimizations if you know the size ahead of time.
if the
iter
form was an interface instead of a function value then
But it isn't. Let's focus on building on top of what is now.
@fzipp @earthboundkid I would suggest the name "hint" or "sizehint" for the parameter, as it takes the same role as the size hint parameter for make with maps: It has no impact on the semantics or correctness of the program, but makes it more efficient if given correctly.
The name CollectN
seems pragmatic to me.
Like @fzipp above, to me the name CollectN
suggests "collect up to N elements into a slice."
And as others have observed we can already do this using slices.AppendSeq
.
I think this function might be premature at this point.
If an iterator must be used, consider adding a slices.CollectInto
function for preallocation need. Or don't use iterator at all.
@zigo101 That sounds like the existing slices.AppendSeq
.
I think it would be okay to put this proposal on hold for six months while we get used to using iterators in production.
@earthboundkid You can use nstd.CollectMapKeys now. :D
That's sort of what I mean. We can just use our personal iteration libraries for six months or so then regroup and see if we mostly want CollectWithCap (because we need to collect a lot of other containers into slices) or mostly want maps.KeySlice or maps.KeysSorted etc. based on real world usage. It's a little theoretical for now, although I have started doing some stuff with x/net/html iterators in production.
It might be a bad idea to put "slices.Collect" in std.
Imho it is very unfortunate that something as simple and often used as exp/maps/Keys()
should become something as unobvious as
keys := slices.CollectN(maps.Keys(m), len(m))
Majority of use cases is probably better addressed adding dedicated functions for maps.Keys and slices.Values. Suggesting to revisit #61626
If the main concern is about efficiently iterating over sorted keys of a map (and possibly its associated values), wouldn't it be addressed by the addition of the following functions (perhaps with better names) to the maps
package, functions that would use the knowledge of the number of keys in the map?
func Sorted[M ~map[K]V, K cmp.Ordered, V any](m M) iter.Seq2[K, V]
func SortedFunc[M ~map[K]V, K comparable, V any](m M, cmp func(K, K) int) iter.Seq2[K, V]
func SortedStableFunc[M ~map[K]V, K comparable, V any](m M, cmp func(K, K) int) iter.Seq2[K, V]
The need for slices.CollectN
would become moot, IMO.
Edit: I see that @earthboundkid had already suggested something similar (the addition of a SortedKeys
function) in https://github.com/golang/go/issues/61900#issuecomment-1917362371
Proposal Details
As discussed in #61900, it's a shame that
slices.Collect(maps.Keys(m))
andslices.Sorted(maps.Keys(m))
don't preallocate the size of m. You can work around it by writingslices.Sorted(slices.Append(make([]T, 0, len(m)), maps.Keys(m)))
, but that's kind of ugly.I propose adding
func CollectN[S ~[]T, T any](seq iter.Seq[T], minsize int) S
to slices that just returnsAppend(make([]T, 0, minsize), seq)
. It would help with map keys but also other container types where you know the length in advance or can at least make a good guess.Sorting map keys more efficiently would look like
keys := slices.CollectN(maps.Keys(m), len(m); slices.Sort(keys)
.