paulhoule / pidove

MIT License

Add Function similar to Python's groupby() #1

Open paulhoule opened 2 years ago

paulhoule commented 2 years ago

Python's itertools.groupby function works like the Unix uniq command. It groups consecutive items that share a common key into a single Iterable, and returns an Iterable of Pairs of keys and Iterables. The signature would look something like

public static <X,K> Iterable<Pair<K,Iterable<X>>> uniq(Function<X,K> keyFn, Iterable<X> source) {...}

If the source is sorted on the key first, this is very similar to

collect(Collectors.groupingBy(keyFn, collector))

except that the Streams API version keeps a hash table of every group's accumulator in memory until the whole stream has been consumed, while the Python version just streams, one group at a time.
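For comparison, here is a minimal sketch of the Streams API side of that trade-off. The class name `GroupingByDemo` and the first-letter key are made up for illustration; only `Collectors.groupingBy` and `Collectors.counting` come from the JDK.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of pidove.
final class GroupingByDemo {
    private GroupingByDemo() {}

    // Counts words per first letter. The returned Map holds an entry for every
    // key seen, and nothing is available until the whole stream is consumed.
    static Map<String, Long> countByFirstLetter(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(w -> w.substring(0, 1), Collectors.counting()));
    }
}
```

Because the result is a Map keyed on the group key, the input does not have to be sorted, but memory grows with the number of distinct keys.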

paulhoule commented 2 years ago

It is easy to feed the consecutive elements into an Iterator, but returning an actual Iterable for each group is not straightforward because an Iterable has to be restartable: every call to iterator() must start a fresh pass.

Possibly a better model is

public static <X,K,Y> Iterable<Pair<K,Y>> uniq(Function<X,K> keyFn, Collector<X,?,Y> collector, Iterable<X> source) {...}
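One way the Collector-based model could be implemented is sketched below. It is only a sketch under some assumptions: the `Pair` record is a stand-in for whatever Pair type the library uses, the class name `Uniq` is invented, the Collector's result type is taken to be `Y`, and the source is passed as a third argument. Each run of equal keys is folded through the Collector, so only one accumulator is alive at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collector;

// Hypothetical stand-in for the library's Pair type.
record Pair<A, B>(A left, B right) {}

// Hypothetical holder class for the sketch.
final class Uniq {
    private Uniq() {}

    static <X, K, Y> Iterable<Pair<K, Y>> uniq(
            Function<X, K> keyFn, Collector<X, ?, Y> collector, Iterable<X> source) {
        // Every iterator() call starts a fresh pass over source, so the result
        // is restartable as long as source itself is.
        return () -> runs(keyFn, collector, source.iterator());
    }

    // Helper that captures the Collector's accumulator type A.
    private static <X, K, A, Y> Iterator<Pair<K, Y>> runs(
            Function<X, K> keyFn, Collector<X, A, Y> collector, Iterator<X> in) {
        return new Iterator<>() {
            // One element of lookahead: the first element of the next run.
            private boolean hasPending = in.hasNext();
            private X pending = hasPending ? in.next() : null;

            @Override
            public boolean hasNext() {
                return hasPending;
            }

            @Override
            public Pair<K, Y> next() {
                if (!hasPending) {
                    throw new NoSuchElementException();
                }
                K key = keyFn.apply(pending);
                A acc = collector.supplier().get();
                collector.accumulator().accept(acc, pending);
                hasPending = false;
                // Consume elements until the key changes or the input ends.
                while (in.hasNext()) {
                    X x = in.next();
                    if (!Objects.equals(key, keyFn.apply(x))) {
                        pending = x;
                        hasPending = true;
                        break;
                    }
                    collector.accumulator().accept(acc, x);
                }
                return new Pair<>(key, collector.finisher().apply(acc));
            }
        };
    }
}
```

Called as `Uniq.uniq((String w) -> w.substring(0, 1), Collectors.counting(), words)`, this yields one Pair per run of consecutive words with the same first letter. Unlike `Collectors.groupingBy`, a key that reappears later in unsorted input produces a second Pair, which matches the behaviour of uniq.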