Open paulhoule opened 2 years ago
It is easy to feed the consecutive elements into an Iterator but returning an actual Iterable is not straightforward because an Iterable has to be restartable.
Possibly a better model is
public static <X,K,Y> Iterable<Pair<K,Y>> uniq(Function<X,K> keyFn, Collector<X,?,Y> collector) {...}
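A rough sketch of how the collector-based variant could work, not a definitive implementation. It makes several assumptions that are not in the signature above: a third `source` parameter supplies the elements, `Map.Entry` stands in for the undefined `Pair` type, an extra type parameter `A` names the collector's accumulation type, and the class name `UniqCollect` is made up. Each run of consecutive elements with equal keys is folded through the collector, so only one accumulator is alive at a time. The returned Iterable is restartable only as far as `source` is, because it asks `source` for a fresh iterator on every call.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class UniqCollect {
    // Hypothetical helper: Map.Entry replaces Pair, and `source` is an assumed parameter.
    public static <X, K, A, Y> Iterable<Map.Entry<K, Y>> uniq(
            Function<? super X, ? extends K> keyFn,
            Collector<? super X, A, Y> collector,
            Iterable<X> source) {
        // Every call to iterator() starts over from source.iterator().
        return () -> new Iterator<Map.Entry<K, Y>>() {
            private final Iterator<X> it = source.iterator();
            private boolean hasPending = it.hasNext();
            private X pending = hasPending ? it.next() : null;

            @Override
            public boolean hasNext() {
                return hasPending;
            }

            @Override
            public Map.Entry<K, Y> next() {
                if (!hasPending) {
                    throw new NoSuchElementException();
                }
                K key = keyFn.apply(pending);
                A acc = collector.supplier().get();
                // Fold the run of consecutive elements that share this key.
                do {
                    collector.accumulator().accept(acc, pending);
                    hasPending = it.hasNext();
                    pending = hasPending ? it.next() : null;
                } while (hasPending && Objects.equals(key, keyFn.apply(pending)));
                return new SimpleImmutableEntry<>(key, collector.finisher().apply(acc));
            }
        };
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "apricot");
        Function<String, Character> firstLetter = w -> w.charAt(0);
        for (Map.Entry<Character, Long> e : uniq(firstLetter, Collectors.counting(), words)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Passing `Collectors.toList()` as the collector would give something close to the Python behaviour, with each group buffered into a List instead of exposed as a lazy Iterable.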
Python has a groupby function (itertools.groupby) that works like the Unix uniq command. It gathers consecutive items that share a common key into a single Iterable, and returns an Iterable of Pairs of keys and Iterables. The signature would look something like
public static <X,K> Iterable<Pair<K,Iterable<X>>> uniq(Function<X,K> keyFn, Iterable<X> source) {...}
If the source is sorted on the key first, this is very similar to collect(Collectors.groupingBy(keyFn, collector)), except that the Streams API version keeps a hash table of all the collectors throughout the process (which uses memory), while the Python version just streams.
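For comparison, here is a small sketch of the Streams API side, using `Collectors.groupingBy` with `Collectors.counting()` as the downstream collector. The class and method names are made up for illustration. The returned Map holds one entry per distinct key until the stream is exhausted, and it merges elements with equal keys no matter where they occur in the input, which is why the input has to be sorted on the key for the two approaches to agree.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingByContrast {
    // Counts words per first letter; every key stays in the map until the end.
    public static Map<Character, Long> countByFirstLetter(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(w -> w.charAt(0), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Unsorted on the key: the trailing "apricot" joins the earlier 'a' words.
        List<String> words = List.of("apple", "avocado", "banana", "apricot");
        System.out.println(countByFirstLetter(words));
    }
}
```

With this input the map associates 'a' with 3 and 'b' with 1, whereas a uniq-style pass over the same unsorted input would report three separate runs (a, b, a).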