Open Kixiron opened 2 years ago
There are two forms of count we could support: count unique values V (count_distinct in DDlog) or count the sum of all weights of values associated with a key. The latter is linear, but the former seems to be what people more often want in practice. Both can be implemented using aggregate (a specialized implementation may be slightly more efficient, but I'm not sure it's worth it), but I agree that it needs to be packaged as a library method under src/operator.
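To make the difference between the two forms concrete, here is a minimal standalone sketch in plain Rust. It does not use the project's stream or aggregate API: the `Weighted` alias and the function names `count_weights` and `count_distinct` are hypothetical, and a batch of weighted tuples is simply modeled as a slice of `((key, value), weight)`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a weighted record: ((key, value), weight).
type Weighted<K, V> = ((K, V), isize);

/// Linear form: sum the weights of all values associated with each key.
/// Keys whose total weight is zero are dropped from the output.
fn count_weights<K: Ord + Clone, V>(input: &[Weighted<K, V>]) -> BTreeMap<K, isize> {
    let mut out = BTreeMap::new();
    for ((k, _v), w) in input {
        *out.entry(k.clone()).or_insert(0) += *w;
    }
    out.retain(|_, w| *w != 0);
    out
}

/// Distinct form: count the unique values per key.
/// Weights are first consolidated per (key, value) pair; a value counts
/// once if its net weight is positive, regardless of how large it is.
fn count_distinct<K: Ord + Clone, V: Ord + Clone>(
    input: &[Weighted<K, V>],
) -> BTreeMap<K, isize> {
    let mut net: BTreeMap<(K, V), isize> = BTreeMap::new();
    for ((k, v), w) in input {
        *net.entry((k.clone(), v.clone())).or_insert(0) += *w;
    }
    let mut out = BTreeMap::new();
    for ((k, _v), w) in &net {
        if *w > 0 {
            *out.entry(k.clone()).or_insert(0) += 1;
        }
    }
    out
}
```

For an input where key "a" has value 1 with weight 2 and value 2 with weight 1, `count_weights` yields 3 for "a" while `count_distinct` yields 2. The weight sum can be updated by adding the weights of a delta, which is why it is the linear one; the distinct count needs the consolidated per-value weights to know whether a value appeared or disappeared.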
We need to add a .count() operator that counts the number of values for any given key, e.g. (K, V).count() -> (K, isize)