Open danieldietrich opened 8 years ago
in the default implementation of unzip*
, what do you recommend the concrete class? Hashset?
@danieldietrich
@nicholasren First of all: hold your horses. We currently can't start to implement 3.0.0, even if we create a branch to separate changes from the current master. In 2.1.0 there will be massive changes to the file structure. Package-private abstract collection classes may disappear and replaced by generated code of a new inline code generator. Merging all these changes in an ongoing, parallel 3.0.0 development (that will introduce breaking API changes) would be hell. Let's concentrate on 2.1.0 features first. There is plenty to do :)
However, we observe that Scala uses a HashSet:
scala> val bitset = scala.collection.BitSet(1, 2, 3, 4, 5, 6, 7, 8, 9)
bitset: scala.collection.BitSet = BitSet(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> val unzipped = bitset.unzip(i => (i, i))
unzipped: (scala.collection.Set[Int], scala.collection.Set[Int]) = (Set(5, 1, 6, 9, 2, 7, 3, 8, 4),Set(5, 1, 6, 9, 2, 7, 3, 8, 4))
scala> unzipped
res0: (scala.collection.Set[Int], scala.collection.Set[Int]) = (Set(5, 1, 6, 9, 2, 7, 3, 8, 4),Set(5, 1, 6, 9, 2, 7, 3, 8, 4))
scala> unzipped._1.getClass()
res1: Class[_ <: scala.collection.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet
at a first glance that looked somehow odd to me. A HashSet has no predictable order of elements. When unzipping a linked set or sorted set, I expect to keep the existing order.
As already mentioned in the description of this issue we don't know the comparators of the unzipped result of a sorted set, e.g.
// returns Tuple2<Integer, Set>, where Set is a HashSet
SortedSet(1, 2, 3).unzip(i -> Tuple(i, new NotComparable(i)));
However, a LinkedHashSet has insertion order, therefore the unzipped result may keep this order:
// returns Tuple2<Integer, Set>, where Set is a LinkedHashSet
LinkedSet(1, 2, 3).unzip(i -> Tuple(i, new Foo(i)));
This implies that we will use the following Set implementations:
HashSet.unzip --> HashSet
LinkedHashSet.unzip --> LinkedHashSet
AnySortedSetImpl.unzip --> HashSet
Btw - the same should be applicable to Map.unzip and Multimap.unzip. I will adjust the title of this issue.
Hi,
Sorry if the question is off topic, but will this issue (also) track the slight difference in return type between io.vavr.collection.Set#unzip
and io.vavr.collection.List#unzip
? Or should I open another issue (I could not find any similar)? Or is that intentional?
The former returns Tuple2<? extends Set<T1>, ? extends Set<T2>>
while the latter returns Tuple2<List<T1>, List<T2>>
(i.e. no extends
), and this is quite disturbing (and I don't see why they have different return types).
Example of weird/problematic code I came accross:
// Does not compile: "Bad return type in lambda expression: Tuple2<String, String> cannot be converted to Tuple2<? extends T1, ? extends T2>"
Set<String> set = HashSet.empty();
Tuple2<Set<String>, Set<String>> unzip = set.unzip(s -> Tuple.of("", ""));
// Compiles
List<String> list = List.empty();
Tuple2<List<String>, List<String>> unzip = list.unzip(s -> Tuple.of("", ""));
Is my code wrong? Or is there an inconsistency issue in method signatures? Thanks!
PS: I know I can work it around by wrapping the set.unzip(s -> Tuple.of("", ""))
with a Tuple.narrow(...)
, but it seems like extra work, and I don't see the benefit of such a method signature.
Hi @Sir4ur0n,
this is an interesting (counter-)example. Please create a new issue!
List is a 'sealed trait', so there cannot be sub-classes (see #1825). Other return types need to be declared using upper bounds (aka extends) in order to be able to adjust them when subclassing...
We will investigate what can be done. From what I can see we might remove the extends
clause (and drop the ability to return more specific types in subclasses).
Here are many (all?) Seq methods that are effected:
Seq<? extends Seq<T>> combinations();
Seq<? extends Seq<T>> combinations(int k);
Iterator<? extends Seq<T>> crossProduct(int power);
Seq<? extends Seq<T>> permutations();
Tuple2<? extends Seq<T>, ? extends Seq<T>> splitAt(int n);
Tuple2<? extends Seq<T>, ? extends Seq<T>> splitAt(Predicate<? super T> predicate);
Tuple2<? extends Seq<T>, ? extends Seq<T>> splitAtInclusive(Predicate<? super T> predicate);
<C> Map<C, ? extends Seq<T>> groupBy(Function<? super T, ? extends C> classifier);
Iterator<? extends Seq<T>> grouped(int size);
Option<? extends Seq<T>> initOption();
Tuple2<? extends Seq<T>, ? extends Seq<T>> partition(Predicate<? super T> predicate);
Iterator<? extends Seq<T>> slideBy(Function<? super T, ?> classifier);
Iterator<? extends Seq<T>> sliding(int size);
Iterator<? extends Seq<T>> sliding(int size, int step);
Tuple2<? extends Seq<T>, ? extends Seq<T>> span(Predicate<? super T> predicate);
Option<? extends Seq<T>> tailOption();
<T1, T2> Tuple2<? extends Seq<T1>, ? extends Seq<T2>> unzip(Function<? super T, Tuple2<? extends T1, ? extends T2>> unzipper);
<T1, T2, T3> Tuple3<? extends Seq<T1>, ? extends Seq<T2>, ? extends Seq<T3>> unzip3(Function<? super T, Tuple3<? extends T1, ? extends T2, ? extends T3>> unzipper);
I think we can't fall back to Seq everywhere. For example the tailOption() of a Stream should return Option<Stream> instead of Option<Seq>.
Instead of falling back to Seq return types we could add a generic self-type to each non-sealed collection interface - but this is not really an option for us:
interface Seq<T, M extends Seq<T, M>> {
Option<M> tailOption();
}
In the case of unzip and unzip3 it wouldn't work and the user API would is a mess.
Set<String, ?> set = HashSet.empty();
Option<Set<String, ?>> tail = set.tailOption();
Living with additional ? extends ...
declarations or using the narrow() function are the only viable solutions I see.
Example:
Set<String> set = HashSet.empty();
// does NOT compile
Tuple2<Set<String>, Set<String>> unzip = set.unzip(s -> Tuple.of("", ""));
// DOES compile
Tuple2<? extends Set<String>, ? extends Set<String>> unzip1 = set.unzip(s -> Tuple.of("", ""));
// further processing results in nice types again
Set<String> first = unzip1._1;
final Set<String> apply = unzip1.apply((set1, set2) -> {
final Set<String> s1 = set1.add("");
final Set<String> s2 = set2.add("");
return s1.union(s2);
});
We should close this issue (and #2046).
However, we should find a solution for methods that transform sorted collections (like SortedSet, SortedMap, ...) and return a (wrapped) self-type.
For example the unzip methods need to return a Set instead of a SortedSet (like we already stated above):
<T1, T2> Tuple2<? extends SortedSet<T1>, ? extends SortedSet<T2>> unzip(Function<? super T, Tuple2<? extends T1, ? extends T2>> unzipper);
See this comment and this comment.
When unzipping elements, information of specific Set impls might be missing regarding the return type. E.g. when unzipping a
BitSet
thetoInt
/fromInt
functions are missing for the new return type components. Another example is unzipping aSortedSet
. We can't return aTuple
ofSortedSet
s because the newComparator
s are unknown.In the case of
Set
s Scala'sunzip
always returns a tuple ofSet
s.Example: BitSet
Example: TreeSet
We should do the same - every
Set
impl'sunzip*
methods should return tuples ofSet
:These can be default methods that are valid for all sub-classes.