vavr-io / vavr

vʌvr (formerly called Javaslang) is a non-commercial, non-profit object-functional library that runs with Java 8+. It aims to reduce the lines of code and increase code quality.
https://vavr.io
Other
5.71k stars 635 forks source link

Adjust return types of sorted collection methods that return a (wrapped) self type #1342

Open danieldietrich opened 8 years ago

danieldietrich commented 8 years ago

See this comment and this comment.


When unzipping elements, information of specific Set impls might be missing regarding the return type. E.g. when unzipping a BitSet the toInt/fromInt functions are missing for the new return type components. Another example is unzipping a SortedSet. We can't return a Tuple of SortedSets because the new Comparators are unknown.

In the case of Sets Scala's unzip always returns a tuple of Sets.

Example: BitSet

scala> val bitset = scala.collection.BitSet(1, 2, 3)
bitset: scala.collection.BitSet = BitSet(1, 2, 3)

scala> bitset.unzip(i => (i, i))
res1: (scala.collection.Set[Int], scala.collection.Set[Int]) = (Set(1, 2, 3),Set(1, 2, 3))

Example: TreeSet

scala> val treeset = scala.collection.immutable.TreeSet(1, 2, 3)
treeset: scala.collection.immutable.TreeSet[Int] = TreeSet(1, 2, 3)

scala> treeset.unzip(i => (i, i))
res2: (scala.collection.immutable.Set[Int], scala.collection.immutable.Set[Int]) = (Set(1, 2, 3),Set(1, 2, 3))

We should do the same - every Set impl's unzip* methods should return tuples of Set:

interface Set<T> {

    @Override
    default <T1, T2> Tuple2<Set<T1>, Set<T2>> unzip(Function<? super T, Tuple2<? extends T1, ? extends T2>> unzipper) {
        ...
    }

    @Override
    default <T1, T2, T3> Tuple3<Set<T1>, Set<T2>, Set<T3>> unzip3(Function<? super T, Tuple3<? extends T1, ? extends T2, ? extends T3>> unzipper) {
        ...
    }

}

These can be default methods that are valid for all sub-classes.

nicholasren commented 7 years ago

in the default implementation of unzip*, what do you recommend the concrete class? Hashset? @danieldietrich

danieldietrich commented 7 years ago

@nicholasren First of all: hold your horses. We currently can't start to implement 3.0.0, even if we create a branch to separate changes from the current master. In 2.1.0 there will be massive changes to the file structure. Package-private abstract collection classes may disappear and replaced by generated code of a new inline code generator. Merging all these changes in an ongoing, parallel 3.0.0 development (that will introduce breaking API changes) would be hell. Let's concentrate on 2.1.0 features first. There is plenty to do :)


However, we observe that Scala uses a HashSet:

scala> val bitset = scala.collection.BitSet(1, 2, 3, 4, 5, 6, 7, 8, 9)
bitset: scala.collection.BitSet = BitSet(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> val unzipped = bitset.unzip(i => (i, i))
unzipped: (scala.collection.Set[Int], scala.collection.Set[Int]) = (Set(5, 1, 6, 9, 2, 7, 3, 8, 4),Set(5, 1, 6, 9, 2, 7, 3, 8, 4))

scala> unzipped
res0: (scala.collection.Set[Int], scala.collection.Set[Int]) = (Set(5, 1, 6, 9, 2, 7, 3, 8, 4),Set(5, 1, 6, 9, 2, 7, 3, 8, 4))

scala> unzipped._1.getClass()
res1: Class[_ <: scala.collection.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet

at a first glance that looked somehow odd to me. A HashSet has no predictable order of elements. When unzipping a linked set or sorted set, I expect to keep the existing order.

As already mentioned in the description of this issue we don't know the comparators of the unzipped result of a sorted set, e.g.

// returns Tuple2<Integer, Set>, where Set is a HashSet
SortedSet(1, 2, 3).unzip(i -> Tuple(i, new NotComparable(i)));

However, a LinkedHashSet has insertion order, therefore the unzipped result may keep this order:

// returns Tuple2<Integer, Set>, where Set is a LinkedHashSet
LinkedSet(1, 2, 3).unzip(i -> Tuple(i, new Foo(i)));

This implies that we will use the following Set implementations:

HashSet.unzip --> HashSet
LinkedHashSet.unzip --> LinkedHashSet
AnySortedSetImpl.unzip --> HashSet

Btw - the same should be applicable to Map.unzip and Multimap.unzip. I will adjust the title of this issue.

sir4ur0n commented 7 years ago

Hi,

Sorry if the question is off topic, but will this issue (also) track the slight difference in return type between io.vavr.collection.Set#unzip and io.vavr.collection.List#unzip? Or should I open another issue (I could not find any similar)? Or is that intentional?
The former returns Tuple2<? extends Set<T1>, ? extends Set<T2>> while the latter returns Tuple2<List<T1>, List<T2>> (i.e. no extends), and this is quite disturbing (and I don't see why they have different return types).

Example of weird/problematic code I came accross:

// Does not compile: "Bad return type in lambda expression: Tuple2<String, String> cannot be converted to Tuple2<? extends T1, ? extends T2>"
Set<String> set = HashSet.empty();
Tuple2<Set<String>, Set<String>> unzip = set.unzip(s -> Tuple.of("", ""));

// Compiles
List<String> list = List.empty();
Tuple2<List<String>, List<String>> unzip = list.unzip(s -> Tuple.of("", ""));

Is my code wrong? Or is there an inconsistency issue in method signatures? Thanks!

PS: I know I can work it around by wrapping the set.unzip(s -> Tuple.of("", "")) with a Tuple.narrow(...), but it seems like extra work, and I don't see the benefit of such a method signature.

danieldietrich commented 7 years ago

Hi @Sir4ur0n,

this is an interesting (counter-)example. Please create a new issue!

List is a 'sealed trait', so there cannot be sub-classes (see #1825). Other return types need to be declared using upper bounds (aka extends) in order to be able to adjust them when subclassing...

We will investigate what can be done. From what I can see we might remove the extends clause (and drop the ability to return more specific types in subclasses).

danieldietrich commented 7 years ago

Here are many (all?) Seq methods that are effected:

Seq<? extends Seq<T>> combinations();
Seq<? extends Seq<T>> combinations(int k);
Iterator<? extends Seq<T>> crossProduct(int power);
Seq<? extends Seq<T>> permutations();
Tuple2<? extends Seq<T>, ? extends Seq<T>> splitAt(int n);
Tuple2<? extends Seq<T>, ? extends Seq<T>> splitAt(Predicate<? super T> predicate);
Tuple2<? extends Seq<T>, ? extends Seq<T>> splitAtInclusive(Predicate<? super T> predicate);
<C> Map<C, ? extends Seq<T>> groupBy(Function<? super T, ? extends C> classifier);
Iterator<? extends Seq<T>> grouped(int size);
Option<? extends Seq<T>> initOption();
Tuple2<? extends Seq<T>, ? extends Seq<T>> partition(Predicate<? super T> predicate);
Iterator<? extends Seq<T>> slideBy(Function<? super T, ?> classifier);
Iterator<? extends Seq<T>> sliding(int size);
Iterator<? extends Seq<T>> sliding(int size, int step);
Tuple2<? extends Seq<T>, ? extends Seq<T>> span(Predicate<? super T> predicate);
Option<? extends Seq<T>> tailOption();
<T1, T2> Tuple2<? extends Seq<T1>, ? extends Seq<T2>> unzip(Function<? super T, Tuple2<? extends T1, ? extends T2>> unzipper);
<T1, T2, T3> Tuple3<? extends Seq<T1>, ? extends Seq<T2>, ? extends Seq<T3>> unzip3(Function<? super T, Tuple3<? extends T1, ? extends T2, ? extends T3>> unzipper);

1) Falling back to Seq (No!)

I think we can't fall back to Seq everywhere. For example the tailOption() of a Stream should return Option<Stream> instead of Option<Seq>.

2) Using recursive self-type (No!)

Instead of falling back to Seq return types we could add a generic self-type to each non-sealed collection interface - but this is not really an option for us:

interface Seq<T, M extends Seq<T, M>> {

    Option<M> tailOption();    
}

In the case of unzip and unzip3 it wouldn't work and the user API would is a mess.

Set<String, ?> set = HashSet.empty();
Option<Set<String, ?>> tail = set.tailOption();

3) Leave API as-is

Living with additional ? extends ... declarations or using the narrow() function are the only viable solutions I see.

Example:

Set<String> set = HashSet.empty();

// does NOT compile
Tuple2<Set<String>, Set<String>> unzip = set.unzip(s -> Tuple.of("", ""));

// DOES compile
Tuple2<? extends Set<String>, ? extends Set<String>> unzip1 = set.unzip(s -> Tuple.of("", ""));

// further processing results in nice types again
Set<String> first = unzip1._1;

final Set<String> apply = unzip1.apply((set1, set2) -> {
    final Set<String> s1 = set1.add("");
    final Set<String> s2 = set2.add("");
    return s1.union(s2);
});

We should close this issue (and #2046).

However, we should find a solution for methods that transform sorted collections (like SortedSet, SortedMap, ...) and return a (wrapped) self-type.

For example the unzip methods need to return a Set instead of a SortedSet (like we already stated above):

<T1, T2> Tuple2<? extends SortedSet<T1>, ? extends SortedSet<T2>> unzip(Function<? super T, Tuple2<? extends T1, ? extends T2>> unzipper);