Add Tuple[N] element equality check

jdimeo commented 5 years ago

When doing crossJoins, you often want to skip the diagonal of the matrix, i.e. the tuples in which every position holds the same element. I wanted to do:

Seq.crossJoin(list, list).filter(Tuple2::elementsNotEqual).forEach(tuple -> ...);

or

Seq.crossJoin(list, list).filter($ -> !$.elementsEqual()).forEach(tuple -> ...);

I know it's as simple as:

Seq.crossJoin(list, list).filter($ -> $.v1() != $.v2()).forEach(tuple -> ...);

but it would be useful to have a generalized and consistent (and null-safe?) way of handling this in the generated tuple code.

Thanks!

Versions:

jOOλ: 0.9.14
Java: 8

knutwannheden commented 5 years ago

Thank you for your feature request.

I think for your use case you could use the existing Seq#innerJoin(Iterable, BiPredicate) as in e.g.:

Seq.seq(list).innerJoin(list, (o1, o2) -> !Objects.equals(o1, o2));

or in case you really want to join the list with itself in this way:

Seq.seq(list).innerSelfJoin((o1, o2) -> !Objects.equals(o1, o2));

lukaseder commented 5 years ago

> When doing crossJoins, often you want to skip the element that is the diagonal of the matrix where the same element is in all elements of the tuple

I just had to do this this week! :) And I did it like @knutwannheden suggested, using an innerSelfJoin. I don't think we need another explicit operator for this. Already this innerSelfJoin is a bit esoteric...

jdimeo commented 5 years ago

Great suggestion, but that doesn't easily scale to the n case. Even tuples of just 3 or 4 elements involve a lot of Objects.equals() calls to handle all the needed tests.

jdimeo commented 5 years ago

Even apart from this use case, I was assuming some kind of isAllSameElement() on the n Tuple would be generally useful, but perhaps not.

knutwannheden commented 5 years ago

> Great suggestion, but that doesn't easily scale to the n case. Even just 3 or 4 tuples involve a lot of Objects.equals() calls to handle all the needed tests.

Can you show me an example of what you had in mind for the n case?

> Even apart from this use case, I was assuming some kind of isAllSameElement() on the n Tuple would be generally useful, but perhaps not.

For this you would at the moment have to do something like tuple.toList().stream().distinct().count() == 1. Perhaps that would make sense as a utility. I would just like to see the concrete use case.
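As a rough illustration of what such a utility could look like, here is a minimal sketch in plain Java. It assumes the tuple's elements have already been obtained as a list (for example via tuple.toList()); the class name TupleElements and the method allEqual are hypothetical and not part of jOOλ. Because equality is transitive, comparing every element against the first needs only n - 1 null-safe comparisons for an n-element tuple.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical helper for illustration only; not part of jOOλ.
final class TupleElements {

    // Returns true if every element equals the first one.
    // Objects.equals makes the comparison null-safe.
    // An empty or single-element list counts as "all equal".
    static boolean allEqual(List<?> elements) {
        for (int i = 1; i < elements.size(); i++) {
            if (!Objects.equals(elements.get(0), elements.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allEqual(Arrays.asList(1, 1, 1)));    // true
        System.out.println(allEqual(Arrays.asList(1, 2, 1)));    // false
        System.out.println(allEqual(Arrays.asList(null, null))); // true
    }
}
```

Negating the result gives the "skip the diagonal" filter from the original request.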

lukaseder commented 5 years ago

> Great suggestion, but that doesn't easily scale to the n case. Even just 3 or 4 tuples involve a lot of Objects.equals() calls to handle all the needed tests.

Indeed, there are O(n^2) comparisons to do; that's the nature of a cross join. You could avoid the comparisons altogether by doing something like:

List<Integer> list = Arrays.asList(1, 2, 3, 4);

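Seq.seq(list)
    .zipWithIndex()                   // pair each element with its index: (value, index)
    .crossApply(e -> {
        // Split the list at the current index and drop the element itself
        // (var requires Java 10+)
        var split = Seq.seq(list).splitAt(e.v2.intValue());
        return split.v1.concat(split.v2.drop(1));
    })
    .map(e -> tuple(e.v1.v1, e.v2))   // discard the index, keep (value, other value)
    .forEach(System.out::println);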

Or, using flatMap():

List<Integer> list = Arrays.asList(1, 2, 3, 4);

Seq.seq(list)
    .zipWithIndex()
    .flatMap(e1 -> {
        var split = Seq.seq(list).splitAt(e1.v2.intValue());
        return split.v1.concat(split.v2.drop(1)).map(e2 -> Tuple.tuple(e1.v1, e2));
    })
    .forEach(System.out::println);

This yields:

(1, 2)
(1, 3)
(1, 4)
(2, 1)
(2, 3)
(2, 4)
(3, 1)
(3, 2)
(3, 4)
(4, 1)
(4, 2)
(4, 3)

I'm personally not convinced that this is a popular enough use case to warrant a dedicated API. The equality check is also too expensive: if you're reusing the same list, you could use an identity check instead.

> Even apart from this use case, I was assuming some kind of isAllSameElement() on the n Tuple would be generally useful, but perhaps not.

The same here. A tuple of "all the same elements" sounds a bit too esoteric to justify a dedicated API.

Side note

Note that each API element in a library as abstract as jOOλ causes cognitive overhead for users. They have to look at more and more API and names, and make sense of what these things even do. Take Seq.intersperse(), for example. It was inspired by Haskell. The name does not make it obvious what the method does, nor is it a very popular thing to do. But I think it is more popular than isAllSameElement on a tuple, and it has a precedent in Haskell.

Over the years, I've grown more reluctant to add new API to jOOλ, especially when there's a small set of relatively concise alternatives that do the same thing.

Also, if you're really concerned with performance, I suggest resorting to imperative logic. Streams have a lot of overhead, and jOOλ has even more. There are lots of allocations happening that can often be avoided easily in imperative logic, but not with these APIs. Take zipWithIndex(), which I suggested using. This will allocate a Long instance for each list element, just because we cannot easily have generics with primitive types. That's not a problem in imperative logic.
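For comparison, the imperative alternative could look roughly like the following sketch in plain Java. The class name OffDiagonalPairs and the method pairs are hypothetical, and the pairs are represented as two-element lists rather than jOOλ tuples to keep the example free of dependencies. The diagonal is skipped by comparing primitive int indexes, so no equals() call and no boxed Long index is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of jOOλ.
final class OffDiagonalPairs {

    // Returns all ordered pairs (list[i], list[j]) with i != j,
    // each represented as a two-element list.
    static <T> List<List<T>> pairs(List<T> list) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            for (int j = 0; j < list.size(); j++) {
                if (i != j) { // primitive index comparison skips the diagonal
                    result.add(Arrays.asList(list.get(i), list.get(j)));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the same 12 pairs as the Seq-based examples, as [1, 2], [1, 3], ...
        pairs(Arrays.asList(1, 2, 3, 4)).forEach(System.out::println);
    }
}
```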

I hope this helps.

jdimeo commented 5 years ago

As always, thank you both for being super responsive. I did in fact switch to imperative code for my use case, because I also wanted only the upper triangle, so the classic nested for loops (for (int j = i + 1 ...) did the trick. I understand and agree with everything you have said.
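For readers who land here with the same need, the upper-triangle variant can be sketched as follows. This is an assumption about what the loops looked like, not the actual code used; the class name UpperTriangle and the method pairs are hypothetical. Starting the inner loop at i + 1 yields each unordered pair exactly once and never touches the diagonal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of jOOλ.
final class UpperTriangle {

    // Returns all pairs (list[i], list[j]) with i < j,
    // each represented as a two-element list.
    static <T> List<List<T>> pairs(List<T> list) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            for (int j = i + 1; j < list.size(); j++) {
                result.add(Arrays.asList(list.get(i), list.get(j)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [1, 2], [1, 3], [2, 3], one pair per line
        pairs(Arrays.asList(1, 2, 3)).forEach(System.out::println);
    }
}
```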

jOOQ / jOOL

Add Tuple[N] element equality check #361
