vavr-io / vavr

vʌvr (formerly called Javaslang) is a non-commercial, non-profit object-functional library that runs with Java 8+. It aims to reduce the lines of code and increase code quality.
https://vavr.io
Other
5.73k stars 637 forks source link

Implement crossProduct(int) #614

Closed danieldietrich closed 9 years ago

danieldietrich commented 9 years ago

There already exists crossProduct() which returns a Seq<Tuple2<T, T>>.

It would be helpful to have additionally a crossProduct(int n) that returns a Seq<? extends Seq<T>>.

Example:

// ((1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 1, 1), (2, 1, 2), (2, 2, 1), (2, 2, 2))
List.of(1, 2).crossProduct(3);

Caution:

seq.crossProdukt(k) = ⏳ // for relatively small seq.length() and k
danieldietrich commented 9 years ago

613, #614 and #615 are driven by the jOOQ blog post How to use Java 8 Functional Programming to Generate an Alphabetic Sequence.

Having these, we should be able to write the following to solve the puzzle:

Seq<Character> alphabet = CharSeq.rangeClosed('A', 'Z');
Stream.from(1)
      .flatMap(alphabet::crossProduct)
      .map(Seq::mkString)
      .takeUntil("AAA"::equals);

/cc @lukaseder

lukaseder commented 9 years ago

That's an elegant approach. Although, I wonder if the naming is perhaps a bit confusing. I'd expect (1, 2) x (3) to yield (1, 3), (2, 3), not what you're suggesting. Perhaps, there's a more accurate name for taking a list "to the power of 3"

danieldietrich commented 9 years ago

Yes, you are right. I have to think about it. Maybe power()...

lukaseder commented 9 years ago

Cartesian power, indeed: https://en.wikipedia.org/wiki/Cartesian_product#Cartesian_power

danieldietrich commented 9 years ago

Yes, http://de.mathworks.com/help/symbolic/mupad_ref/solvelib-cartesianpower.html

I like your generalization using tuples.

lukaseder commented 9 years ago

Hmm, I mixed informal tuple and list notations, if that's what you meant....

danieldietrich commented 9 years ago

Oh, I just meant the API for cross-joining multiple Seqs - I realized that Javaslang is missing such API. It is not really a generalization but there is no other possibility to capture different generic component types.

lukaseder commented 9 years ago

Ah, you mean the implementation of jOOλ's static crossJoin() methods... Well, if you're strictly in maths world, cross join is not an associative operation: https://en.wikipedia.org/wiki/Cartesian_product#Non-commutativity_and_non-associativity

This means that it might not really make sense, mathematically, to "generalise" this over multiple seqs. In SQL, like all operations derived from cartesian products, cross join always flattens nested tuples into a single top-level tuple, which is generally more useful and user-friendly than the mathematical nested tuples. I'm a pragmatic SQL person, not a maths person :)

danieldietrich commented 9 years ago

I will stay with the name crossProduct to 'keep the API together'.

The most general form of a cartestian product (or cross product or simply product) is the n-ary cartesian product, e.g. X1 x X2 x ... x Xn = (x1, x2, ..., xn) with x1 ∈ X1, ..., xn ∈ Xn.

The cartesian power is a special case of the n-ary cartesian product, where the Xi = Xj for all i, j ∈ {1, ..., n}.

Here crossProduct(int n) denotes Xⁿ = X x X ... x X (n times).

To make it more clear I will name the parameter power: crossProduct(int power).

lukaseder commented 9 years ago

To make it more clear I will name the parameter power

Fair enough

danieldietrich commented 9 years ago

which is generally more useful and user-friendly than the mathematical nested tuples.

:+1:

I'm a pragmatic SQL person, not a maths person :)

we need pragmatic solutions and not 'in theory there exists a solution - I can prove it' :-)

danieldietrich commented 9 years ago

Another, relatively fast solution for the blog post puzzle using Javaslang:

// Example: solvePuzzle(CharSeq.rangeClosed('A', 'Z'), 2)
static <T> Stream<IndexedSeq<T>> solvePuzzle(TraversableOnce<T> alphabet, int n) {
    if (n < 0) {
        throw new IllegalArgumentException("negative n");
    }
    return alphabet.sliding(1).toStream()
            .appendSelf(stream -> stream.flatMap(product -> alphabet.map(product::append)))
            .take(sumOfPowers(alphabet.length(), n));
}

//  n
//  Σ kⁱ = k¹ + k² + ... + kⁿ
// i=1
private static int sumOfPowers(int k, int n) {
    int product = k;
    int sum = product;
    for (int i = 1; i < n; i++) {
        product *= k;
        sum += product;
    }
    return sum;
}
danieldietrich commented 9 years ago

Made some benchmarks.

Result:

alphabetic sequence using cartesian product bench(4) took 0.245 sec.
alphabetic sequence using append self bench(4) took 0.032 sec.

We see that appendSelf() is great in reusing results which where calculated before.

I think that the performance of appendSelf and cartesianPower should be nearly equal for a fixed dimension because no results need to be reduced.

Setup:

public class AlphabeticSequenceBench {

    private static final int COUNT = 4;
    private static final int WARMUP = 2;

    private static final Seq<Character> ALPHABET = CharSeq.rangeClosed('A', 'Z').toStream();

    public static void main(String[] args) {
        benchAlphabeticSequenceUsingCartesianPower();
        benchAlphabeticSequenceUsingAppendSelf();
    }

    static void benchAlphabeticSequenceUsingCartesianPower() {
        bench("alphabetic sequence using cartesian product", COUNT, WARMUP, AlphabeticSequenceBench::cartesianPower);
    }

    static void benchAlphabeticSequenceUsingAppendSelf() {
        bench("alphabetic sequence using append self", COUNT, WARMUP, AlphabeticSequenceBench::appendSelf);
    }

    private static List<String> cartesianPower(int n) {
        return Stream.from(1)
                .flatMap(ALPHABET::crossProduct)
                .map(IndexedSeq::mkString)
                .takeWhile(s -> s.length() <= n)
                .toList();
    }

    private static List<String> appendSelf(int n) {
        return ALPHABET.sliding(1).toStream()
                .appendSelf(stream -> stream.flatMap(product -> ALPHABET.map(product::append)))
                .map(IndexedSeq::mkString)
                .takeWhile(s -> s.length() <= n)
                .toList();
    }
}
danieldietrich commented 9 years ago

@lukaseder wanted to benchmark jOOL also, but it seems Seq.range*() and Seq.crossJoin() are not yet in Maven Central, right?

lukaseder commented 9 years ago

No, not yet.

danieldietrich commented 9 years ago

Ok. What I've seen so far is that java.util.stream.Stream is really fast. Looking forward to check how it performs.