ekzhang / crepe

Datalog compiler embedded in Rust as a procedural macro
Apache License 2.0
453 stars 16 forks source link

Implement aggregates #10

Open cuddlefishie opened 3 years ago

cuddlefishie commented 3 years ago

It would be nice to have aggregates in crepe. I am trying to do Advent of Code this year with as much crepe as I can and I ran into some situation where having them would have been nice.

The small set of aggregates that Soufflé has seems already really nice and powerful.

I am still learning to use Datalog, so I am very green when it comes to the implementation that crepe uses, but in general I would still be up to contribute this with some guidance. :upside_down_face:

ekzhang commented 3 years ago

That sounds interesting. What do you think would be a good syntax for aggregates?

crepe! {
    @input
    struct A(i32, i32);

    @output
    struct B(i32, i32);

    B(x, $syntax_for_max$ A(x, ?)) <- A(x, _);
}

Maybe something like the above? I think the only limitation is that it cannot be a valid expression syntax.

cuddlefishie commented 3 years ago

My initial idea was to allow them in the same place as relations as clauses but with lowercase identifiers.

Special binding syntax

One issue with that would be though that it's hard to bind the result of that to a variable. However, since all those aggregates return numeric values (ignoring the range aggregate from Soufflé) there is no need for more complex pattern matching, so maybe n = sum(x) { <clause>,* } could work nicely?

Your example then could look like this:

crepe! {
    @input
    struct A(i32, i32);

    @output
    struct B(i32, i32);

    B(x, n) <- n = max(x) { A(x, _) };
}

Pro:

Con:

Out parameters

I am not super convinced about starting with an assignment because that might be confusing when to use let n = and when to use n =, so maybe a Prolog style "out parameter" could work too. The n = max(x) { A(x, _) } would become max(x, n) { A(x, _) }. That would be clear to parse but maybe not as clear to read, because max(x, n) almost looks like the max of x and n will be taken.

crepe! {
    @input
    struct A(i32, i32);

    @output
    struct B(i32, i32);

    B(x, n) <- max(x, n) { A(x, _) };
}

Pro:

Con:

Special casing let rhs expressions

Maybe to keep the syntax "clean" the right hand side expression for let bindings could be special cased for aggregates. If the { } are required then this could be used to still allow user functions called like aggregates.

// aggregate
let n = max(x) { A(x, _) },
// user defined functions `max`
let n = max(x),

Pro:

Con: