TimelyDataflow / differential-dataflow

An implementation of differential dataflow using timely dataflow in Rust.
MIT License

Add more `tuple_implementation`. #393

Open nooberfsh opened 1 year ago

nooberfsh commented 1 year ago

Hi, I'm translating TPC-H queries into dd dataflows. Query 01 has a lot of aggregations:

select
  l_returnflag,
  l_linestatus,
  sum(l_quantity) as sum_qty,
  sum(l_extendedprice) as sum_base_price,
  sum(
    l_extendedprice *(1 - l_discount)
  ) as sum_disc_price,
  sum(
    l_extendedprice *(1 - l_discount)*(1 + l_tax)
  ) as sum_charge,
  avg(l_quantity) as avg_qty,
  avg(l_extendedprice) as avg_price,
  avg(l_discount) as avg_disc,
  count(*) as count_order
from ...

The corresponding dataflow:

    input
        .explode(move |li| {
            if li.ship_date <= date {
                Some((key(&li), (
                    li.quantity,
                    li.extended_price,
                    li.extended_price * (one - li.discount),
                    li.extended_price * (one - li.discount) * (one + li.tax),
                    1
                )))
            } else {
                None
            }
        })

The code above doesn't compile because dd only implements `Semigroup` for tuples of arity up to 4, and the difference tuple here has 5 elements. I increased the limit to 12, which makes queries like the one above easier to write.
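
For readers unfamiliar with how a tuple difference type accumulates, here is a minimal, self-contained sketch of the idea. It does not use the dd crate: `Semi`, `tuple_semi!`, `Q1Diff`, and `accumulate` are hypothetical names invented for illustration, and `Semi` is only a simplified stand-in for dd's `Semigroup` trait. The macro shows roughly what an arity-5 tuple implementation could look like, with each position combined independently.

```rust
// Simplified stand-in for differential dataflow's `Semigroup` trait.
// Hypothetical: the real trait has more bounds and methods.
trait Semi: Clone {
    fn plus_equals(&mut self, rhs: &Self);
}

impl Semi for i64 {
    fn plus_equals(&mut self, rhs: &Self) {
        *self += *rhs;
    }
}

// Implements `Semi` for a tuple by combining each position independently.
// Each entry pairs a type parameter name with its tuple index.
macro_rules! tuple_semi {
    ( $( ($name:ident, $idx:tt) ),+ ) => {
        impl<$($name: Semi),+> Semi for ($($name,)+) {
            fn plus_equals(&mut self, rhs: &Self) {
                $( self.$idx.plus_equals(&rhs.$idx); )+
            }
        }
    };
}

// Arity 5, matching the (quantity, price, disc_price, charge, count) tuple.
tuple_semi!((A, 0), (B, 1), (C, 2), (D, 3), (E, 4));

// Hypothetical difference type for the query above, using integer amounts.
type Q1Diff = (i64, i64, i64, i64, i64);

// Folds a slice of updates into one accumulated difference.
// Returns `None` for an empty slice, since a semigroup has no zero element.
fn accumulate(updates: &[Q1Diff]) -> Option<Q1Diff> {
    let mut iter = updates.iter();
    let mut total = *iter.next()?;
    for update in iter {
        total.plus_equals(update);
    }
    Some(total)
}
```

In this sketch the four sums and the row count all travel in a single difference value, so the averages in the SQL can be derived afterwards by dividing a sum by the count.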