h2oai / db-benchmark

reproducible benchmark of database-like ops
https://h2oai.github.io/db-benchmark
Mozilla Public License 2.0

review dask group apply calls #86

Closed jangorecki closed 5 years ago

jangorecki commented 5 years ago

Follow-up to https://github.com/h2oai/db-benchmark/issues/81. Custom aggregations are now documented in https://github.com/dask/dask/pull/4800/files. Review q7, q8, and q9, and use dask.dataframe.groupby.Aggregation.

jangorecki commented 5 years ago

q8 has to use apply because it does not reduce to a single row per group, which seems to be required:

https://docs.dask.org/en/latest/dataframe-api.html#custom-aggregation

The index has to be equal to the groups.

An attempt to use Aggregation results in:

ValueError: multiple levels only valid with MultiIndex

jangorecki commented 5 years ago

q9 is yet to be implemented: https://github.com/dask/dask/issues/4828