daphne-eu / daphne

DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines
Apache License 2.0

Add 'group()' built-in function to DaphneDSL. #921

Open saminbassiri opened 2 days ago

Add group() Built-in Function to DaphneDSL for Grouping and Aggregation

Description

This pull request introduces a new group() built-in function to DaphneDSL, which creates a GroupOp in DaphneIR. It closes issue #903.


Changes Implemented

  1. group() Built-in Function in DaphneDSL:

    • Interface: group(arg:frame, groupCols:str, ..., sumCol:str)
    • Accepts:
      • A frame as input.
      • An arbitrary number of column names to group on.
      • A single column name whose values are summed within each group.
    • Aggregation Support:
      • SUM is the only supported aggregation function.
  2. Kernel Function Updates:

    • Updated order and extractCol kernel functions to process string values correctly.
    • Extended group kernel function to handle string values.
  3. Test Cases:

    • Added script-level tests to validate the functionality of the group() function in DaphneDSL.
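
Illustration of the intended semantics

To make the interface group(arg:frame, groupCols:str, ..., sumCol:str) concrete, here is a minimal Python model of grouping on several columns and summing one. It is a sketch of the behaviour described above, not the DAPHNE kernel. The dict-of-lists frame representation, the output label "SUM(<col>)", and the first-appearance ordering of groups are assumptions made for illustration only; the PR does not state them.

```python
def group(frame, *cols):
    """Hypothetical model of group(arg, groupCols..., sumCol).

    frame: dict mapping column name -> list of values (assumed representation).
    cols:  one or more grouping column names followed by the column to sum.
    """
    if len(cols) < 2:
        raise ValueError("need at least one grouping column and one sum column")
    *group_cols, sum_col = cols  # the last name is the column to aggregate

    # Accumulate one running sum per distinct combination of grouping values.
    totals = {}
    for i in range(len(frame[sum_col])):
        key = tuple(frame[c][i] for c in group_cols)
        totals[key] = totals.get(key, 0) + frame[sum_col][i]

    # Build the result frame: grouping columns plus one aggregate column.
    sum_label = "SUM(" + sum_col + ")"  # assumed label, not taken from the PR
    result = {c: [] for c in group_cols}
    result[sum_label] = []
    for key, total in totals.items():
        for c, value in zip(group_cols, key):
            result[c].append(value)
        result[sum_label].append(total)
    return result


# Example with a string grouping column, as covered by the kernel updates.
sales = {
    "city": ["Graz", "Graz", "Berlin", "Graz"],
    "year": [2023, 2024, 2023, 2023],
    "amount": [10, 5, 7, 3],
}
grouped = group(sales, "city", "year", "amount")
```

In this model the two rows for ("Graz", 2023) collapse into a single row with a sum of 13, while the other two combinations keep their original amounts.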