hablapps / doric

Type safety for spark columns
https://www.hablapps.com/doric/
Apache License 2.0
77 stars, 11 forks

Custom aggregation with doric syntax #312

Status: Closed (alfonsorr closed 1 year ago)

alfonsorr commented 1 year ago

Description

This pull request adds support for defining custom aggregation functions using doric syntax.

Example: a mean implemented with customAgg in doric

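    val complexAggWithoutNames = customAgg[Long, Row, Double](
      // input column to aggregate
      col[Long]("id"),
      // zero value: a (sum, count) struct, read below as col1 and col2
      struct(lit(0L), lit(0L)),
      // update: add the incoming value to the sum and increment the count
      (x, y) =>
        struct(
          x.getChild[Long]("col1") + y,
          x.getChild[Long]("col2") + 1L.lit
        ),
      // merge: combine two partial (sum, count) buffers
      (x, y) =>
        struct(
          x.getChild[Long]("col1") + y.getChild[Long]("col1"),
          x.getChild[Long]("col2") + y.getChild[Long]("col2")
        ),
      // final result: sum / count
      x => x.getChild[Long]("col1") / x.getChild[Long]("col2")
    )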
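To make the role of the four arguments easier to follow, here is a minimal sketch of the same zero / update / merge / finish pattern in plain Scala collections. It is only a model of the idea, not doric's implementation: `CustomAggModel`, `run`, and `meanModel` are hypothetical names, and each inner `Seq` stands in for one partition of the data.

```scala
object CustomAggSketch {
  // Hypothetical plain-Scala model of a four-part aggregation.
  // This is NOT doric's API; it only mirrors the shape of the example above.
  final case class CustomAggModel[In, Buf, Out](
      zero: Buf,                // initial buffer
      update: (Buf, In) => Buf, // fold one input value into a buffer
      merge: (Buf, Buf) => Buf, // combine two partial buffers
      finish: Buf => Out        // turn the final buffer into the result
  ) {
    // Aggregate each partition independently, then merge the partial buffers.
    def run(partitions: Seq[Seq[In]]): Out = {
      val partials = partitions.map(_.foldLeft(zero)(update))
      finish(partials.foldLeft(zero)(merge))
    }
  }

  // Mean over Long values: the buffer is (sum, count), like the struct
  // with fields col1 and col2 in the doric example.
  val meanModel: CustomAggModel[Long, (Long, Long), Double] =
    CustomAggModel[Long, (Long, Long), Double](
      zero = (0L, 0L),
      update = { case ((sum, count), value) => (sum + value, count + 1L) },
      merge = { case ((s1, c1), (s2, c2)) => (s1 + s2, c1 + c2) },
      finish = { case (sum, count) => sum.toDouble / count }
    )

  // Three "partitions", one of them empty.
  val result: Double = meanModel.run(Seq(Seq(1L, 2L, 3L), Seq(4L), Seq.empty))

  def main(args: Array[String]): Unit =
    println(result) // prints 2.5
}
```

The merge step is what allows partial buffers computed on different partitions to be combined, which is why the buffer carries both the sum and the count instead of a running mean.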

Related Issues and dependencies

How Has This Been Tested?

The tests cover both simple and complex zero values.

github-actions[bot] commented 1 year ago

:octocat: This is an auto-generated comment created by:

| Actor | Triggering actor | Sender |
|---|---|---|
| eruizalo | eruizalo | eruizalo |

Triggered by:
- Workflow name: "CI" at .github/workflows/ci.yml
- URL: [https://github.com/hablapps/doric/actions/runs/3994766992](https://github.com/hablapps/doric/actions/runs/3994766992)
- on workflow_run:completed

Test summary report 📊

| Spark version | testing |
|---|---|
| 2.4.1 | 588 passed, 2 skipped |
| 2.4.2 | 588 passed, 2 skipped |
| 2.4.3 | 588 passed, 2 skipped |
| 2.4.4 | 588 passed, 2 skipped |
| 2.4.5 | 588 passed, 2 skipped |
| 2.4.6 | 589 passed, 2 skipped |
| 2.4.7 | 589 passed, 2 skipped |
| 2.4 | 589 passed, 2 skipped |
| 3.0.0 | 621 passed, 2 skipped |
| 3.0.1 | 621 passed, 2 skipped |
| 3.0.2 | 621 passed, 2 skipped |
| 3.0 | 621 passed, 2 skipped |
| 3.1.0 | 649 passed, 2 skipped |
| 3.1.1 | 649 passed, 2 skipped |
| 3.1.2 | 649 passed, 2 skipped |
| 3.1 | 649 passed, 2 skipped |
| 3.2.0 | 653 passed, 2 skipped |
| 3.2.1 | 653 passed, 2 skipped |
| 3.2 | 653 passed, 2 skipped |
| 3.3.0 | 653 passed, 2 skipped |
| 3.3 | 653 passed, 2 skipped |

codecov[bot] commented 1 year ago

Codecov Report

Merging #312 (a9308e7) into main (c78276a) will increase coverage by 0.07%. The diff coverage is 100.00%.

Additional details and impacted files

[![Impacted file tree graph](https://codecov.io/gh/hablapps/doric/pull/312/graphs/tree.svg?width=650&height=150&src=pr&token=N7ZXUXZX1I&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps)](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps)

```diff
@@            Coverage Diff             @@
##             main     #312      +/-   ##
==========================================
+ Coverage   97.35%   97.42%   +0.07%
==========================================
  Files          58       60       +2
  Lines        1134     1163      +29
  Branches       22       14       -8
==========================================
+ Hits         1104     1133      +29
  Misses         30       30
```

| Flag | Coverage Δ | |
|---|---|---|
| spark-2.4.x | `94.69% <0.00%> (-0.19%)` | :arrow_down: |
| spark-3.0.x | `96.48% <0.00%> (-0.18%)` | :arrow_down: |
| spark-3.1.x | `97.29% <0.00%> (-0.18%)` | :arrow_down: |
| spark-3.2.x | `97.53% <100.00%> (+0.06%)` | :arrow_up: |
| spark-3.3.x | `?` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps) | Coverage Δ | |
|---|---|---|
| [...c/main/scala/doric/syntax/AggregationColumns.scala](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps#diff-Y29yZS9zcmMvbWFpbi9zY2FsYS9kb3JpYy9zeW50YXgvQWdncmVnYXRpb25Db2x1bW5zLnNjYWxh) | `100.00% <ø> (ø)` | |
| [core/src/main/scala/doric/doric.scala](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps#diff-Y29yZS9zcmMvbWFpbi9zY2FsYS9kb3JpYy9kb3JpYy5zY2FsYQ==) | `94.74% <100.00%> (+0.62%)` | :arrow_up: |
| [...3.2\_3.3/scala/doric/sqlExpressions/CustomAgg.scala](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps#diff-Y29yZS9zcmMvbWFpbi9zcGFya18zLjJfMy4zL3NjYWxhL2RvcmljL3NxbEV4cHJlc3Npb25zL0N1c3RvbUFnZy5zY2FsYQ==) | `100.00% <100.00%> (ø)` | |
| [...\_3.3/scala/doric/syntax/AggregationColumns32.scala](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps#diff-Y29yZS9zcmMvbWFpbi9zcGFya18zLjJfMy4zL3NjYWxhL2RvcmljL3N5bnRheC9BZ2dyZWdhdGlvbkNvbHVtbnMzMi5zY2FsYQ==) | `100.00% <100.00%> (ø)` | |

------

[Continue to review full report at Codecov](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps).

> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps). Last update [c78276a...a9308e7](https://codecov.io/gh/hablapps/doric/pull/312?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=hablapps).