activeviam / par-student-spark-atoti

Project with students from CentraleSupelec to explore Spark API in order to power atoti with Spark

Aggregation API #4

Closed OPeyrusse closed 2 years ago

OPeyrusse commented 2 years ago

To complete the basic capabilities of the API developed for this prototype, this PR adds the aggregation part. It resembles a SQL query with a GROUP BY clause and some aggregation functions. Compare the following Java query with its equivalent SQL:

AggregateQuery.aggregate(
       dataframe,
       List.of("id"),
       List.of(new Count("c"), new Sum("s", "value")),
       new EqualCondition("label", "a"));

which can be translated to SQL as

SELECT id, COUNT(*) AS c, SUM(value) AS s
FROM <table>
WHERE label = 'a'
GROUP BY id
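
To make the intended semantics concrete, here is a minimal plain-Java sketch of what the query above computes on a small in-memory list. It deliberately does not use Spark or the classes of this PR: `AggregateSketch`, `Row`, `Agg` and the sample rows are hypothetical names and data introduced only for illustration, and the group-by column, filter column and aggregations are hard-coded to match the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AggregateSketch {
    /** One input row; the columns mirror those used in the query above (hypothetical type). */
    record Row(int id, String label, double value) {}

    /** Aggregated values for one group: c = COUNT(*), s = SUM(value) (hypothetical type). */
    record Agg(long c, double s) {}

    /** Filters rows on label, groups them by id, then counts rows and sums value per group. */
    static Map<Integer, Agg> aggregate(List<Row> rows, String label) {
        Map<Integer, Agg> result = new LinkedHashMap<>();
        for (Row row : rows) {
            if (!row.label().equals(label)) {
                continue; // WHERE label = ...
            }
            // GROUP BY id: combine this row with the running aggregate of its group.
            result.merge(
                row.id(),
                new Agg(1, row.value()),
                (a, b) -> new Agg(a.c() + b.c(), a.s() + b.s()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the project.
        List<Row> rows = List.of(
            new Row(1, "a", 10.0),
            new Row(1, "a", 5.0),
            new Row(1, "b", 100.0),
            new Row(2, "a", 2.5));
        System.out.println(aggregate(rows, "a"));
    }
}
```

With this sample data, the row labelled "b" is dropped by the filter, so group 1 aggregates two rows and group 2 aggregates one.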

If the definition is clear enough to you, I will integrate it.


You can safely ignore commit 03e6853aa5b36dc5932fdb812345a1809c357ae6, which only formats existing classes.

arnaudframmery commented 2 years ago

Because we need conditions in the aggregate function, I am going to rebase this branch on list-query.

OPeyrusse commented 2 years ago

I'll let you deal with the merge conflicts before merging.