AbsaOSS / spline-spark-agent

Spline agent for Apache Spark
https://absaoss.github.io/spline/
Apache License 2.0

Spline support for expand operation #814

Open liujiawen opened 1 year ago

Background

The Spline agent does not track column-level lineage through Expand operations. As a result, if my SQL contains "grouping" or "with cube", the column lineage is lost after the Expand node.

Feature

Add an OperationNodeBuilder that handles the Spark Expand operation.

Example [Optional]

[image attached to the original issue]

Proposed Solution [Optional]

Solution Ideas

  1. I added an ExpandNodeBuilder class in the package za.co.absa.spline.harvester.builder.plan. Its code is almost identical to that of UnionNodeBuilder.
  2. I modified the OperationNodeBuilderFactory class so that it handles Expand operators.
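
To illustrate the idea behind the two steps, here is a minimal, self-contained sketch. It is a toy model and not the actual Spline or Spark code: the types Expr, AttrRef, Literal, Plan, Relation, Expand and the helper lineageFor are all hypothetical stand-ins. It only assumes that an Expand node carries several projections with one expression per output column, so each output column can be traced to the input attributes referenced at the same position in every projection, which is the same positional matching a Union needs.

```scala
object ExpandLineageSketch {
  // Hypothetical expression model: an expression knows which input attributes it references.
  sealed trait Expr { def refs: Set[String] }
  final case class AttrRef(name: String) extends Expr { val refs: Set[String] = Set(name) }
  final case class Literal(value: Any) extends Expr { val refs: Set[String] = Set.empty }

  // Hypothetical plan model. Expand mirrors the shape assumed above:
  // every projection yields exactly one expression per output column.
  sealed trait Plan
  final case class Relation(columns: Seq[String]) extends Plan
  final case class Expand(projections: Seq[Seq[Expr]], output: Seq[String], child: Plan) extends Plan

  /** For each output column, collect the input attributes referenced
    * at that position across all projections. */
  def columnDependencies(expand: Expand): Map[String, Set[String]] = {
    require(
      expand.projections.forall(_.length == expand.output.length),
      "each projection must produce one expression per output column")
    // Transpose row-wise projections into per-column groups, like the inputs of a Union.
    val byPosition: Seq[Seq[Expr]] = expand.projections.transpose
    expand.output.zip(byPosition).map { case (out, exprs) =>
      out -> exprs.flatMap(_.refs).toSet
    }.toMap
  }

  /** Stand-in for the factory dispatch: Expand gets dedicated handling,
    * everything else is out of scope for this sketch. */
  def lineageFor(plan: Plan): Option[Map[String, Set[String]]] = plan match {
    case e: Expand => Some(columnDependencies(e))
    case _         => None
  }

  // A cube-like Expand over two columns: each row is emitted once per grouping combination.
  val cubeLike: Expand = Expand(
    projections = Seq(
      Seq(AttrRef("a"), AttrRef("b"), Literal(0)),
      Seq(AttrRef("a"), Literal(null), Literal(1)),
      Seq(Literal(null), AttrRef("b"), Literal(2)),
      Seq(Literal(null), Literal(null), Literal(3))),
    output = Seq("a_out", "b_out", "grouping_id"),
    child = Relation(Seq("a", "b")))
}
```

With this model, ExpandLineageSketch.lineageFor(ExpandLineageSketch.cubeLike) links a_out to a and b_out to b, while grouping_id has no input dependency because it is built from literals only. A real builder would produce Spline attributes and expressions instead of plain strings.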