starlake-ai / starlake

Declarative, text-based tool for data analysts and engineers to extract, load, transform, and orchestrate their data pipelines.
http://starlake.ai/
Apache License 2.0
57 stars, 22 forks

feat: make comet_input_file_name in bigquery native load as precise as in spark when files are grouped #940

Closed tiboun closed 5 months ago

tiboun commented 5 months ago

When SL_GROUPED is set to true, the BigQuery native load sets comet_input_file_name to the list of all files in the group, because all files are loaded into the same target table in one operation. This differs from the Spark load, where each row is linked to the original source file it came from.

This PR aims to close that gap between the Spark and BigQuery native loads, so that comet_input_file_name is as precise in both.
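
To illustrate the difference described above, here is a minimal sketch in plain Python. It does not call Starlake, Spark, or BigQuery. The function names, the sample paths, and the comma-separated format of the file list are assumptions made for illustration only, not the actual implementation in this PR.

```python
from typing import Dict, List

COLUMN = "comet_input_file_name"


def load_grouped_with_file_list(files: Dict[str, List[dict]]) -> List[dict]:
    """Hypothetical model of the old grouped behavior:
    every row is tagged with the whole list of loaded files."""
    # Assumption: the list is rendered as a comma-separated string.
    all_files = ",".join(sorted(files))
    return [{**row, COLUMN: all_files} for rows in files.values() for row in rows]


def load_grouped_per_file(files: Dict[str, List[dict]]) -> List[dict]:
    """Hypothetical model of the Spark-like behavior:
    every row is tagged with the single file it was read from."""
    return [{**row, COLUMN: path} for path, rows in files.items() for row in rows]


# Hypothetical input: two files loaded together into the same target table.
files = {
    "/incoming/sales_1.csv": [{"id": 1}, {"id": 2}],
    "/incoming/sales_2.csv": [{"id": 3}],
}

for row in load_grouped_with_file_list(files):
    print("file list:", row)

for row in load_grouped_per_file(files):
    print("per file: ", row)
```

In the first variant a row cannot be traced back to a single file. In the second, each row carries exactly one source path, which is the precision this PR targets for the grouped BigQuery native load.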