databrickslabs / dlt-meta

Metadata driven Databricks Delta Live Tables framework for bronze/silver pipelines
https://databrickslabs.github.io/dlt-meta/

How to add metadata column in onboarding file for bronze table #100

Closed UlrichMelende closed 1 day ago

UlrichMelende commented 3 days ago

Hello,

I want to load data from a CSV file into a bronze table using dlt-meta. My question: is it possible to add a metadata column that is not in the CSV source file by declaring it in the onboarding file? If so, how?

ravi-databricks commented 3 days ago

There are two ways:

  1. Use file metadata as shown here; refer to the docs.

For the cloudFiles option, _metadata columns are added through the source_metadata tag, which has the following attributes. The include_autoloader_metadata_column flag (True or False) adds the _metadata column to the target bronze dataframe. If autoloader_metadata_col_name is provided, _metadata is renamed to that value; otherwise the default name is source_metadata. The select_metadata_cols:{key:value} map extracts columns from _metadata: each key is a target dataframe column name, and each value is the expression used to derive that column from _metadata.

  2. Use custom transformations, as shown here.
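
As a rough illustration of the two options above, here is a minimal Python sketch. For option 1 it builds a source_metadata fragment using the attribute names from this reply; the values and the two target column names (input_file_name, input_file_path) are illustrative assumptions. For option 2 it defines a hypothetical transformation function, add_audit_columns, that adds columns absent from the CSV source; the column names and the 'csv_landing' value are made up for the example. The wiring shown in the final comment is an assumption about the dlt-meta entry point, so check the parameter name against the docs for your version.

```python
import json

# Option 1: "source_metadata" fragment for a cloudFiles (Auto Loader) source.
# Attribute names are taken from the reply above; the values and the target
# column names below are illustrative assumptions.
source_metadata = {
    "include_autoloader_metadata_column": "True",
    "autoloader_metadata_col_name": "source_metadata",
    "select_metadata_cols": {
        # key: target column name, value: expression over the _metadata column
        "input_file_name": "_metadata.file_name",
        "input_file_path": "_metadata.file_path",
    },
}

# JSON text that would be merged into the dataflow's entry in the onboarding file.
onboarding_fragment = json.dumps({"source_metadata": source_metadata}, indent=2)
print(onboarding_fragment)


# Option 2: a custom transformation applied to the bronze dataframe.
# The function name and both added columns are hypothetical.
def add_audit_columns(input_df):
    """Return input_df (a Spark DataFrame) plus two columns not present in the CSV."""
    return input_df.selectExpr(
        "*",                                      # keep every source column
        "current_timestamp() AS load_timestamp",  # when the row was ingested
        "'csv_landing' AS source_system",         # constant label for the source
    )


# Assumed wiring inside the DLT notebook (verify the parameter name in the docs):
# DataflowPipeline.invoke_dlt_pipeline(
#     spark, layer, bronze_custom_transform_func=add_audit_columns
# )
```

The function only calls the dataframe's selectExpr method, so it needs no extra imports and can be checked in isolation before being attached to a pipeline.
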
UlrichMelende commented 1 day ago

Thank you, option 2 works very well.