databrickslabs / dlt-meta

Metadata driven Databricks Delta Live Tables framework for bronze/silver pipelines
https://databrickslabs.github.io/dlt-meta/

Metadata Information in Bronze append #121

Open DivyanshuSati007 opened 6 days ago

DivyanshuSati007 commented 6 days ago

I have a scenario where I insert data into a bronze table with the metadata columns _metadata.file_name and _metadata.file_path. I now have to append historic data to this bronze table, but I am not able to fetch _metadata.file_path and modification_file_name. Is it possible to do so? If yes, please share the JSON.

E.g., the JSON below fails when we run it. Is it possible to include "source_metadata": { } in bronze_append_flows, as shown in the code?

"bronze_append_flows": [
  {
    "name": "customer_bronze_flow",
    "create_streaming_table": false,
    "source_format": "cloudFiles",
    "source_details": {
      "source_path_it": "{dbfs_path}/integration_tests/resources/data/customers_af",
      "source_schema_path": "{dbfs_path}/integration_tests/resources/customers.ddl",

Will this part work or not?


      "source_metadata": {
        "include_autoloader_metadata_column": "True",
        "autoloader_metadata_col_name": "source_metadata",
        "select_metadata_cols": {
          "input_file_name": "_metadata.file_name",
          "input_file_path": "_metadata.file_path"
        }
      },
    "reader_options": {
      "cloudFiles.format": "json",
      "cloudFiles.inferColumnTypes": "true",
      "cloudFiles.rescuedDataColumn": "_rescued_data"
    },
    "once": false
  }
],

ravi-databricks commented 21 hours ago

We need to add support for metadata columns! I will add support for this in an upcoming release.
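
For readers following the thread, here is a minimal sketch of what the "source_metadata" block in the question appears to request. It is not dlt-meta's implementation. The helper name `metadata_select_exprs` is hypothetical, and the interpretation of each key (a flag that exposes the whole `_metadata` struct under an alias, plus a mapping of output column names to `_metadata` fields) is an assumption based only on the key names in the JSON above.

```python
import json

# Fragment copied from the "source_metadata" block quoted in the question.
SOURCE_METADATA_JSON = """
{
  "include_autoloader_metadata_column": "True",
  "autoloader_metadata_col_name": "source_metadata",
  "select_metadata_cols": {
    "input_file_name": "_metadata.file_name",
    "input_file_path": "_metadata.file_path"
  }
}
"""


def metadata_select_exprs(source_metadata: dict) -> list[str]:
    """Hypothetical helper: build SQL-style select expressions from the config.

    Assumption: the flag exposes the whole `_metadata` struct under the given
    alias, and each select_metadata_cols entry maps output name -> source field.
    """
    exprs = []
    flag = str(source_metadata.get("include_autoloader_metadata_column", "False"))
    if flag.strip().lower() == "true":
        alias = source_metadata.get("autoloader_metadata_col_name", "source_metadata")
        exprs.append(f"_metadata AS {alias}")
    for target, source in source_metadata.get("select_metadata_cols", {}).items():
        exprs.append(f"{source} AS {target}")
    return exprs


source_metadata = json.loads(SOURCE_METADATA_JSON)
select_exprs = metadata_select_exprs(source_metadata)
```

Outside of dlt-meta, expressions like these could be passed to a PySpark call such as `df.selectExpr("*", *select_exprs)` on a file-based source, where the hidden `_metadata` column is available. Whether bronze_append_flows accepts "source_metadata" depends on the release mentioned above.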