databrickslabs / remorph

Cross-compiler and Data Reconciler into Databricks Lakehouse

[FEATURE]: Create an Upgrade script to handle changes in newer versions #769

Closed vijaypavann-db closed 1 month ago

vijaypavann-db commented 2 months ago

Is there an existing issue for this?

Category of feature request

Reconcile

Problem statement

A new column, operation_name, has been added to the main table.
The reconciliation process fails to write data to the existing main table because of this schema change.

Please find the error below:

AnalysisException: A schema mismatch detected when writing to the Delta table (Table ID: ).
To enable schema migration using DataFrameWriter or DataStreamWriter, please set:
'.option("mergeSchema", "true")'.
For other operations, set the session configuration
spark.databricks.delta.schema.autoMerge.enabled to "true". See the documentation
specific to the operation for details.
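
As an alternative to enabling mergeSchema on every write, the missing column could be added to the existing table once, before reconcile writes to it. The following is a minimal sketch of that idea, not the remorph implementation. The table name `remorph.reconcile.main` and the STRING type for operation_name are assumptions, since neither is stated in this issue. The helpers only work out which columns are missing and build the Delta `ALTER TABLE ... ADD COLUMNS` statement; executing it (for example with `spark.sql(...)`) is left to the caller.

```python
from __future__ import annotations


def missing_columns(existing: list[str], expected: dict[str, str]) -> dict[str, str]:
    """Return the expected columns (name -> type) that the table does not have yet.

    Column names are compared case-insensitively.
    """
    existing_lower = {name.lower() for name in existing}
    return {
        name: dtype
        for name, dtype in expected.items()
        if name.lower() not in existing_lower
    }


def build_add_columns_sql(table: str, columns: dict[str, str]) -> str:
    """Build an ALTER TABLE statement that adds the given columns to a table."""
    if not columns:
        raise ValueError("no columns to add")
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return f"ALTER TABLE {table} ADD COLUMNS ({column_list})"


# Hypothetical values: the real table name and column type are not given in the issue.
TABLE = "remorph.reconcile.main"
EXPECTED = {"operation_name": "STRING"}

to_add = missing_columns(["recon_id", "source_type"], EXPECTED)
statement = build_add_columns_sql(TABLE, to_add)
```

Checking for missing columns first keeps the step safe to re-run: if operation_name already exists, nothing is added.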

Proposed Solution

To handle these scenarios, we need an upgrade script that runs during installation and contains the detailed steps for upgrading from one version to a newer one, for example from 0.4.0 to 0.4.1.
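
One possible shape for such a script is a registry of upgrade steps keyed by the version that introduced them, with the installer running every step between the installed and the target version. The sketch below illustrates only the ordering logic. All names in it (`upgrade_step`, `pending_upgrades`, `run_upgrades`, `add_operation_name_column`) are hypothetical and not part of remorph, and the step body is a placeholder that returns a description instead of touching a table.

```python
from typing import Callable, Dict, List, Tuple

Version = Tuple[int, ...]

# Hypothetical registry: version string -> upgrade step for that version.
UPGRADES: Dict[str, Callable[[], str]] = {}


def parse_version(text: str) -> Version:
    """Turn '0.4.1' into (0, 4, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def upgrade_step(version: str) -> Callable[[Callable[[], str]], Callable[[], str]]:
    """Decorator that registers a function as the upgrade step for a version."""
    def register(func: Callable[[], str]) -> Callable[[], str]:
        UPGRADES[version] = func
        return func
    return register


@upgrade_step("0.4.1")
def add_operation_name_column() -> str:
    # Placeholder: a real step would alter the main table here.
    return "add operation_name column to main table"


def pending_upgrades(installed: str, target: str) -> List[str]:
    """List versions whose steps must run, oldest first.

    A step runs if its version is newer than the installed version
    and not newer than the target version.
    """
    low, high = parse_version(installed), parse_version(target)
    versions = [v for v in UPGRADES if low < parse_version(v) <= high]
    return sorted(versions, key=parse_version)


def run_upgrades(installed: str, target: str) -> List[str]:
    """Run each pending step in order and collect what each one reports."""
    return [UPGRADES[version]() for version in pending_upgrades(installed, target)]
```

With this layout, upgrading from 0.4.0 to 0.4.1 runs only the 0.4.1 step, and re-running the installer at 0.4.1 runs nothing. Comparing parsed tuples rather than raw strings keeps the ordering correct once a version component reaches two digits.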

Additional Context

Check for ways to automate the upgrade while running reconcile. The changes so far include the schema and the workflow.