Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
When a column contains a SQL expression, if the SQL contains references to other columns that appear prior to the current column - adjust the build sequencing so that those columns are created first.
Current Behavior
You have to specify via baseColumn attribute in all cases
Improved behavior
If simple identifier parser detects valid SQL identifiers in the SQL expression and a) they are not inside a string, and b) they match an existing column name, use separate phases for generation of the column which references the other columns.
This will reduce the number of cases where it is necessary to explicitly reference the baseColumns
Expected Behavior
When a column contains a SQL expression, if the SQL contains references to other columns that appear prior to the current column - adjust the build sequencing so that those columns are created first.
Current Behavior
You have to specify via
baseColumn
attribute in all casesImproved behavior
If simple identifier parser detects valid SQL identifiers in the SQL expression and a) they are not inside a string, and b) they match an existing column name, use separate phases for generation of the column which references the other columns.
This will reduce the number of cases where it is necessary to explicitly reference the baseColumns