databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
https://databrickslabs.github.io/dbldatagen
Other
302 stars, 57 forks

Add support for constraints #224

Closed (ronanstokes-db closed this issue 1 month ago)

ronanstokes-db commented 1 year ago

Expected Behavior

Allow constraints on generated data to be specified through a `withConstraint` method on the `DataGenerator` class.

Current Behavior

Constraints can currently be added only by applying `where` clauses to the generated data set. A `withConstraint` method would separate the specification of a constraint from its implementation, which leaves room for better implementations in the future.

It would also simplify the management of constraints and bring the constraint mechanism closer to Delta Live Tables expectations.
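To make the distinction concrete, here is a minimal pure-Python sketch of the two styles. It does not use `dbldatagen` or Spark: `ToyGenerator`, `Constraint`, the `id` and `value` columns, and the constraint name are all hypothetical stand-ins invented for illustration, and the filter-after-build strategy is only one possible implementation behind the declared constraint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, int]


@dataclass(frozen=True)
class Constraint:
    """Hypothetical declarative constraint: a name plus a row predicate."""
    name: str
    predicate: Callable[[Row], bool]


@dataclass
class ToyGenerator:
    """Hypothetical stand-in for a data generator (NOT the dbldatagen API)."""
    rows: int
    constraints: List[Constraint] = field(default_factory=list)

    def withConstraint(self, constraint: Constraint) -> "ToyGenerator":
        # Only record the constraint here; how it is enforced is decided in build().
        self.constraints.append(constraint)
        return self

    def build(self) -> List[Row]:
        # Deterministic toy data so the example is reproducible.
        data = [{"id": i, "value": (i * 7) % 10} for i in range(self.rows)]
        # Simplest enforcement: drop rows that violate any declared constraint.
        # A smarter implementation could be swapped in without changing callers.
        return [r for r in data if all(c.predicate(r) for c in self.constraints)]


# Style 1: the caller filters after generation (analogous to a where clause).
unconstrained = ToyGenerator(rows=10).build()
filtered_after = [r for r in unconstrained if r["value"] >= 5]

# Style 2: the constraint is declared on the generator itself.
declared = (
    ToyGenerator(rows=10)
    .withConstraint(Constraint("value_at_least_5", lambda r: r["value"] >= 5))
    .build()
)

# Both styles yield the same rows; only the place where the rule lives differs.
assert filtered_after == declared
```

In the second style the generator owns the list of named constraints, so they can be inspected, reported on, or enforced differently later, which is the management benefit the request describes.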

Steps to Reproduce (for bugs)

New feature - not a bug

Context

Your Environment