databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including Delta Live Tables pipelines.
https://databrickslabs.github.io/dbldatagen

Doesn't work with Databricks Serverless Instances. #295

Open danielp-db opened 2 months ago

danielp-db commented 2 months ago

Expected Behavior

`dbldatagen` should be able to run on Databricks Serverless instances, since `spark.sql.execution.arrow.pyspark.enabled` is set by default.

Current Behavior

The module fails because it checks the value of this configuration. (Image attached in the original issue.)
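The screenshot is not reproduced here, so the exact error is unknown. Assuming the failure comes from the configuration lookup itself raising on Serverless, one possible workaround is to wrap the read and fall back to a default. The sketch below is not `dbldatagen`'s actual code: `get_conf_or_default`, `RestrictedConf`, and `DictConf` are hypothetical names, and the two small classes only stand in for a Spark runtime config object that exposes `get(key)`.

```python
ARROW_KEY = "spark.sql.execution.arrow.pyspark.enabled"


def get_conf_or_default(conf, key, default=None):
    """Return conf.get(key); fall back to `default` if the lookup raises.

    `conf` is any object with a get(key) method (hypothetical helper,
    not part of dbldatagen or PySpark).
    """
    try:
        value = conf.get(key)
    except Exception:
        # Assumption: a restricted runtime rejects the read with an exception.
        return default
    return default if value is None else value


class RestrictedConf:
    """Hypothetical stand-in for a runtime that refuses config reads."""

    def get(self, key):
        raise RuntimeError(f"Configuration {key} is not available")


class DictConf:
    """Hypothetical stand-in for a runtime where the key is readable."""

    def __init__(self, values):
        self._values = values

    def get(self, key):
        return self._values[key]


# On the restricted runtime the default is used instead of failing.
arrow_restricted = get_conf_or_default(RestrictedConf(), ARROW_KEY, "true")

# On a runtime that allows the read, the configured value is returned.
arrow_classic = get_conf_or_default(DictConf({ARROW_KEY: "false"}), ARROW_KEY, "true")
```

Catching a broad `Exception` keeps the sketch independent of any particular Spark error class; a real fix would likely narrow this to the specific exception the runtime raises.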

Steps to Reproduce (for bugs)

1. Connect a notebook to a Databricks Serverless Instance.
2. Import `dbldatagen`.
3. Try creating a spec with `DataGenerator`.

Context

N/A

Your Environment

Databricks Serverless Instance

ronanstokes-db commented 2 months ago

Thanks Daniel - this is a currently open issue and will be fixed in a hotfix or the next version.

ronanstokes-db commented 2 months ago

See #292