Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
Creating a regular dataset with a text column throws an error. Other column types work fine.
I suspect AWS EMR Serverless uses a newer version of numpy by default, and that version is not compatible with dbldatagen.
Expected Behavior
Generating a dataset with a text column should work without error.
Current Behavior
Getting the following error
Steps to Reproduce (for bugs)
Install dbldatagen using
pip install dbldatagen
Generate a custom dataset with a text generator column
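A minimal sketch of what the second step might look like. The report does not say which text generator was used, so `ILText` is an assumption here, as are the function name `build_text_dataset`, the column names, and the row count. It expects an existing SparkSession to be passed in.

```python
def build_text_dataset(spark, rows=1000):
    """Build a small DataFrame with an id column and a generated text column."""
    # Imported lazily so this file can be loaded where dbldatagen is absent.
    import dbldatagen as dg

    spec = (
        dg.DataGenerator(sparkSession=spark, name="text_repro", rows=rows, partitions=2)
        .withColumn("id", "long", minValue=1, maxValue=rows)
        # Assumption: ILText stands in for the unspecified text generator.
        .withColumn("body", "string", text=dg.ILText(paragraphs=(1, 2), sentences=(2, 4)))
    )
    return spec.build()


# Usage (requires an active SparkSession named `spark`):
# build_text_dataset(spark).show(5, truncate=False)
```

If the error is specific to text columns, dropping the `body` column from the spec should let the build succeed, which matches the observation that other column types work.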
Context
Creating a regular dataset with a text column throws an error. Other column types work fine. I suspect AWS EMR Serverless uses a newer version of numpy by default, and that version is not compatible with dbldatagen.
Your Environment
dbldatagen version used: 0.4.0
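Since the suspected cause is the numpy version, it may help to record the versions actually installed in the failing environment. The snippet below is a small sketch using only the standard library; the helper name `installed_versions` and the list of packages checked are my own choices, not part of dbldatagen.

```python
from importlib import metadata


def installed_versions(packages=("dbldatagen", "numpy", "pyspark")):
    """Return a mapping of package name to installed version string."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not present in this environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Running this inside the EMR Serverless job and pasting the output here would confirm or rule out the numpy hypothesis.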