databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can be used to generate large simulated or synthetic data sets for tests, POCs, and other uses in Databricks environments, including Delta Live Tables pipelines.
https://databrickslabs.github.io/dbldatagen

Support generation of test data from spark schema with nested structures #296

Open adamski201 opened 4 months ago

adamski201 commented 4 months ago

From my understanding, automatic generation of test data directly from a Spark schema alone is currently possible, but only for simple (flat) structures. Maps, arrays, and structs are not supported.

Are there any plans to support generation of test data with more complex structures? Such a feature would be incredibly useful, as it would allow QAs to generate test data templates without any programming experience.
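
For illustration only, here is a minimal pure-Python sketch of what generating values from a nested schema could look like. It assumes the schema is given in the JSON form Spark uses to serialize a `StructType` (nested dicts with `fields`, `elementType`, `keyType`, and `valueType`). The `random_value` helper, the value ranges, and the sample schema are all hypothetical; none of this is part of `dbldatagen` or a description of how it works.

```python
import random
import string


def random_value(dtype, rng):
    """Return a random Python value for a Spark schema type in JSON form.

    Hypothetical helper for illustration; not a dbldatagen API.
    """
    # Primitive types are plain strings in Spark's JSON schema format.
    if isinstance(dtype, str):
        if dtype in ("byte", "short", "integer", "long"):
            return rng.randint(0, 100)
        if dtype in ("float", "double"):
            return rng.random()
        if dtype == "boolean":
            return rng.random() < 0.5
        if dtype == "string":
            return "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        raise ValueError(f"unsupported primitive type: {dtype}")

    # Complex types are dicts tagged with a "type" key; recurse into them.
    kind = dtype["type"]
    if kind == "struct":
        return {f["name"]: random_value(f["type"], rng) for f in dtype["fields"]}
    if kind == "array":
        size = rng.randint(1, 3)
        return [random_value(dtype["elementType"], rng) for _ in range(size)]
    if kind == "map":
        # Assumes a primitive (hashable) key type; colliding keys are merged.
        size = rng.randint(1, 3)
        return {
            random_value(dtype["keyType"], rng): random_value(dtype["valueType"], rng)
            for _ in range(size)
        }
    raise ValueError(f"unsupported complex type: {kind}")


# Sample schema with an array, a map, and a nested struct (made up for this sketch).
schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": False, "metadata": {}},
        {
            "name": "tags",
            "type": {"type": "array", "elementType": "string", "containsNull": True},
            "nullable": True,
            "metadata": {},
        },
        {
            "name": "attributes",
            "type": {
                "type": "map",
                "keyType": "string",
                "valueType": "integer",
                "valueContainsNull": True,
            },
            "nullable": True,
            "metadata": {},
        },
        {
            "name": "address",
            "type": {
                "type": "struct",
                "fields": [
                    {"name": "city", "type": "string", "nullable": True, "metadata": {}},
                    {"name": "zip", "type": "string", "nullable": True, "metadata": {}},
                ],
            },
            "nullable": True,
            "metadata": {},
        },
    ],
}

rng = random.Random(42)  # seeded so repeated runs give the same row
row = random_value(schema, rng)
print(row)
```

A schema like the one above is the kind of input a QA could write or export without programming, which is the use case this request has in mind. The sketch ignores nullability, value distributions, and Spark-scale generation.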