databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
https://databrickslabs.github.io/dbldatagen

Support generation of test data from spark schema with nested structures #296

Open adamski201 opened 1 month ago

adamski201 commented 1 month ago

From my understanding, automatic generation of test data directly from a Spark schema is currently possible only for simple (flat) structures. Maps, arrays, and structs are not supported.

Are there any plans to support generating test data for more complex structures? Such a feature would be incredibly useful, as it would allow QAs to generate test data templates without any programming experience.
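To illustrate the kind of generation being asked for, here is a minimal sketch in plain Python. It is not dbldatagen code and does not use any dbldatagen API. It assumes the schema is available as a dict in the JSON layout that PySpark's `StructType.jsonValue()` produces; the helper name `random_value` and the example schema are hypothetical, and only a few primitive types are handled.

```python
import random
import string


def random_value(dtype, rng):
    """Build one random value for a Spark JSON schema node (hypothetical helper)."""
    if isinstance(dtype, dict):
        kind = dtype["type"]
        if kind == "struct":
            # Recurse into every field of the struct.
            return {f["name"]: random_value(f["type"], rng) for f in dtype["fields"]}
        if kind == "array":
            # Arbitrary choice for this sketch: 0 to 3 elements per array.
            return [random_value(dtype["elementType"], rng)
                    for _ in range(rng.randint(0, 3))]
        if kind == "map":
            # Assumes primitive (hashable) key types.
            return {random_value(dtype["keyType"], rng):
                    random_value(dtype["valueType"], rng)
                    for _ in range(rng.randint(0, 3))}
        raise ValueError(f"unsupported complex type: {kind}")
    if dtype in ("integer", "long"):
        return rng.randint(0, 1000)
    if dtype in ("float", "double"):
        return rng.random()
    if dtype == "boolean":
        return rng.random() < 0.5
    if dtype == "string":
        return "".join(rng.choices(string.ascii_lowercase, k=8))
    raise ValueError(f"unsupported type: {dtype}")


# Example nested schema (made up for illustration).
schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": False, "metadata": {}},
        {"name": "tags", "nullable": True, "metadata": {},
         "type": {"type": "array", "elementType": "string", "containsNull": True}},
        {"name": "address", "nullable": True, "metadata": {},
         "type": {"type": "struct", "fields": [
             {"name": "city", "type": "string", "nullable": True, "metadata": {}},
             {"name": "zip", "type": "integer", "nullable": True, "metadata": {}},
         ]}},
        {"name": "attrs", "nullable": True, "metadata": {},
         "type": {"type": "map", "keyType": "string", "valueType": "integer",
                  "valueContainsNull": True}},
    ],
}

row = random_value(schema, random.Random(42))
print(row)
```

A real implementation would presumably also need to handle nullability, value distributions, and generation at Spark scale, none of which this sketch attempts.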