databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
https://databrickslabs.github.io/dbldatagen

Feature standard datasets #269

Closed ghanse closed 4 months ago

ghanse commented 5 months ago

Proposed changes

Added a few standard datasets:

Types of changes

What types of changes does your code introduce to dbldatagen? Put an x in the boxes that apply.

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

Further comments

I have added several standard datasets along with unit tests where appropriate.

CLAassistant commented 5 months ago

CLA assistant check
All committers have signed the CLA.

ronanstokes-db commented 4 months ago

I'll review this and provide feedback after the initial merge of the feature.