databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
https://databrickslabs.github.io/dbldatagen
License: Other
302 stars, 57 forks

Feature text generation extensions #214

Open ronanstokes-db opened 1 year ago

ronanstokes-db commented 1 year ago

Proposed changes

Extend text generation with enhancements:

Types of changes

What types of changes does your code introduce to dbldatagen? Put an x in the boxes that apply

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

Further comments

If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc...

codecov[bot] commented 1 year ago

Codecov Report

Attention: 26 lines in your changes are missing coverage. Please review.

Comparison is base (1c8b340) 92.19% compared to head (6e41061) 91.59%. Report is 3 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| dbldatagen/text_generatestring.py | 65.33% | 23 Missing and 3 partials :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #214      +/-   ##
==========================================
- Coverage   92.19%   91.59%   -0.61%
==========================================
  Files          23       24       +1
  Lines        2754     2842      +88
  Branches      471      487      +16
==========================================
+ Hits         2539     2603      +64
- Misses        128      151      +23
- Partials       87       88       +1
```
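As a sanity check on the figures above, the coverage percentages can be recomputed from the hit, miss, and partial counts. This is a minimal sketch that assumes coverage is hits divided by (hits + misses + partials), which agrees with the Lines totals in the diff. The helper name `coverage_pct` is illustrative and is not part of Codecov or dbldatagen.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Return line coverage as a percentage, treating partials as not covered."""
    total = hits + misses + partials
    return 100.0 * hits / total


# Counts copied from the Codecov diff: base is master, head is this PR (#214).
base = coverage_pct(hits=2539, misses=128, partials=87)
head = coverage_pct(hits=2603, misses=151, partials=88)

print(f"base:   {base:.2f}%")
print(f"head:   {head:.2f}%")
print(f"change: {head - base:+.2f}%")
```

The recomputed change may differ from the reported -0.61% in the last digit, depending on how the tool rounds or truncates intermediate values.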

CLAassistant commented 9 months ago

CLA assistant check
All committers have signed the CLA.