databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) can be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
https://databrickslabs.github.io/dbldatagen

Feature upgrade to Spark 3.2.1 #111

Closed ronanstokes-db closed 1 year ago

ronanstokes-db commented 1 year ago

Further comments

The supported version is now based on Databricks runtime 9.1 LTS or later. This may impact users of earlier runtime versions, but instructions are included for using an older version of the Databricks Data Generator in a notebook.
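For a notebook that may run on an older runtime, a guard like the one below could decide whether to fall back to an older release of the generator. This is only a sketch: the 3.2.1 minimum is taken from this PR's title, and `parse_version` and `meets_minimum` are hypothetical helper names, not part of `dbldatagen`.

```python
import re

# Assumption: minimum Spark version taken from the PR title ("Spark 3.2.1").
MIN_SPARK_VERSION = (3, 2, 1)


def parse_version(version):
    """Extract the leading numeric components, e.g. '3.2.1-SNAPSHOT' -> (3, 2, 1)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component (e.g. '3.2') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version, minimum=MIN_SPARK_VERSION):
    """Return True if the given version string is at least the minimum."""
    return parse_version(version) >= minimum
```

In a notebook this could be called as `meets_minimum(spark.version)`, where `spark` is the notebook's existing `SparkSession`; a `False` result would be the cue to follow the instructions for the older version.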

ronanstokes-db commented 1 year ago

ready for review

codecov[bot] commented 1 year ago

Codecov Report

Merging #111 (e8eb833) into master (109707e) will increase coverage by 2.78%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #111      +/-   ##
==========================================
+ Coverage   84.11%   86.90%   +2.78%     
==========================================
  Files          21       21              
  Lines        2134     2161      +27     
  Branches      365      367       +2     
==========================================
+ Hits         1795     1878      +83     
+ Misses        242      183      -59     
- Partials       97      100       +3     
| Impacted Files | Coverage Δ |
|---|---|
| dbldatagen/_version.py | 100.00% <100.00%> (ø) |
| dbldatagen/data_generator.py | 83.23% <100.00%> (+0.54%) :arrow_up: |
| dbldatagen/datagen_constants.py | 100.00% <100.00%> (ø) |
| dbldatagen/text_generators.py | 79.77% <0.00%> (+21.37%) :arrow_up: |
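The totals in the coverage diff can be sanity-checked by hand. The sketch below assumes the headline percentage is simply hits divided by lines (so partials count as not covered), which is consistent with the figures in the report; `coverage_pct` is a hypothetical helper, not a Codecov API.

```python
def coverage_pct(hits, lines):
    """Coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Figures copied from the Codecov report for master and for #111.
master = {"hits": 1795, "misses": 242, "partials": 97, "lines": 2134}
pr_111 = {"hits": 1878, "misses": 183, "partials": 100, "lines": 2161}

for label, row in (("master", master), ("#111", pr_111)):
    # Hits, misses, and partials should account for every line.
    assert row["hits"] + row["misses"] + row["partials"] == row["lines"]
    print(f"{label}: {coverage_pct(row['hits'], row['lines']):.2f}%")
# prints "master: 84.11%" and "#111: 86.90%"
```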


ronanstokes-db commented 1 year ago

I also needed to add new code coverage tests because the code coverage threshold check was failing. Either the code coverage tools now discount coverage for code in `__init__.py`, or the code coverage threshold has been raised.

ronanstokes-db commented 1 year ago

Do not commit directly, as other fixes need to be committed first.

lgtm-com[bot] commented 1 year ago

This pull request introduces 1 alert and fixes 2 when merging d3ebcf291c2cbee903bfa80c5f29ab258042c953 into 109707e9fdec3185688dce5a1d9a7f342cca069d - view on LGTM.com

new alerts:

fixed alerts:

Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine :gear: that powers LGTM.com. For more information, please check out our post on the GitHub blog.

lgtm-com[bot] commented 1 year ago

This pull request introduces 1 alert when merging 85478f875aae4fd9be2f6611e98e372e07546288 into 109707e9fdec3185688dce5a1d9a7f342cca069d - view on LGTM.com

new alerts:


ronanstokes-db commented 1 year ago

> please resolve git conflicts, otherwise lgtm

All conflicts have been resolved.

lgtm-com[bot] commented 1 year ago

This pull request introduces 1 alert when merging 3b04db4618870d2ca7532cbb831f72b3bf2b150f into 109707e9fdec3185688dce5a1d9a7f342cca069d - view on LGTM.com

new alerts:
