databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
https://databrickslabs.github.io/dbldatagen

updates for test speed improvements #125

Closed: ronanstokes-db closed this 1 year ago

ronanstokes-db commented 1 year ago

Proposed changes

Reduces the runtime of the GitHub unit tests by 40-50%.

Changes:

Further comments

The unit test speedup comes from taking advantage of pytest-specific features.
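The description does not say which pytest features are involved. As a minimal sketch of one plausible candidate, the example below uses a session-scoped fixture so that an expensive resource is built once per test run, plus parametrization to cover several cases in one test function. `FakeSession`, `session`, and `test_row_count` are hypothetical names for illustration and are not taken from the dbldatagen test suite.

```python
import pytest


class FakeSession:
    """Hypothetical stand-in for an expensive resource such as a local Spark session.

    With PySpark, the fixture below would typically return something like
    SparkSession.builder.master("local[*]").getOrCreate() instead (assumption).
    """

    created = 0  # counts how many times a session has been constructed

    def __init__(self):
        FakeSession.created += 1

    def range(self, n):
        # Mimics producing n rows of data.
        return list(range(n))


@pytest.fixture(scope="session")
def session():
    # scope="session" asks pytest to build this object once per test run
    # and hand the same instance to every test that requests it.
    return FakeSession()


@pytest.mark.parametrize("rows", [10, 100, 1000])
def test_row_count(session, rows):
    # Each parameter value becomes its own test case, all sharing one session.
    assert len(session.range(rows)) == rows
```

With the default function scope, the fixture would be rebuilt for every test case; widening the scope trades some test isolation for shorter setup time, which matters when setup dominates the runtime.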

lgtm-com[bot] commented 1 year ago

This pull request introduces 1 alert when merging 790b694f18d31e1dd32c8d5cb4f40158dd446841 into 23a83f5ef768a55f9da8827876f763caafdb672b - view on LGTM.com

new alerts:

codecov[bot] commented 1 year ago

Codecov Report

Merging #125 (95f9cec) into master (d6b1799) will increase coverage by 0.10%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #125      +/-   ##
==========================================
+ Coverage   84.00%   84.11%   +0.10%     
==========================================
  Files          21       21              
  Lines        2132     2134       +2     
  Branches      365      365              
==========================================
+ Hits         1791     1795       +4     
+ Misses        243      242       -1     
+ Partials       98       97       -1     

| Impacted Files | Coverage Δ |
|---|---|
| dbldatagen/data_generator.py | 82.68% <100.00%> (+0.37%) :arrow_up: |
| dbldatagen/spark_singleton.py | 100.00% <100.00%> (ø) |


lgtm-com[bot] commented 1 year ago

This pull request introduces 1 alert when merging c0fad662d0e54149e286387b6acdb444a9470530 into 23a83f5ef768a55f9da8827876f763caafdb672b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 1 year ago

This pull request introduces 1 alert when merging 3e97ab5c44665c95e0d431f4599cd5d7880a425b into 23a83f5ef768a55f9da8827876f763caafdb672b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 1 year ago

This pull request introduces 5 alerts when merging 5968b503abfd6600fbeaaf7c39d890a0d7e7c478 into 23a83f5ef768a55f9da8827876f763caafdb672b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 1 year ago

This pull request fixes 1 alert when merging e1262447df639b20467e23559c77bb4ef221520d into d6b1799ecf5bcb3be9ac14d3366454da427f52c1 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 1 year ago

This pull request fixes 1 alert when merging 82845d34ade0927283a656b9511723f4ad6bfe63 into d6b1799ecf5bcb3be9ac14d3366454da427f52c1 - view on LGTM.com

fixed alerts:

ronanstokes-db commented 1 year ago

> overall looks more or less ok. But I'm not sure if we need SparkSingleton at all...

It's existing functionality, not new functionality introduced by this PR. The goal of the PR is to speed up unit tests, not to change APIs.

I looked at pyspark-test, and it does not address the need to run tests efficiently irrespective of the GitHub runner configuration.

lgtm-com[bot] commented 1 year ago

This pull request fixes 1 alert when merging d4020425e5562e9b16472dd37a8186240a4528b2 into d6b1799ecf5bcb3be9ac14d3366454da427f52c1 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 1 year ago

This pull request fixes 1 alert when merging 80121717f50d972dfd3b7df74af7c6613c4673af into d6b1799ecf5bcb3be9ac14d3366454da427f52c1 - view on LGTM.com

fixed alerts:

lgtm-com[bot] commented 1 year ago

This pull request fixes 1 alert when merging 95f9cecc18db192106a18cdac70ef920d8947d41 into d6b1799ecf5bcb3be9ac14d3366454da427f52c1 - view on LGTM.com

fixed alerts: