capitalone / DataProfiler

What's in your data? Extract schema, statistics and entities from datasets
https://capitalone.github.io/DataProfiler
Apache License 2.0
1.42k stars 158 forks

Refactor `_assimilate_histogram` and `_regenerate_histogram` #1022

Closed: junholee6a closed this 11 months ago

junholee6a commented 1 year ago

Issue: https://github.com/capitalone/DataProfiler/issues/820

The functions `_assimilate_histogram` and `_regenerate_histogram` are refactored as directed in the issue. Additionally, `_assimilate_histogram` no longer rounds bin edges: the rounding doesn't seem necessary, and it required an additional argument to `_assimilate_histogram`.

junholee6a commented 1 year ago

I didn't add unit tests for these functions here because 1) the functions had no direct unit tests in the first place, so code coverage won't change, and 2) `_assimilate_histogram` will be rewritten in issue https://github.com/capitalone/DataProfiler/issues/1017.

junholee6a commented 1 year ago

Seems like there was a mypy error in the Python 3.8 tests. Will look into that

taylorfturner commented 1 year ago

> Seems like there was a mypy error in the Python 3.8 tests. Will look into that

thanks!

taylorfturner commented 11 months ago

Had to rebuild some branches after the 0.10.5 release. Please re-open this PR and change its base to `dev` (i.e., the branch you are merging into), @junholee6a.