Closed junholee6a closed 8 months ago
Taking a gander at these today -- thanks for the patience, @junholee6a!
@junholee6a just one comment on this from @ksneab7. We opted for this PR over the alternative (closed now). Let us know if you have any questions.
We are trying to release 0.10.8
soon and I'd like this to be part of it -- so the sooner the better on this PR. Thanks!
Thanks @junholee6a -- merged
NOTE: This is an alternative of PR https://github.com/capitalone/DataProfiler/pull/1073. If this is merged, then close PR https://github.com/capitalone/DataProfiler/pull/1073
Issue: https://github.com/capitalone/DataProfiler/issues/820
This is a necessary step to resolving issue https://github.com/capitalone/DataProfiler/issues/820. Previously,
_assimilate_histogram()
calledself
to decide whether the given histogram contained integers or floats, and rounded the bins for histograms that only contained integers._assimilate_histogram()
is called in exactly two places:_regenerate_histogram()
and_add_helper_merge_profile_histograms()
. To remove the dependency onself
, we can move the code for rounding bin edges to_regenerate_histograms()
and add an argument indicating whether the given histogram contains integers or floats.While
_regenerate_histogram()
should behave exactly the same as before, the histogram loss calculation in_add_helper_merge_profile_histograms()
may change slightly since we no longer round its histogram bins.