capitalone / DataProfiler

What's in your data? Extract schema, statistics and entities from datasets
https://capitalone.github.io/DataProfiler
Apache License 2.0

feat: compute data type profile diff #1077

Closed scottiegarcia closed 9 months ago

scottiegarcia commented 9 months ago

Details

Profile differences weren't being computed for data_type_stats profiles. I discovered this because PSI (population stability index) was only being computed for the categorical profiler. This PR adds a line to compute the diff for data_type_stats as well.
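For readers unfamiliar with the metric, here is a minimal sketch of how a population stability index can be computed from two binned distributions. It is a generic illustration and not DataProfiler's implementation: the function name `population_stability_index`, the `eps` floor, and the assumption that both inputs share the same bins are all hypothetical choices made for this example.

```python
import math


def population_stability_index(expected_counts, actual_counts, eps=1e-4):
    """Generic PSI over two histograms that share the same bins.

    Hypothetical helper for illustration only; not part of DataProfiler.
    `eps` floors each bin proportion so empty bins do not produce log(0).
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both histograms must use the same bins")

    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total == 0 or actual_total == 0:
        raise ValueError("histograms must not be empty")

    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Convert raw counts to proportions, floored at eps.
        expected_pct = max(expected / expected_total, eps)
        actual_pct = max(actual / actual_total, eps)
        # Each term is non-negative and is zero when the proportions match.
        psi += (actual_pct - expected_pct) * math.log(actual_pct / expected_pct)
    return psi


if __name__ == "__main__":
    print(population_stability_index([10, 20, 30], [10, 20, 30]))
    print(population_stability_index([50, 50], [90, 10]))
```

Identical distributions give a PSI of zero, and the value grows as the two distributions drift apart, which is why a missing PSI in the diff output was a visible symptom here.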

Additionally, dataprofiler/profilers/text_column_profile.py was using the numeric stats mixin's diff only to double-check that the data types matched. I've updated it to use the BaseColumnProfiler diff instead.
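To make the intent of that change concrete, here is a rough sketch of the pattern using stand-in classes. The names `BaseColumnSketch` and `TextColumnSketch`, their constructors, and the empty-dict return value are hypothetical and only illustrate the idea of delegating the type check to a base-class diff; they are not the library's actual classes or signatures.

```python
class BaseColumnSketch:
    """Hypothetical stand-in for a base column profiler."""

    def __init__(self, name):
        self.name = name

    def diff(self, other):
        # The base diff only verifies that both profiles are the same type.
        if not isinstance(other, type(self)):
            raise TypeError(
                f"cannot diff {type(self).__name__} with {type(other).__name__}"
            )
        return {}


class TextColumnSketch(BaseColumnSketch):
    """Hypothetical stand-in for a text column profiler."""

    def diff(self, other):
        # Reuse the base-class type check instead of running a full
        # numeric-stats diff just to confirm the data types match.
        differences = BaseColumnSketch.diff(self, other)
        # Text-specific differences would be added to the dict here.
        return differences


if __name__ == "__main__":
    print(TextColumnSketch("a").diff(TextColumnSketch("b")))
```

Calling the base diff explicitly keeps the type check cheap and avoids depending on numeric statistics that the check itself never needed.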

scottiegarcia commented 9 months ago

Closing, as the issue was caused by something on my end: the correct data type was not being propagated.