PAIR-code / what-if-tool

Source code/webpage/demos for the What-If Tool
https://pair-code.github.io/what-if-tool
Apache License 2.0

Data split for binning - Datapoint editor vs. Performance & Fairness #53

Open timoei opened 4 years ago

timoei commented 4 years ago

Hi,

We really like using the What-If Tool. Over the last few days, we noticed that the data is not split the same way in the Datapoint editor as in the Performance & Fairness tab. As an example, we binned the UCI census income dataset by age into 10 bins. The number of data points in each bin can differ between the Datapoint editor and the Performance & Fairness tab (see figure: whatIf_bins).

For us, it would be extremely helpful if the data in, e.g., the first bin of the Datapoint editor were exactly the same as in the first bin of the Performance & Fairness tab.

Best, Timo

jameswex commented 4 years ago

Thanks so much for this feedback. You're correct that the binning logic in the Performance & Fairness tab is a little different from that in the datapoints display. This is because the datapoints display uses the Facets Dive visualization, which has its own binning logic outside of the What-If Tool code repository.

But it should be possible for us to unify the binning logic between the two. This issue will track that work.
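To illustrate how two independent binning routines can put different numbers of points into "the first bin", here is a minimal Python sketch. It compares two common strategies, equal-width bins and equal-count (rank-based) bins. This is an assumption for illustration only: the thread does not say which strategy Facets Dive or the Performance & Fairness tab actually uses, the function names are hypothetical, and the ages are made-up values, not the UCI census data.

```python
def equal_width_counts(values, num_bins):
    """Count values per bin when the value range is cut into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # Clamp so the maximum value falls into the last bin.
            idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def equal_count_counts(values, num_bins):
    """Count values per bin when sorted values are cut into near-equal-size groups."""
    ordered = sorted(values)
    n = len(ordered)
    # Group i covers ranks [i*n // num_bins, (i+1)*n // num_bins).
    return [
        len(ordered[i * n // num_bins:(i + 1) * n // num_bins])
        for i in range(num_bins)
    ]


# Hypothetical ages, chosen only to make the difference visible.
ages = [17, 18, 19, 22, 25, 25, 30, 38, 47, 90]

print(equal_width_counts(ages, 3))  # [8, 1, 1]
print(equal_count_counts(ages, 3))  # [3, 3, 4]
```

With the same data and the same number of bins, the first bin holds 8 points under one strategy and 3 under the other, which is the kind of mismatch described above. Smaller differences can also come from details such as whether bin edges are inclusive on the left or the right.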