fairnessforensics / wiggum

simpson's paradox inspired fairness forensics
https://fairnessforensics.github.io/wiggum/
MIT License

Distance heatmap overview #112

Closed Shine226 closed 4 years ago

Shine226 commented 4 years ago

Done: overview of the distance heatmap for aggregate versus subgroup. Remaining: pairwise comparison, detail view implementation, and intersectional categorical attributes.

brownsarahm commented 4 years ago

Intersectional categorical and pairwise should be a separate PR; don't put those on this one. The intersectional part could be added without this being there.

What do you mean by detail view implementation? This PR shouldn't change how detail views work, so I don't understand. Can you please describe in detail why that needs to change?

Shine226 commented 4 years ago

I need to redesign the parameters passed to the detail views. In the old version, the scatterplot and slope graph get different data when a cell is clicked, but now we have only one type of matrix, the distance matrix, so I need to think about how to pass data from a cell in the distance matrix to the detail view.

Shine226 commented 4 years ago

For example:

Scatterplot:

    var vars = { x: d.colVar, y: d.rowVar, z: d.value,
                 categoryAttr: d.categoryAttr, category: d.category,
                 trend_type: d.trend_type };

Slopegraph:

    var vars = { x: d.colVar, left: d.start, right: d.end,
                 keyName: d.keyName, index: d.index,
                 protectedAttr: d.protectedAttr, weightingAttr: d.weightingAttr,
                 targetAttr: d.targetAttr, target_var_type: d.target_var_type,
                 slopeKey: d.slopeKey };

brownsarahm commented 4 years ago

Where are these functions? Are they Python or JS?

From a cell of the distance heatmap, you can reconstruct the information needed for either plot (feat1, feat2, groupby, subgroup). Then, with those and the labeled_df, you can get the rest of the info.
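A minimal sketch of that idea, assuming a cell carries `feat1`, `feat2`, `groupby`, and `subgroup`. The function name `rows_for_cell` and the sample columns are hypothetical, and a plain list of dicts stands in for the labeled_df so the example is self-contained:

```python
def rows_for_cell(labeled_rows, cell):
    """Return the (feat1, feat2) points for the subgroup a heatmap cell refers to.

    labeled_rows: list of dicts, a stand-in for the labeled_df (assumption).
    cell: dict with the keys feat1, feat2, groupby, subgroup.
    """
    # Keep only the rows that belong to the clicked subgroup.
    subset = [r for r in labeled_rows if r[cell["groupby"]] == cell["subgroup"]]
    # Project those rows onto the two features of the cell.
    return [(r[cell["feat1"]], r[cell["feat2"]]) for r in subset]


# Hypothetical data and cell, for illustration only.
labeled_rows = [
    {"x1": 1.0, "x2": 2.0, "dept": "A"},
    {"x1": 2.0, "x2": 1.5, "dept": "B"},
    {"x1": 3.0, "x2": 4.0, "dept": "A"},
]
cell = {"feat1": "x1", "feat2": "x2", "groupby": "dept", "subgroup": "A"}
points = rows_for_cell(labeled_rows, cell)
```

The same four values identify the subgroup regardless of which detail view is drawn, so the cell itself would not need to carry plot-specific data.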

Shine226 commented 4 years ago

They are in JS. For the scatterplot, we just need to prepare one dataset for the detail view. For rank trend, we need to prepare different sub-datasets for the detail view.

Shine226 commented 4 years ago

In models.py, Regression:

    result_dict = {'trend_type': 'pearson_corr',
                   'categoricalVars': categoricalVars,
                   'continousVars': regression_vars,
                   'corrAll': corrAll.to_json(),
                   'groupby_info': groupby_info,
                   'corrSubs': [corrSub.to_json() for corrSub in correlationMatrixSubgroups]}

Rank Trend:

    result_dict = {'trend_type': 'rank_trend',
                   'protectedVars': protectedVars,
                   'explanaryVars': groupbyAttrs.tolist(),
                   'targetAttr': targetAttr,
                   'target_var_type': target_var_type,
                   'weighting_var': weighting_var,
                   'ratioRateAll': ratioRateAll,
                   'rateAll': [eachRateAll.to_json() for eachRateAll in rateAll],
                   'ratioSubs': [ratioSub.to_json() for ratioSub in ratioRateSub],
                   'rateSubs': [eachRateSub.to_json() for eachRateSub in rateSub]}

We also need to prepare the data for the detail views.

Shine226 commented 4 years ago

Actually, some computations are duplicated. You (@brownsarahm) did them once in the wiggum Python library, and then I did similar computations in models.py for the visualization.
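One way to avoid the duplication might be to build the visualization payload from rows that were already computed once, instead of recomputing them in models.py. This is only a sketch: `payload_from_results`, the `distance` key, and the row layout are assumptions for illustration, not the wiggum library's actual API.

```python
import json


def payload_from_results(result_rows, trend_type):
    """Build a JSON payload for the heatmap from precomputed result rows.

    result_rows: list of dicts, one per (feat1, feat2, groupby, subgroup)
    combination, already computed elsewhere (hypothetical layout).
    """
    keys = ("feat1", "feat2", "groupby", "subgroup", "distance")
    # Select the rows for the requested trend type and keep only the
    # fields the front end needs to draw and identify each cell.
    cells = [
        {k: r[k] for k in keys}
        for r in result_rows
        if r["trend_type"] == trend_type
    ]
    return json.dumps({"trend_type": trend_type, "cells": cells})


# Hypothetical precomputed rows, for illustration only.
result_rows = [
    {"feat1": "x1", "feat2": "x2", "groupby": "dept", "subgroup": "A",
     "trend_type": "pearson_corr", "distance": 0.4},
    {"feat1": "x1", "feat2": "x2", "groupby": "dept", "subgroup": "B",
     "trend_type": "rank_trend", "distance": 1.0},
]
payload = json.loads(payload_from_results(result_rows, "pearson_corr"))
```

With this shape, models.py would only filter and serialize, and the trend computation would live in one place.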