Open calathea21 opened 2 years ago
@calathea21 great questions. Tagging @kbattocchi, who may be able to answer some of the more complex questions related to causal analysis.
For the first question, this says that changing race from Amer-Indian-Eskimo to Black would, on average, reduce the output of the classifier by 1.8. For a particular datapoint this might mean going from 1 to less than 0, which doesn't completely make sense since the output range is bounded, but we're extrapolating linearly from much smaller residual values to compute this quantity.
For the second question, because we're estimating a linear model, you can get those results via subtraction: changing the race from Black to White has the same net effect as changing the race from Amer-Indian-Eskimo to White, minus the effect of changing it from Amer-Indian-Eskimo to Black. If you want all of the results to use a different baseline, you can specify an explicit list of categories in the categories argument when adding causal analysis to the model. This should be a list containing the categories for each categorical column, where you can use 'auto' instead of an explicit set of categories for any column you don't need to explicitly order. If you do pass a list of categories for a column, the first category in the list will be used as the baseline against which all the others are compared.
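To make the subtraction and the baseline ordering concrete, here is a minimal sketch in plain Python. It does not call the dashboard itself. The -1.8 effect is the one discussed above; the White effect of -0.3 is a made-up number for illustration only, and the helper names `rebase_effects` and `build_categories` are hypothetical. The sketch also assumes that the categories list follows the order of the categorical columns and that an explicit entry lists every category of that column.

```python
from typing import Dict, List, Sequence, Union


def rebase_effects(effects_vs_baseline: Dict[str, float],
                   new_baseline: str) -> Dict[str, float]:
    """Re-express effects relative to a different baseline category.

    Valid only under the linear-model assumption described above:
    effect(A -> B) = effect(base -> B) - effect(base -> A).
    """
    shift = effects_vs_baseline[new_baseline]
    return {cat: eff - shift for cat, eff in effects_vs_baseline.items()}


def build_categories(
    categorical_columns: Sequence[str],
    explicit_orders: Dict[str, List[str]],
) -> List[Union[str, List[str]]]:
    """Build one entry per categorical column: an explicit list whose first
    element is the desired baseline, or 'auto' for columns left unordered."""
    return [explicit_orders.get(col, "auto") for col in categorical_columns]


# Effects relative to the current baseline (Amer-Indian-Eskimo).
# -1.8 comes from the discussion above; -0.3 is a hypothetical value.
effects = {"Amer-Indian-Eskimo": 0.0, "Black": -1.8, "White": -0.3}

# Effect of changing race from Black to each other category.
rebased = rebase_effects(effects, new_baseline="Black")
print(rebased["White"])  # (Amer-Indian-Eskimo -> White) minus (Amer-Indian-Eskimo -> Black)

# Make White the baseline for race, leave education to 'auto'.
categories = build_categories(
    ["race", "education"],
    {"race": ["White", "Black", "Asian-Pac-Islander",
              "Amer-Indian-Eskimo", "Other"]},
)
print(categories)
# The resulting list would then be passed as the categories argument when
# adding causal analysis (assumed usage; check the signature in your version).
```

Reordering the categories and re-running the analysis should give the same numbers as the subtraction, up to estimation noise, since both rely on the same linear model.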
For your final question, I believe at the moment these are generated using the test data only.
Thanks for your help @kbattocchi!
If you could help me with one more question, that'd be great: the difference between the "What If Counterfactuals" and the "Local Causal Effects" is not quite clear to me.
Say I use the "What If Counterfactuals" (the one that is part of the Counterfactuals dashboard, not the Causal dashboard) to take datapoint x and change her gender from 'female' to 'male', to observe the effect on her predicted probability of a positive decision.
Now, say that I select this same datapoint x in the "Individual Causal What-If" tab (the one that is part of the "Causal Analysis") and check the direct local causal effect of changing her gender from 'female' to 'male'.
Intuitively I would think that both operations should be the same, but is this true? Is there any difference in how to interpret the results?
I have several questions about the Causal Analysis tab within the RAI dashboard. I hope someone can help me with this :) I'm using this component to analyse fairness on the Census Dataset. As treatment variables I've added 'Race' and 'Education'.