Closed jaared closed 1 year ago
The compare_mitigation_methods() function uses the FAIR threshold, which can be defined by the user. The score evaluating the performance of the "mitigated" model is computed using that threshold.
Can you clarify the statement "... to select a new threshold for each model"? Do you mean computing the best threshold using the "discrimination_threshold" function for each model?
Yes. Each model is unique and will have its own optimal decision threshold, which is probably different for each model. I believe using the discrimination_threshold() function would be appropriate.
Currently the compare_mitigation_methods() function seems to rely on a pre-defined threshold.
https://github.com/EqualityAI/EqualityML/blob/90e04435007a653b5c4c69dc5a9b86e0c5d34ce7/equalityml/fair.py#L714-L728
It is statistically more correct to select a new threshold for each model. This will require accepting the decision_maker from the user when calling compare_mitigation_methods(). I don't know if this is urgent, as the current approach seems to give a good approximate result.
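To illustrate the idea of tuning a threshold per model rather than sharing one, here is a minimal sketch using plain scikit-learn. It does not use EqualityML's actual API; the `best_threshold` helper and the F1 criterion are assumptions standing in for whatever discrimination_threshold() computes internally.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def best_threshold(model, X_val, y_val, grid=np.linspace(0.05, 0.95, 19)):
    """Hypothetical helper: pick the probability cutoff that
    maximizes F1 on held-out validation data."""
    probs = model.predict_proba(X_val)[:, 1]
    scores = [f1_score(y_val, (probs >= t).astype(int)) for t in grid]
    return grid[int(np.argmax(scores))]

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Each candidate model gets its own threshold instead of one
# pre-defined threshold shared across all of them.
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(random_state=0)):
    model.fit(X_tr, y_tr)
    t = best_threshold(model, X_val, y_val)
    print(type(model).__name__, round(t, 2))
```

The optimal cutoff typically differs between the two models, which is the point of the comment above: scoring every mitigated model against a single fixed threshold only approximates their true best performance.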
Also, recall that the threshold function uses a random seed.