EqualityAI / EqualityML

Evidence-based tools and community collaboration to end algorithmic bias, one data scientist at a time.
Apache License 2.0

Independent thresholds for each method when using compare_mitigation_methods #15

Closed jaared closed 1 year ago

jaared commented 1 year ago

Currently, the compare_mitigation_methods() function seems to rely on a single pre-defined threshold.

https://github.com/EqualityAI/EqualityML/blob/90e04435007a653b5c4c69dc5a9b86e0c5d34ce7/equalityml/fair.py#L714-L728

It is statistically more correct to select a new threshold for each model. This will require accepting a decision_maker from the user when calling compare_mitigation_methods(). I don't know if this is urgent, as the current approach seems to give a good approximate result.

Also, recall that the threshold function uses a random seed.

JoaoGranja commented 1 year ago

The compare_mitigation_methods() function uses the FAIR threshold, which can be defined by the user. The score used to evaluate the performance of the "mitigated" model is computed with that threshold.

Can you clarify the statement "... to select a new threshold for each model"? Do you mean computing the best threshold using the "discrimination_threshold" function for each model?

jaared commented 1 year ago

Yes. Each model is unique and will have its own optimal decision threshold, which is probably different for each model. I believe using the discrimination_threshold() function would be appropriate.
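As an illustration of the idea being discussed, here is a minimal sketch of per-model threshold selection: sweep candidate cutoffs over held-out predicted probabilities and keep the one that maximizes a chosen metric (F1 here). This is an assumption-laden toy, not EqualityML's discrimination_threshold() implementation; the function name best_threshold and the F1 criterion are hypothetical choices for the sketch.

```python
import numpy as np

def best_threshold(y_true, y_prob, grid=None):
    """Toy per-model threshold search: pick the cutoff that maximizes F1.

    This sketches the concept only; EqualityML's discrimination_threshold()
    may use a different metric, grid, and resampling scheme.
    """
    if grid is None:
        # Candidate cutoffs from 0.05 to 0.95 in steps of 0.05.
        grid = np.linspace(0.05, 0.95, 19)
    best_t, best_score = 0.5, -np.inf
    for t in grid:
        y_pred = (y_prob >= t).astype(int)
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        if f1 > best_score:  # strict >: ties keep the lowest cutoff
            best_score, best_t = f1, t
    return best_t, best_score

# Each candidate mitigation model would get its own call on held-out data,
# rather than all models sharing one pre-defined FAIR threshold.
y_true = np.array([0, 0, 0, 1, 1, 1])
y_prob = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
t, score = best_threshold(y_true, y_prob)
```

If the real search involves any resampling (as the earlier comment notes, the threshold function uses a random seed), fixing that seed before each call keeps model-to-model comparisons reproducible.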