pplonski closed this issue 1 year ago.
Hi @pplonski, I would be interested in contributing to this. Is there any way I can help?
Hi @codeboy5!
Thank you for your offer! I started to study FairAutoML more, and I don't like the approach from the paper. It looks good in theory, but for real-life problems it might be unusable. Just imagine mitigating unfairness when doing 10-fold Cross-Validation: I think that applying Exponentiated Gradient to each model in each fold might be very inefficient. I also found that Exponentiated Gradient has trouble optimizing for more than 1 sensitive feature; for example, if you have 2 sensitive features (A and B), then a mitigated model might be fair for feature A but unfair for feature B...
I would like to have a method that searches for sample weights that provide fairness. Then I would like to reuse the same sample weights when doing the hyperparameter search.
So I'm in the process of searching for a method for fair-optimal sample weighting...
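For what it's worth, one well-known way to compute such fairness-oriented sample weights is the reweighing scheme of Kamiran & Calders, which weights each (group, label) cell by its expected over observed frequency. This is only a sketch of that general idea for a single sensitive feature, not necessarily the method that will land in mljar-supervised:

```python
import numpy as np
import pandas as pd

def reweighing_sample_weight(y, sensitive):
    """Kamiran & Calders style reweighing: weight each (group, label) cell by
    expected / observed frequency so that the label becomes (approximately)
    independent of the sensitive feature in the weighted sample."""
    df = pd.DataFrame({"y": np.asarray(y), "s": np.asarray(sensitive)})
    n = len(df)
    weights = np.ones(n)
    for (s_val, y_val), group in df.groupby(["s", "y"]):
        p_s = (df["s"] == s_val).mean()   # P(S = s)
        p_y = (df["y"] == y_val).mean()   # P(Y = y)
        p_sy = len(group) / n             # P(S = s, Y = y)
        weights[group.index] = (p_s * p_y) / p_sy
    return weights
```

The appeal of this family of methods is exactly what is described above: the same weights could be reused unchanged across folds and across the hyperparameter search, e.g. `model.fit(X, y, sample_weight=weights)`.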
@codeboy5 do you have experience in fair ML or in optimization theory?
I created a fairness module. It can compute fairness metrics and plots for binary classification tasks. It computes statistics for every sensitive feature separately.
Link to the module: https://github.com/mljar/mljar-supervised/tree/fairness/supervised/fairness
Example script that computes fairness metrics: https://github.com/mljar/mljar-supervised/blob/fairness/examples/scripts/binary_classifier_adult_fairness.py
The API:
```python
automl = AutoML(algorithms=["Xgboost"])
automl.fit(X_train, y_train, sensitive_features=sensitive_features_train)
```
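To make "statistics for every sensitive feature separately" concrete, here is a minimal hand-rolled sketch of per-group selection rates; the module has its own metrics and plots, so this is an illustration, not its internal API:

```python
import pandas as pd

def selection_rates(y_pred, sensitive) -> pd.Series:
    """Fraction of positive predictions within each value of one sensitive feature."""
    preds = pd.Series(y_pred).reset_index(drop=True)
    groups = pd.Series(sensitive).reset_index(drop=True)
    return preds.groupby(groups).mean()

# e.g. selection_rates(automl.predict(X_test), S_test["sex"])
```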
TODO:

Improvements:
- `fairness_metric`, `fairness_threshold`, and `protected_groups` in the API
- `fairness_metric` and `fairness_threshold`
- `fairness_threshold`

Questions:
- `protected_groups` and `unprotected_groups`?

Demo:
I think there should be at least 20 samples of the same group for it to be considered in fairness mitigation. For example, if we have a group defined by "Female", "Young<30", "Black", and there are only 5 samples for this group (0 samples with class 1), this group shouldn't be considered when computing fairness metrics, nor in fairness mitigation.
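A possible form of that rule, sketched with an assumed threshold of 20 samples per intersectional group (the final threshold and where it hooks into the pipeline are still open):

```python
import pandas as pd

MIN_GROUP_SIZE = 20  # assumed minimum number of samples per group

def eligible_groups(sensitive: pd.DataFrame, min_size: int = MIN_GROUP_SIZE):
    """Return the intersectional groups (e.g. ("Female", "Young<30", "Black"))
    with at least `min_size` samples; smaller groups would be skipped both for
    fairness metrics and for fairness mitigation."""
    counts = sensitive.groupby(list(sensitive.columns)).size()
    return counts[counts >= min_size].index.tolist()
```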
I've pushed the work-in-progress version of fairness mitigation. There are a lot of prints in the terminal - it is a working version.
The algorithm optimizes the demographic parity ratio (it is hard-coded). The output of mitigation for a single feature (`sex`):

The output of mitigation for two features (`sex`, `is_young`), where `is_young` is a categorical feature created from `age < 50`:
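For reference, the demographic parity ratio being optimized compares the smallest and the largest per-group selection rate; a minimal hand-rolled version (the actual implementation may compute it differently, e.g. via fairlearn):

```python
import numpy as np
import pandas as pd

def demographic_parity_ratio(y_pred, sensitive) -> float:
    """min_g P(y_hat = 1 | group = g) / max_g P(y_hat = 1 | group = g).
    A value of 1.0 means every group receives positive predictions at the same rate."""
    rates = pd.Series(np.asarray(y_pred)).groupby(np.asarray(sensitive)).mean()
    return float(rates.min() / rates.max())
```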
TODO:
- `privileged_groups` and `unprivileged_groups` in the API
- `privileged` and `underprivileged` groups provided by the user in the optimization step
- `underprivileged_groups` not so random
- optimization step for `demographic_parity_ratio`
If a feature is not categorical, it is automatically converted into a binary feature based on an equal number of samples in each bin. We print the information in the terminal, for example:

```
Sensitive features should be categorical
Apply automatic binarization for feature age
New values ['(37.0, 90.0]', '(16.999, 37.0]'] for feature age are applied
```
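The equal-count binning shown in that log can be reproduced with pandas quantile binning; a minimal sketch, assuming two bins (the module's actual implementation may differ in details):

```python
import pandas as pd

def binarize_sensitive_feature(values: pd.Series, bins: int = 2) -> pd.Series:
    """Convert a continuous sensitive feature into `bins` categories with roughly
    equal sample counts, e.g. age -> "(16.999, 37.0]" / "(37.0, 90.0]"."""
    return pd.qcut(values, q=bins, duplicates="drop").astype(str)
```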
The stop condition for the weights optimization is not yet implemented. This gives interesting behavior of the algorithm: I was running it with `privileged_groups` and `unprivileged_groups` provided in the API, and the DP ratio goes above 1.0.
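A stop condition could be as simple as checking the monitored ratio against the requested threshold after each weight update; this is only a sketch, and the names are assumptions rather than the final API:

```python
def should_stop(dp_ratio: float, fairness_threshold: float = 0.8) -> bool:
    """Stop updating sample weights once the demographic parity ratio reaches
    the requested threshold, so it cannot keep climbing past 1.0."""
    return dp_ratio >= min(fairness_threshold, 1.0)
```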
Please notice that the script below uses two sensitive features, `sex` and `age`. The privileged group is defined only for the `sex` feature, and for this feature the ratio goes above 1.0 (because there is no stop condition). The `age` feature is passed as a continuous feature that is automatically converted into a binary one.
```python
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

from supervised.automl import AutoML

# Adult dataset from OpenML
data = fetch_openml(data_id=1590, as_frame=True)
X = data.data
y = (data.target == ">50K") * 1

sensitive_features = X[["sex", "age"]]

X_train, X_test, y_train, y_test, S_train, S_test = train_test_split(
    X, y, sensitive_features, stratify=y, test_size=0.5, random_state=42
)

automl = AutoML(
    algorithms=["Xgboost"],
    train_ensemble=False,
    fairness_metric="demographic_parity_ratio",
    fairness_threshold=0.8,
    privileged_groups=[{"sex": "Male"}],
    unprivileged_groups=[{"sex": "Female"}],
)
automl.fit(X_train, y_train, sensitive_features=S_train)
```
Output:
I've run AutoML with several algorithms on the Adult dataset with two sensitive features. Below is the output from the example script:
Issues:
Can we disable the Fairness Metric?
Hi @mosaikme,
fairness is only used when you pass `sensitive_features` in `fit()`; otherwise it is skipped.
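In other words, a minimal usage sketch:

```python
# with sensitive features -> fairness metrics and mitigation are applied
automl.fit(X_train, y_train, sensitive_features=S_train)

# without sensitive features -> fairness is skipped entirely
automl.fit(X_train, y_train)
```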
Implement AutoML fairness based on https://arxiv.org/abs/2111.06495 by @qingyun-wu and @sonichi
Requirements:
Example code