fani-lab / Adila

Fairness-Aware Team Formation

Discussion about Adding a Baseline to Adila #85

Open Hamedloghmani opened 11 months ago

Hamedloghmani commented 11 months ago

Hi, @hosseinfani @mahdis-saeedi Although this issue is not finalized and is still under construction, the following are the options that I have come up with so far:

Questions: a) Does our new baseline (e.g., epsilon-greedy) have to beat FA*IR?
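For reference, a minimal sketch of a textbook epsilon-greedy re-ranker (a hypothetical interpretation; the function name and parameters are assumptions, and whether this is even the intended baseline is exactly the open question above):

import random

def epsilon_greedy_rerank(member_prob, att_list, epsilon=0.1, seed=0):
    # With probability epsilon pick a random remaining member (explore),
    # otherwise pick the best remaining member by score (exploit).
    rng = random.Random(seed)
    remaining = sorted(zip(member_prob, att_list), key=lambda x: x[0], reverse=True)
    reranked = []
    while remaining:
        idx = rng.randrange(len(remaining)) if rng.random() < epsilon else 0
        reranked.append(remaining.pop(idx))
    return reranked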

Hamedloghmani commented 10 months ago

Hi @hosseinfani, @mahdis-saeedi I implemented my first draft of the fair-greedy algorithm in section 2. I would really appreciate it if you could kindly take a look and confirm that the code performs the operation described in the algorithm. Thank you.


def fair_greedy(member_prob, att_list, prob_dist):
    # Greedily re-rank members so that, at every prefix, the distribution of the
    # protected attribute is pushed toward the target distribution prob_dist.
    # Assumes member_prob is already sorted in descending order (the original ranking).
    L = list(zip(member_prob, att_list))
    print(L)  # debug trace
    R = [L[0]]  # Initialize R with the top-ranked member, i.e., R = [L1]
    print(len(L))  # debug trace
    for i in range(1, len(L)):
        flag = False
        p = calculate_att_dist(R)  # attribute distribution of the current prefix R
        # Signed difference between the current and target distributions
        # (to address comments after the meeting with Mahdis)
        p_diff = {False: p[False] - prob_dist[False], True: p[True] - prob_dist[True]}

        while not flag:
            z_min = min(p_diff, key=p_diff.get)  # most underrepresented attribute value
            # Find the first remaining item with the underrepresented attribute
            for j in range(i, len(L)):
                if L[j][1] == z_min:
                    temp = L.pop(j)
                    # Shift the items in L down to place the selected item at position i
                    L = L[:i] + [temp] + L[i:]
                    print(len(L))  # debug trace
                    R.append(temp)
                    flag = True
                    break
                # No samples with the chosen protected attribute are left: fall back
                # to the next best remaining member instead of looping forever.
                if j == len(L) - 1:
                    R.append(L[i])
                    flag = True
                    break
    return R

def calculate_att_dist(members):
    # Share of members whose protected attribute is False (and, implicitly, True)
    false_ratio = [att for _, att in members].count(False) / len(members)
    return {True: 1 - false_ratio, False: false_ratio}

if __name__ == "__main__":
    member_prob = [0.9, 0.8, 0.7, 0.6, 0.5]
    att_list = [False, False, False, True, True]
    x = fair_greedy(member_prob, att_list, {False: 0.6, True: 0.4})
    print(x)
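For the example in the main block, the draft as written returns R = [(0.9, False), (0.6, True), (0.8, False), (0.5, True), (0.7, False)]: the prefix distributions are steered toward the 0.6/0.4 target while the score order within each attribute group is preserved.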
Hamedloghmani commented 10 months ago

Issue: In the paper, I was not able to find the exact specification of how the difference between the probability distributions should be calculated. I considered the absolute difference for now. The algorithm is as follows (see the attached fairgreedy image).

mahdis-saeedi commented 10 months ago

@Hamedloghmani It seems that the absolute value of the differences does not work here because d_min = 0.

Hamedloghmani commented 10 months ago

@mahdis-saeedi I edited the code and removed the abs() function.
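To illustrate the general point (a minimal sketch, not part of the codebase): with the signed difference, the minimum directly identifies the underrepresented attribute value, whereas the absolute difference cannot tell over- from under-representation.

# Suppose the current prefix is all False and the target is {False: 0.6, True: 0.4}.
p = {False: 1.0, True: 0.0}
target = {False: 0.6, True: 0.4}

signed = {k: p[k] - target[k] for k in p}         # {False: +0.4, True: -0.4}
absolute = {k: abs(p[k] - target[k]) for k in p}  # {False: 0.4, True: 0.4}

# With the signed difference, the minimum is the underrepresented value (True);
# with the absolute difference both values look equally "off", so the
# underrepresented one cannot be identified.
print(min(signed, key=signed.get))   # True
print(signed, absolute)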

Hamedloghmani commented 9 months ago

Hi @hosseinfani , @mahdis-saeedi In the rest of this issue page, I will log my progress on running and integrating the code provided by the authors of the Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search paper, step by step. I started by running the fair_greedy algorithm with 3 different settings, {'Woman': 0.4, 'Man': 0.6}, {'Woman': 0.2, 'Man': 0.8} and {'Woman': 0.1, 'Man': 0.9}, on their synthetic dataset with 100 individuals and 0.4 female representation. A few notes so far (a sketch of such a run follows these notes):

I have attached a step-by-step traceback of these settings to this issue. output_synthetic_woman10_men90.txt output_synthetic_woman20_men80.txt output_synthetic_woman40_men60.txt

This work is ongoing; this was only the first step.
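For reference, a minimal sketch of how such a run could look with the draft fair_greedy above (the synthetic scores here are made up for illustration, and since the draft expects a boolean attribute, 'Woman' is encoded as True and 'Man' as False; this is not the authors' dataset or code):

import random

random.seed(42)
att_list = [True] * 40 + [False] * 60      # True = Woman, False = Man; 0.4 female representation
random.shuffle(att_list)
member_prob = sorted((random.random() for _ in range(100)), reverse=True)

# The three target settings from the note above, in the draft's boolean encoding
settings = [{True: 0.4, False: 0.6},   # {'Woman': 0.4, 'Man': 0.6}
            {True: 0.2, False: 0.8},   # {'Woman': 0.2, 'Man': 0.8}
            {True: 0.1, False: 0.9}]   # {'Woman': 0.1, 'Man': 0.9}

for prob_dist in settings:
    reranked = fair_greedy(member_prob, att_list, prob_dist)
    top10_women = [att for _, att in reranked[:10]].count(True)
    print(prob_dist, '-> women in the top 10:', top10_women)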

Hamedloghmani commented 9 months ago

Hi, I also tried to make a successful sample run of fair_greedy on one fold of our imdb dataset, by transforming the input into the form accepted by this function.

I have attached a step-by-step traceback of my sample run to this issue. sample_imdb_dp.txt
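For clarity, here is a hypothetical sketch of the kind of transformation meant above; the helper name, the per-team dict format, and the attribute lookup are assumptions for illustration, not Adila's actual fold layout:

def to_fair_greedy_input(expert_scores, protected_lookup):
    # expert_scores: dict {expert_id: predicted probability} for one test team
    # protected_lookup: dict {expert_id: bool} marking the protected attribute
    ranked = sorted(expert_scores.items(), key=lambda kv: kv[1], reverse=True)
    member_prob = [score for _, score in ranked]
    att_list = [protected_lookup[expert_id] for expert_id, _ in ranked]
    return member_prob, att_list

# Example with made-up values:
scores = {'e1': 0.92, 'e2': 0.85, 'e3': 0.41, 'e4': 0.33}
protected = {'e1': False, 'e2': False, 'e3': True, 'e4': True}
member_prob, att_list = to_fair_greedy_input(scores, protected)
# member_prob and att_list can now be passed to fair_greedy together with a
# target distribution such as {False: 0.6, True: 0.4}.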

Hamedloghmani commented 2 months ago

Hi @hosseinfani , @mahdis-saeedi , Since it has been a while since our last feature update, I'll go over a short summary. In the last update, I added the baseline from Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search, namely fair_greedy, to our codebase. Please note that the code we obtained from the developer has not been pushed to the repo yet due to privacy reasons. The issue was the runtime, which made experiments on uspt and dblp take months (if not interrupted). Hence, I recently started working on optimizing it. Here is the summary so far:

I would love to hear your thoughts. Thank you.
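For context only, below is an illustrative sketch (not the optimization actually in progress) of one way the draft fair_greedy above could avoid the repeated scans and re-insertions of L, assuming the input ranking is sorted by score in descending order:

from collections import deque

def fair_greedy_fast(member_prob, att_list, prob_dist):
    # Keep one queue per attribute value; each queue preserves the score order,
    # so each position is filled in constant time instead of rescanning L.
    ranked = list(zip(member_prob, att_list))  # assumed sorted by score, descending
    queues = {True: deque(), False: deque()}
    for item in ranked:
        queues[item[1]].append(item)
    R, counts = [], {True: 0, False: 0}
    for i in range(len(ranked)):
        if i == 0:
            pick = ranked[0]
            queues[pick[1]].popleft()
        else:
            # Signed difference between the prefix distribution and the target
            p_diff = {a: counts[a] / i - prob_dist[a] for a in (True, False)}
            z_min = min(p_diff, key=p_diff.get)
            # Fall back to the other attribute if the underrepresented one is exhausted
            pick = queues[z_min].popleft() if queues[z_min] else queues[not z_min].popleft()
        counts[pick[1]] += 1
        R.append(pick)
    return R

Whether this matches the behavior of the authors' original code still needs to be verified against the attached tracebacks.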