fani-lab / Adila

Fairness-Aware Team Formation

Discussion about Adding a Baseline to Adila #85

Open Hamedloghmani opened 11 months ago

Hamedloghmani commented 11 months ago

Hi, @hosseinfani @mahdis-saeedi Although this issue is not finalized and is still under construction, the following are the options that I have come up with so far:

Questions: a) Does our new baseline (e.g., epsilon-greedy) have to beat FA*IR?
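For reference, a minimal sketch of a textbook epsilon-greedy re-ranker (a hypothetical interpretation; the function name and parameters are assumptions, and whether this is even the intended baseline is exactly the open question above):

import random

def epsilon_greedy_rerank(member_prob, att_list, epsilon=0.1, seed=0):
    # With probability epsilon pick a random remaining member (explore),
    # otherwise pick the best remaining member by score (exploit).
    rng = random.Random(seed)
    remaining = sorted(zip(member_prob, att_list), key=lambda x: x[0], reverse=True)
    reranked = []
    while remaining:
        idx = rng.randrange(len(remaining)) if rng.random() < epsilon else 0
        reranked.append(remaining.pop(idx))
    return reranked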

Hamedloghmani commented 10 months ago

Hi @hosseinfani, @mahdis-saeedi I implemented my first draft of the fair-greedy algorithm in section 2. I would really appreciate it if you could kindly take a look and confirm that the code performs the operation described in the algorithm. Thank you.


def fair_greedy(member_prob, att_list, prob_dist):
    # Greedily re-rank members so that, at every prefix, the distribution of the
    # protected attribute is pushed toward the target distribution prob_dist.
    # Assumes member_prob is already sorted in descending order (the original ranking).
    L = list(zip(member_prob, att_list))
    print(L)  # debug trace
    R = [L[0]]  # Initialize R with the top-ranked member, i.e., R = [L1]
    print(len(L))  # debug trace
    for i in range(1, len(L)):
        flag = False
        p = calculate_att_dist(R)  # attribute distribution of the current prefix R
        # Signed difference between the current and target distributions
        # (to address comments after the meeting with Mahdis)
        p_diff = {False: p[False] - prob_dist[False], True: p[True] - prob_dist[True]}

        while not flag:
            z_min = min(p_diff, key=p_diff.get)  # most underrepresented attribute value
            # Find the first remaining item with the underrepresented attribute
            for j in range(i, len(L)):
                if L[j][1] == z_min:
                    temp = L.pop(j)
                    # Shift the items in L down to place the selected item at position i
                    L = L[:i] + [temp] + L[i:]
                    print(len(L))  # debug trace
                    R.append(temp)
                    flag = True
                    break
                # No samples with the chosen protected attribute are left: fall back
                # to the next best remaining member instead of looping forever.
                if j == len(L) - 1:
                    R.append(L[i])
                    flag = True
                    break
    return R

def calculate_att_dist(members):
    # Share of members whose protected attribute is False (and, implicitly, True)
    false_ratio = [att for _, att in members].count(False) / len(members)
    return {True: 1 - false_ratio, False: false_ratio}

if __name__ == "__main__":
    member_prob = [0.9, 0.8, 0.7, 0.6, 0.5]
    att_list = [False, False, False, True, True]
    x = fair_greedy(member_prob, att_list, {False: 0.6, True: 0.4})
    print(x)
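For the example in the main block, the draft as written returns R = [(0.9, False), (0.6, True), (0.8, False), (0.5, True), (0.7, False)]: the prefix distributions are steered toward the 0.6/0.4 target while the score order within each attribute group is preserved.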
Hamedloghmani commented 10 months ago

Issue: In the paper, I was not able to find the exact specification of how the difference between the probability distributions should be calculated. I considered the absolute difference for now. The algorithm is as follows (see the attached fairgreedy image).

mahdis-saeedi commented 10 months ago

@Hamedloghmani It seems that the absolute value of the differences does not work here because d_min = 0.

Hamedloghmani commented 10 months ago

@mahdis-saeedi I edited the code and removed the abs() function.
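To illustrate the general point (a minimal sketch, not part of the codebase): with the signed difference, the minimum directly identifies the underrepresented attribute value, whereas the absolute difference cannot tell over- from under-representation.

# Suppose the current prefix is all False and the target is {False: 0.6, True: 0.4}.
p = {False: 1.0, True: 0.0}
target = {False: 0.6, True: 0.4}

signed = {k: p[k] - target[k] for k in p}         # {False: +0.4, True: -0.4}
absolute = {k: abs(p[k] - target[k]) for k in p}  # {False: 0.4, True: 0.4}

# With the signed difference, the minimum is the underrepresented value (True);
# with the absolute difference both values look equally "off", so the
# underrepresented one cannot be identified.
print(min(signed, key=signed.get))   # True
print(signed, absolute)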

Hamedloghmani commented 9 months ago

Hi @hosseinfani , @mahdis-saeedi In the rest of this issue page, I will log my progress on running and integrating the code provided by the authors of the Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search paper, step by step. I started by running the fair_greedy algorithm with 3 different settings, {'Woman': 0.4, 'Man': 0.6}, {'Woman': 0.2, 'Man': 0.8} and {'Woman': 0.1, 'Man': 0.9}, on their synthetic dataset with 100 individuals and 0.4 female representation. A few notes so far (a sketch of such a run follows these notes):

I have attached a step-by-step traceback of these settings to this issue. output_synthetic_woman10_men90.txt output_synthetic_woman20_men80.txt output_synthetic_woman40_men60.txt

This work is ongoing; this was only the first step.
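For reference, a minimal sketch of how such a run could look with the draft fair_greedy above (the synthetic scores here are made up for illustration, and since the draft expects a boolean attribute, 'Woman' is encoded as True and 'Man' as False; this is not the authors' dataset or code):

import random

random.seed(42)
att_list = [True] * 40 + [False] * 60      # True = Woman, False = Man; 0.4 female representation
random.shuffle(att_list)
member_prob = sorted((random.random() for _ in range(100)), reverse=True)

# The three target settings from the note above, in the draft's boolean encoding
settings = [{True: 0.4, False: 0.6},   # {'Woman': 0.4, 'Man': 0.6}
            {True: 0.2, False: 0.8},   # {'Woman': 0.2, 'Man': 0.8}
            {True: 0.1, False: 0.9}]   # {'Woman': 0.1, 'Man': 0.9}

for prob_dist in settings:
    reranked = fair_greedy(member_prob, att_list, prob_dist)
    top10_women = [att for _, att in reranked[:10]].count(True)
    print(prob_dist, '-> women in the top 10:', top10_women)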

Hamedloghmani commented 9 months ago

Hi, I also tried to make a successful sample run of fair_greedy on one fold of our imdb dataset, by transforming the input into the form accepted by this function.

I have attached a step-by-step traceback of my sample run to this issue. sample_imdb_dp.txt
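For clarity, here is a hypothetical sketch of the kind of transformation meant above; the helper name, the per-team dict format, and the attribute lookup are assumptions for illustration, not Adila's actual fold layout:

def to_fair_greedy_input(expert_scores, protected_lookup):
    # expert_scores: dict {expert_id: predicted probability} for one test team
    # protected_lookup: dict {expert_id: bool} marking the protected attribute
    ranked = sorted(expert_scores.items(), key=lambda kv: kv[1], reverse=True)
    member_prob = [score for _, score in ranked]
    att_list = [protected_lookup[expert_id] for expert_id, _ in ranked]
    return member_prob, att_list

# Example with made-up values:
scores = {'e1': 0.92, 'e2': 0.85, 'e3': 0.41, 'e4': 0.33}
protected = {'e1': False, 'e2': False, 'e3': True, 'e4': True}
member_prob, att_list = to_fair_greedy_input(scores, protected)
# member_prob and att_list can now be passed to fair_greedy together with a
# target distribution such as {False: 0.6, True: 0.4}.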

Hamedloghmani commented 2 months ago

Hi @hosseinfani , @mahdis-saeedi , Since it has been a while since our last feature update, I'll go over a short summary. In the last update, I added the baseline from Has CEO Gender Bias Really Been Fixed? Adversarial Attacking and Improving Gender Fairness in Image Search, namely fair_greedy, to our codebase. Please note that the code we obtained from the developer has not been pushed to the repo yet due to privacy reasons. The issue was the runtime, which made experiments on uspt and dblp take months (if not interrupted). Hence, I recently started working on optimizing it. Here is the summary so far:

I would love to hear your thoughts. Thank you.
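For context only, below is an illustrative sketch (not the optimization actually in progress) of one way the draft fair_greedy above could avoid the repeated scans and re-insertions of L, assuming the input ranking is sorted by score in descending order:

from collections import deque

def fair_greedy_fast(member_prob, att_list, prob_dist):
    # Keep one queue per attribute value; each queue preserves the score order,
    # so each position is filled in constant time instead of rescanning L.
    ranked = list(zip(member_prob, att_list))  # assumed sorted by score, descending
    queues = {True: deque(), False: deque()}
    for item in ranked:
        queues[item[1]].append(item)
    R, counts = [], {True: 0, False: 0}
    for i in range(len(ranked)):
        if i == 0:
            pick = ranked[0]
            queues[pick[1]].popleft()
        else:
            # Signed difference between the prefix distribution and the target
            p_diff = {a: counts[a] / i - prob_dist[a] for a in (True, False)}
            z_min = min(p_diff, key=p_diff.get)
            # Fall back to the other attribute if the underrepresented one is exhausted
            pick = queues[z_min].popleft() if queues[z_min] else queues[not z_min].popleft()
        counts[pick[1]] += 1
        R.append(pick)
    return R

Whether this matches the behavior of the authors' original code still needs to be verified against the attached tracebacks.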