tensorflow / model-remediation

Model Remediation is a library that provides solutions for machine learning practitioners working to create and train models in a way that reduces or eliminates user harm resulting from underlying performance biases.
https://www.tensorflow.org/responsible_ai/model_remediation?hl=en
Apache License 2.0
42 stars, 19 forks

[GSOC]: Active Sampling for Min-Max Fairness #29

Open pranjal-awasthi opened 2 years ago

pranjal-awasthi commented 2 years ago

Project Description: Min-max fairness is a natural and desirable notion of subgroup fairness. The goal of this project is to develop open source implementations of recent research into label-efficient active sampling algorithms for achieving min-max fairness. Furthermore, the project will involve benchmarking the algorithms against baselines, demonstrating fairness benefits on several publicly available datasets, and exploring techniques based on accelerated gradient descent to make the algorithms more efficient.

Toward this goal, we encourage participants to first complete the following starter tasks.

Note: The following tasks are open for multiple submissions. Anyone interested in the project is encouraged to solve the tasks below and submit their work.

Starter Task 1: Train a one-layer fully connected network in TensorFlow to solve the binary classification problem of income prediction for the UCI Adult Dataset. Choose a reasonable batch size (say 32 or 64) and a reasonable number of epochs (say 50). Plot the test accuracy as a function of the number of processed batches. Also plot, as a function of the number of processed batches, the gap in model accuracy between points with sex=Male and points with sex=Female.
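The gap metric in Task 1 can be sketched without any framework. This is only an illustration, assuming labels, predictions, and a 0/1 group indicator are available as plain Python lists; `group_accuracy_gap` and the toy data are hypothetical and not part of the model-remediation library.

```python
def group_accuracy_gap(y_true, y_pred, group):
    """Return (accuracy for group==1, accuracy for group==0, absolute gap)."""
    hits = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for label, pred, g in zip(y_true, y_pred, group):
        counts[g] += 1
        hits[g] += int(label == pred)
    if counts[0] == 0 or counts[1] == 0:
        raise ValueError("both groups need at least one example")
    acc_1 = hits[1] / counts[1]
    acc_0 = hits[0] / counts[0]
    return acc_1, acc_0, abs(acc_1 - acc_0)


# Toy data (made up for illustration): 1 marks sex=Male, 0 marks sex=Female.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 1]
is_male = [1, 1, 1, 0, 0, 0]

acc_male, acc_female, gap = group_accuracy_gap(y_true, y_pred, is_male)
print(acc_male, acc_female, gap)
```

Calling this on the test set after every batch (or every few batches) and storing the results gives the two curves the task asks for.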

Starter Task 2: Partition the training set into two subsets S1 and S2 based on sex. Retrain the network with the same settings as in Task 1. However, this time alternate batch selection between S1 and S2. In other words, if the first batch of 32 or 64 examples is chosen randomly from S1, then the next batch is chosen from S2, and so on. Report the same metrics as in Task 1. Summarize your results and suggest one or two ways to reduce the gap in model accuracy between points with sex=Male and points with sex=Female.
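One possible reading of the alternating scheme in Task 2 is sketched below, using only the standard library. The function name `alternating_batches`, the seed, and the toy subsets are assumptions made for illustration; a real submission would feed each batch to the training step instead of collecting them in a list.

```python
import random


def alternating_batches(s1, s2, batch_size, num_batches, seed=0):
    """Yield random batches drawn alternately from s1, then s2, then s1, ..."""
    rng = random.Random(seed)
    pools = (s1, s2)
    for step in range(num_batches):
        pool = pools[step % 2]  # even steps use s1, odd steps use s2
        # Sample without replacement within a batch; cap at the pool size.
        k = min(batch_size, len(pool))
        yield rng.sample(pool, k)


# Toy subsets (made up): each example is tagged with the subset it came from.
s1 = [("S1", i) for i in range(10)]
s2 = [("S2", i) for i in range(6)]

batches = list(alternating_batches(s1, s2, batch_size=4, num_batches=4))
for batch in batches:
    print(batch)
```

Note that if S1 and S2 differ in size, strict alternation oversamples the smaller subset relative to uniform sampling, which is worth keeping in mind when comparing against the Task 1 results.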

aryan1107 commented 2 years ago

I have completed Starter Task 1. Professor, could you please check my Gitter message? I have also sent you my notebook for feedback. Or would you prefer that I post my notebook here?

Update: here is my notebook link: https://colab.research.google.com/drive/1imPLMcvalmGErsZO8-BhSUDI03hWNv6u?usp=sharing

Update: I was asked to prepare a preliminary project proposal. Thanks Pranjal! 🙏

kishoreKunisetty commented 2 years ago

Hello, @pranjal-awasthi. I have completed the given starter tasks.

Please have a look at my Colab submission here: https://colab.research.google.com/drive/1W7wwbaVDNc9yIGhYxKFvMrAXHeKa98pP?usp=sharing

Please suggest any improvements if needed. Thank you.

bhaktipriya commented 2 years ago

Thanks a lot for working on the tasks! Pranjal and Jenny will take a look.

Note to future applicants: Please use a different public dataset of your choice instead of the UCI Adult dataset to work on this task.

gaalocastillo commented 2 years ago

Hello @bhaktipriya! I used the COMPAS Recidivism Racial Bias dataset for these tasks. Please let me know whether that is fine or whether I should use another dataset. In the meantime, I am sharing my notebook with the tasks completed so far.

@pranjal-awasthi please take a look at my notebook with the requested tasks for this project and let me know if you have any suggestions for improvement: https://colab.research.google.com/drive/1BaFISQOc8abCIJ-elHWoO88iuQFDDa10

Thank you all!

dimoibiehg commented 2 years ago

Hello @pranjal-awasthi and @bhaktipriya, please find my notebook at the following link: https://colab.research.google.com/drive/1yS7nBIluesRJ8Bp1nz3tk2r7X9nKjqaQ?usp=sharing

I am still working on it to improve the final results. In the meantime, I would be grateful for your comments.

dimoibiehg commented 2 years ago

> Thanks a lot for working on the tasks! Pranjal and Jenny will take a look.
>
> Note to future applicants: Please use a different public dataset of your choice instead of the UCI adult dataset to work on this task.

Sorry, I saw your message very late! I have been working on my notebook since last week. What should I do?

pranjal-awasthi commented 2 years ago

> Hello @bhaktipriya ! I used the COMPAS Recidivism Racial Bias dataset to develop these tasks. Please let me know if it is fine or I should use another dataset. Anyway I share my notebook with the tasks developed for now.
>
> @pranjal-awasthi please take a look at my notebook with the requested tasks for this project and let me know any suggestions for improvement: https://colab.research.google.com/drive/1BaFISQOc8abCIJ-elHWoO88iuQFDDa10
>
> Thank you all!

@gaalocastillo : your colab looks good to me. Please formulate a preliminary project proposal with weekly tasks/deliverables and send it to me for review.

pranjal-awasthi commented 2 years ago

@dimoibiehg : your colab looks good to me. Please formulate a preliminary project proposal with weekly tasks/deliverables and send it to me for review.

gaalocastillo commented 2 years ago

> > Hello @bhaktipriya ! I used the COMPAS Recidivism Racial Bias dataset to develop these tasks. Please let me know if it is fine or I should use another dataset. Anyway I share my notebook with the tasks developed for now. @pranjal-awasthi please take a look at my notebook with the requested tasks for this project and let me know any suggestions for improvement: https://colab.research.google.com/drive/1BaFISQOc8abCIJ-elHWoO88iuQFDDa10 Thank you all!
>
> @gaalocastillo : your colab looks good to me. Please formulate a preliminary project proposal with weekly tasks/deliverables and send it to me for review.

I just sent you a project proposal draft via email; please let me know if you have any comments. Over the next few days I will work on the suggestions Jenny just left as comments in the document. Thanks for your feedback! @pranjal-awasthi

iabhishekdeshmukh commented 2 years ago

Hello @pranjal-awasthi and @bhaktipriya, could you please take a look at my notebook, which covers Starter Tasks 1 and 2? I used the Wine Quality Data Set.

Notebook: https://colab.research.google.com/drive/1NZPyxkFRw8eWmvIQ7F8kZGAMr0qbgwuj?usp=sharing

I am continuing to work on reducing the performance gap. I would welcome any suggestions or improvements. Thank you.

kishoreKunisetty commented 2 years ago

@pranjal-awasthi I sent you an email about my proposal draft with the subject "GSoC Proposal draft for Title 'Active sampling for ML Fairness'". Please review it and let me know how I can improve it.

Thank you.