Hi Professor Srebro,
Thank you for sharing your paper with us. I have been following criminal justice research for years, ever since Northpointe's release of COMPAS caused controversy and ProPublica published their critique. I am thrilled to read about a practical way to protect sensitive personal attributes and hopefully reduce the biases in risk-assessment-driven bail algorithms (with applications in many other fields as well). This is amazing!
A few questions, mainly about \gamma_a(\tilde{Y}) and the optimization built on it. I have no problem with rewriting it as a linear program with constraints, but I didn't quite get how you arrive at Claim 3.2; it would be great if you could walk through some of the details during the workshop (see the sketch after this comment). Thank you so much, and I am really looking forward to your talk!
--Lynette
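For the linear-program question above, here is a minimal sketch of how the derived predictor could be set up, assuming a binary base predictor and a binary protected attribute with 0/1 loss. It is an illustration of the construction the paper describes, not the authors' code; the function name and inputs are placeholders.

```python
# A minimal sketch (not the authors' released code) of the linear program for
# deriving an equalized-odds predictor Y_tilde from a binary predictor Y_hat,
# assuming a binary protected attribute A in {0, 1} and 0/1 loss.
# Decision variables: p[y_hat, a] = Pr[Y_tilde = 1 | Y_hat = y_hat, A = a].
import numpy as np
from scipy.optimize import linprog

def derive_equalized_odds(tpr, fpr, base_rate, group_weight):
    """tpr[a], fpr[a]: true/false positive rates of Y_hat within group a.
    base_rate[a] = Pr[Y = 1 | A = a]; group_weight[a] = Pr[A = a]."""
    # Variable order: p[0,0], p[1,0], p[0,1], p[1,1]
    def tpr_coef(a):
        # TPR of Y_tilde in group a is linear in p[:, a]
        c = np.zeros(4)
        c[2 * a] = 1 - tpr[a]
        c[2 * a + 1] = tpr[a]
        return c

    def fpr_coef(a):
        # likewise for the false positive rate
        c = np.zeros(4)
        c[2 * a] = 1 - fpr[a]
        c[2 * a + 1] = fpr[a]
        return c

    # Expected 0/1 loss (up to a constant) = sum_a w_a * [(1 - br_a) * FPR_a - br_a * TPR_a]
    cost = sum(group_weight[a] * ((1 - base_rate[a]) * fpr_coef(a)
                                  - base_rate[a] * tpr_coef(a)) for a in (0, 1))

    # Equalized odds: equal TPR and equal FPR across the two groups
    A_eq = np.vstack([tpr_coef(0) - tpr_coef(1), fpr_coef(0) - fpr_coef(1)])
    res = linprog(cost, A_eq=A_eq, b_eq=np.zeros(2), bounds=[(0, 1)] * 4)
    return res.x  # mixing probabilities [p00, p10, p01, p11]
```

The derived predictor is randomized: within each (Y_hat, A) cell it outputs 1 with the corresponding mixing probability, which is why equalized odds can be imposed as two linear equality constraints.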
Thank you so much for this interesting research! I think ensuring fairness in machine learning is one of the most important and exciting directions in the field right now. You seem to be proposing an excellent solution that not only refines the definition of fairness but also applies it while improving accuracy. Now I just want to know what the next steps are! Publicizing this work and presenting it are great steps. I think it should spread even more widely - do you plan to create an sklearn package implementing these algorithms and publicize their use? Do you want to work with different industries/institutions to help change their current practices? Looking forward to hearing about it!
Thanks a lot for sharing your research with us, Professor Srebro! It's fascinating to see how fairness can be promoted in machine learning. My question is how your methods balance horizontal and vertical equity: how can they balance treating everyone equally against treating people differently based on their backgrounds?
Thanks a lot for sharing this interesting research! This topic can be quite sensitive, yet it doesn't receive as much attention as it should, and I've seen more than a few instances of discriminatory data, algorithms, and predictions in practice.
However, I'm having difficulty understanding your definition of a non-discriminatory predictor, even after reading it back and forth a couple of times. Would you mind further explaining how A is being protected in this definition? More examples would be great!
Also, I noticed that it has been some time since this article was published. Has your team made any statistical packages available for estimating the non-discriminatory predictor? I don't feel very confident deriving it myself in my research based on this paper alone.
Hi Professor Srebro, I am struggling with how to connect the outstanding work done in this paper with others' work on "de-biasing" deep models. Would it be fair to interpret your work as eliminating biases "before" the real training takes place, while debiasing work is carried out in deep models "after" training (right before deployment)?
Thanks so much again for presenting your work!
In your conclusion you summarize (accurately, in my opinion) that your method has a major advantage in that someone could implement it as a simple post-processing step rather than (as you note in the longer version on arXiv) needing to change their machine learning training pipeline or needing to perform some transformation on the raw data (your method only requires aggregate information about the data). I concur with my classmates in wanting to see what the "next steps" are to get researchers and institutions to start using this new method! Also in that same conclusion section, you add that it would also be easy to add a Differential Privacy mechanism to your method if the situation requires it. This makes sense to me and is also an important advantage, but as we know, there are many Differential Privacy mechanisms that seek to achieve multiple different types/notions of privacy. Just as you make a strong case that some non-discrimination approaches are better than others, which types of privacy do you think we should prefer in this same use-case of credit scores? (Obviously that is an entire area of research, which you are aware of since you cite the work of Dwork and her colleagues, but since credit scores are--perhaps undesirably from some points of view--extremely central to much of our economic life, I wanted to see if you have additional thoughts to extend this example even further.)
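On the differential privacy point raised above: since the post-processing step only needs aggregate group-conditional statistics, one standard option (a sketch under my own assumptions, not something specified in the paper) would be to release those aggregates through the Laplace mechanism before deriving the thresholds.

```python
# Hypothetical illustration: privatize the aggregate counts that the
# post-processing step consumes with the standard Laplace mechanism.
# Neither the function name nor the count layout comes from the paper.
import numpy as np

def privatize_counts(counts, epsilon, rng=None):
    """counts: dict mapping (a, y, y_hat) cells to raw counts.
    Each individual falls in exactly one cell, so adding or removing one
    record changes the histogram by at most 1; Laplace(1/epsilon) noise per
    cell then gives epsilon-differential privacy for the released histogram."""
    rng = rng or np.random.default_rng()
    return {cell: n + rng.laplace(0.0, 1.0 / epsilon) for cell, n in counts.items()}
```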
Thank you so much for this interesting research! It's exciting to see how machine learning can encourage fairness. I guess my biggest question is simply what you think the next step would be. My sense is that you are pointing toward a future of interdisciplinary collaboration, and that future is definitely worth more attention!
Hi, thank you for sharing this work, which provides a bias-removing computational framework for socio-economic research in general. My question is whether the predictor can be generalized to sources of bias other than gender or race. Bias and discrimination ultimately originate from placing stereotypes and signals on specific groups of people, so even more mundane attributes such as education level, home region, or age could cause discrimination. Is it possible to make your method multi-dimensional and rule out many biases at once?
Hi Professor Srebro,
Thank you for presenting your work. I have two questions: (1) As you mention in the introduction, there are many different notions of fairness. Can we show that this particular notion (i.e., equalized odds/equal opportunity) is compatible with other notions of fairness? Alternatively, are there any results showing that various notions of algorithmic fairness cannot coexist? (2) If I am understanding correctly, the results in the paper do not assume independence of A and Y. I struggle to make sense of equalized odds in situations where A and Y are not independent. Can you give us some examples where this notion of fairness might not be relevant or appropriate, to better contextualize the definition?
Dear Dr. Srebro, thank you very much for sharing this research with us. It is excellent work. In the paper, you state a limitation of the research: "different dependency structures with possibly different intuitive notions of fairness cannot be separated based on any oblivious notion or test." I would like to know what you plan to do next to address this problem.
I agree with Bowen: could you elaborate on how these various notions of fairness relate to each other? And are they necessarily compatible?
Hi Professor Srebro, thank you so much for sharing your research with us. This is a very interesting topic. I have a few questions, mainly about the implications and applications of your research approach:
Dear Dr. Srebro, thank you for sharing your research! I think this topic is really interesting. I know that the FICO score is a really good application of this tool. However, I wonder whether there are other applications of this supervised learning approach that are less obvious and that we do not usually hear about. It would be really interesting to learn about more real-life applications of this technique.
Professor, you mention that statistical prediction is inherently a discrimination task: it discriminates between cases we predict will be in one target class versus the other. But why do you also say that 'we frequently do not want to employ certain kinds of discrimination based on prescribed "protected attributes"'? I do not fully understand this statement and hope that you can elaborate on it. Also, what kinds of discrimination are acceptable when conducting research? Thanks
Thank you, Professor, for introducing a non-discriminatory method for supervised learning that addresses the inherent discrimination generated by the classification procedure. Would you mind elaborating in the presentation on the procedure for shifting the burden of uncertainty in classification from the protected class to the decision maker? Thanks!
Dear Professor Srebro, thank you for sharing your research. I am looking forward to learning how you deal with discrimination in statistics!
Thank you, Professor, for sharing your work! In the paper, you propose a framework to remove discrimination in supervised learning. Inspired by the introduction, and considering that algorithms are designed by humans, who may carry discrimination in their minds, and that big data contains a great deal of personal information, which might introduce new discrimination, I am really curious whether, and to what degree, big data and algorithms will perpetuate existing discrimination or even create new opportunities for discrimination.
Hi Professor, thanks for sharing! I have several questions about the reading this week. First, I would like to know what redundant encoding means and why its existence makes it hard to achieve fairness simply by ignoring the protected attribute. Also, could it happen that in some areas ensuring fairness sacrifices predictive ability? I assume that in some cases the protected attributes do carry hidden but important information that directly affects the learning model. What other methods can we use to balance this trade-off?
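On the redundant encoding question: here is a toy, fully synthetic illustration (not from the paper) of why dropping the protected attribute alone is not enough; the remaining features can recover it almost exactly.

```python
# Toy, synthetic illustration of redundant encoding: the protected attribute A
# is excluded from the features, yet a correlated proxy recovers it almost
# perfectly, so a downstream model can still effectively condition on A.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
A = rng.integers(0, 2, size=n)                      # protected attribute (held out)
zip_region = 10 * A + rng.integers(0, 3, size=n)    # proxy strongly tied to A
income = rng.normal(50 + 5 * A, 10, size=n)         # weaker, noisier proxy

X = np.column_stack([zip_region, income])           # note: A itself is NOT included
proxy = LogisticRegression(max_iter=1000).fit(X, A)
print("A recovered from the remaining features:", proxy.score(X, A))  # close to 1.0
```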
The notion of equalized-odds predictors is novel, and it would be great if it could be generalized to encourage the construction of predictors that are accurate in all groups. One of my questions: could you give some other examples or situations where the proposed method can be applied? Another: how do you define fairness more precisely and deal with the trade-off among accuracy, predictability, and interpretability? I hope to learn more about the case study in tomorrow's presentation and go through the details and the intuition behind your methods!
Professor Srebro,
Thank you so much for bringing us such great work. I was just wondering how you would explain the mechanism that produces the discrimination embedded in the algorithm. How do we determine that the biases come from the sensitive features? And why do we consider such biases to be the kind of discrimination defined in social science?
Hi Dr. Srebro, thank you for sharing your research! My questions are the same as several people have mentioned: 1) how do you determine the "protected attributes"? 2) when different notions of fairness collide, how can we balance them using your approach?
Hi Professor Srebro,
Thanks for sharing your research with us. I have a question as follows:
Dear Prof. Srebro, thank you for sharing. Statistically speaking, your work is quite convincing to me, but I would like to know more about application scenarios. I hope you can provide more implementation suggestions, especially in the social science domain, in tomorrow's presentation. Looking forward to your talk!
Thanks for sharing your work, Professor Srebro. I can see that your research addresses a critical problem, especially in criminal justice: ML algorithms sometimes give discriminatory predictions based on certain attributes, which may place some groups of people at a disadvantage. While I was reading your paper, I didn't follow your definition of the non-discriminating (derived) predictor, and I'd like to hear more about that. Thanks.
Thank you, Professor Srebro, for drawing attention to bias in machine learning and developing a great way to achieve both fairness and accuracy. I am wondering what the cost of algorithmic fairness would be and how you would balance the trade-off. Thank you!
Hi Prof. Srebro, thank you very much for sharing an interesting piece of work with us! Your work is important to the development of supervised learning. I am just curious how other social science disciplines might benefit from your framework. Thanks!
Thank you for sharing your interesting work, Prof. Srebro. I find the work exciting and the topic of algorithmic fairness crucial. I am curious about your take on the balance between accuracy and fairness, especially when the set of features is limited. Depending on the setting, one may also have many variables characterizing fairness (in your FICO score example, there is only the single variable of race). How would an increase in the number of fairness variables impact prediction accuracy?
Hi Professor Srebro, thanks for coming to our workshop! The paper gives a very detailed analysis of the balance between profit and equality considerations when building the model. I am wondering what the most common choice of profit-maximizing classifier is that companies actually use in real life when making predictions.
Hi Professor. In your paper, you point out a shortcoming of your study: it "cannot separate different dependency structures, with potentially different intuitive notions of fairness, based on any oblivious concept or test." I would like to know about the follow-up work. Thanks!
Hi Professor Srebro, thank you so much for the research paper you shared. I'm really interested in the convex optimization part, and a little confused by it as well. I would appreciate it if you could share more about this part in the workshop. I also wonder whether there is any way to achieve the same goal without sacrificing convexity? Thanks!
Hi Professor Srebro, thank you very much for sharing. Discrimination is an important social science problem, and I am very interested in how to build a model that makes predictions without discrimination, and in the details of the model. I am looking forward to hearing from you tomorrow. Thanks!
Thank you so much for sharing your work with us!
First, I am wondering what the specific meaning of a non-discriminatory predictor is; that is a concept I am somewhat confused about, and there is not much explanation of it in the paper. Also, the conclusion mentions that your methodology has the advantage that someone can easily implement it as a simple post-processing step, without changing the existing machine learning training pipeline or transforming the raw data. Can you elaborate more specifically on how it can be applied in other fields of study? Have there been any successful examples?
Looking forward to your talk.
Hi Prof. Srebro, thank you for sharing your work! It is interesting to see that in this approach only the main source of bias (i.e., the protected attribute) is included in the prediction adjustment, but not all the other attributes that may interact with or offset the effect of race in biased decisions. I am curious, however, in what contexts we would care about equal opportunity instead of equalized odds, given that equalized odds takes both true positives and false positives into account. Another question I have is about other applications of this model. In the loan example, a real-valued score (FICO) is available for deriving the prediction threshold; but in cases where no exact score is available, college admissions for example, how can we adjust and apply this model to achieve fairness?
Hi Prof. Srebro! Thank you for sharing your ideas with us. The paper describes how to adjust learned predictors to reduce discrimination under your definition. I am still hoping for some intuitive explanation of how these predictors are derived. Also, I would like to know whether there are limitations to the predictors we derive this way.
Thanks for sharing your work, professor.
Hello Prof. Srebro, I think your research on fairness in modeling is meaningful and important. I'm wondering whether it's possible to add a post hoc component to the model-building process to validate the fairness or utility of the model against actual outcomes. And is it possible to continually update the model with frequent feedback on real-life outcomes?
The issue of discrimination in machine learning is important to the widespread practical use of such methods. The naive approach of simply excluding discriminatory features also fails to account for the multicollinearity of those features with features that are not explicitly discriminatory; for example, excluding disability status while including credit history, which may be confounded by disability status. What methods exist to reduce discriminatory noise in implicitly discriminatory features?
Hi Professor, thank you for sharing your great research with us. I like the figures in this paper, which helped me better understand the research. I have two practical questions. First, to what extent do you think we should depend on the data to achieve fairness? Second, does it require a lot of computing resources to accomplish that?
Hello Professor Srebro, thank you very much for sharing your research paper. I'm interested in, and a little confused by, the convex optimization part; I would appreciate it if you could share more on this at the seminar. Interestingly, in this approach only the primary source of bias (the protected attribute) is included in the predictions, not all the other attributes that might interact with or offset the effects of race in biased decisions. I wonder, however, at what point we would care about equal opportunity as opposed to equalized odds, given that equalized odds takes into account both true and false positives?
Hi Prof. Srebro, I would like to know more about the interpretation of your approach, and whether it could later be used in more controversial fields such as court decisions or medical services, since implicit racial discrimination in those fields could be even harder to detect and fix.
Hi Professor,
I'm interested in the extent to which your approach could be generalised to wider definitions than the two you put forth in the paper. Given the highly philosophical and contested question of what it means for individuals to have equality, I'd be interested in whether a general routine could be programmed to fit optimal classifiers to a wider class of "social welfare" functions, with adjustable emphasis on different elements of equality and on what it means to be non-discriminatory.
Thank you for sharing this work, Prof. Srebro. The innovative idea of using threshold predictors to deal with the fairness issue is very straightforward and efficient, and I'm looking forward to hearing the detailed explanation. I wonder how much domain knowledge is required to implement such methods when a score like FICO is not available, and how that impacts fairness.
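On the threshold question above: when a real-valued score is available, a simplified sketch of the equal-opportunity version of the idea is to pick a per-group cutoff that matches a common true positive rate. The function below is illustrative only, under my own simplifying assumptions, and is not the paper's exact procedure.

```python
# Simplified sketch: choose a per-group threshold on a real-valued score so
# that every group attains (approximately) the same true positive rate.
# This mirrors the equal-opportunity variant; the full equalized-odds
# construction in the paper may additionally require randomizing between
# two thresholds per group.
import numpy as np

def equal_opportunity_thresholds(score, y, group, target_tpr=0.8):
    """score, y, group: 1-D arrays of scores, binary outcomes, group labels."""
    thresholds = {}
    for g in np.unique(group):
        positives = score[(group == g) & (y == 1)]
        # the (1 - target_tpr) quantile of positive scores gives TPR ~= target_tpr
        thresholds[g] = np.quantile(positives, 1 - target_tpr)
    return thresholds
```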
Hi Professor, thanks for sharing your work with us! I am particularly curious about the cost of, and blockers to, implementing this fairness measure in industry. You mention that equalized odds would encourage developers to construct predictors that are accurate in all groups and more directly related to the outcome variable. I suspect this might not be so easy, since it may take a considerable amount of labor to collect data and dig for features. So I am just wondering about the attitudes of large tech firms such as Google :)
Thank you for coming, Professor Srebro! I really like the FICO score case study. Would you elaborate on how you selected/came up with the five constraints? Thank you!
Thank you for sharing, Professor Srebro. The loss function you use, which takes a pair of labels and returns a real number reflecting how far the prediction is from the correct label, sounds really interesting. I am also curious how this loss function is used to optimize the binary predictor in your research. Are there any notable differences compared to the Bayes optimal predictors you mention in the following sections? Looking forward to your talk tomorrow!
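For reference, the setup being asked about can be written out as follows (standard notation: the loss takes a pair of labels and returns a real number, and the Bayes optimal predictor is the one minimizing expected loss):

```latex
\ell : \{0,1\} \times \{0,1\} \to \mathbb{R}, \qquad
\operatorname{risk}(\hat{Y}) = \mathbb{E}\big[\ell(\hat{Y}, Y)\big], \qquad
Y^{*} \in \operatorname*{arg\,min}_{\hat{Y}} \mathbb{E}\big[\ell(\hat{Y}, Y)\big].
```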
Hi Professor Srebro! I am very interested in your paper, but would it be possible for you to explain the two concepts of equalized odds and equal opportunity? Looking forward to hearing your insights tomorrow!
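For reference while reading, the two definitions in the paper (for binary Y, Y-hat, and A) can be stated as:

```latex
% Equalized odds: equal true positive and false positive rates across groups
\Pr[\hat{Y}=1 \mid A=0, Y=y] = \Pr[\hat{Y}=1 \mid A=1, Y=y], \qquad y \in \{0,1\}.

% Equal opportunity: the same condition required only for Y = 1,
% i.e. equal true positive rates across groups
\Pr[\hat{Y}=1 \mid A=0, Y=1] = \Pr[\hat{Y}=1 \mid A=1, Y=1].
```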
Hi Prof. Srebro, thanks for sharing. I am impressed by the simple and interpretable notion of non-discrimination with respect to a specified protected attribute. I still have some questions about it. 1) Are equalized odds and equal opportunity sufficient to measure non-discrimination? 2) Since you only use one specified protected attribute, how do you determine that this attribute is relevant and important? What if the predictor is biased and discriminatory with respect to other attributes? Thank you!
Hi Prof. Srebro, looking forward to tomorrow's talk! I found your work to be very much grounded in pure statistics and econometrics. I am curious whether there are any interesting social science applications of your research? Thanks!
Dear Professor Srebro, thank you for sharing your work. I am mainly concerned with the applicability of the approach, so I wonder whether it is possible to broaden the set of biases considered beyond gender or race. Is it possible to enrich the set of protected attributes, for example with education level or age?
Thank you Professor Srebro for sharing your interesting research! It is amazing to see how you address the possible discrimination of algorithms in supervised learning. Would like to hear more about the details in the workshop!
Also, it seems that in this non-discriminatory algorithm the set of protected attributes A is given. Hence, there seems to be an underlying assumption that the protected attributes (and the attributes that could be discriminated against) are known. But sometimes in practice A may be only partly known (for instance, there could be an unrecognized mechanism of discrimination). How do you deal with cases where A cannot be fully understood or captured?
Comment below with a well-developed question or comment about the reading for this week's workshop. These are individual questions and comments.
Please post your question by Wednesday 11:59 PM, and upvote at least three of your peers' comments on Thursday prior to the workshop. You need to use 'thumbs-up' for your reactions to count towards 'top comments,' but you can use other emojis on top of the thumbs up.