fani-lab / Adila

Fairness-Aware Team Formation

A Brief Summary on Propensity Score #30

Open Hamedloghmani opened 1 year ago

Hamedloghmani commented 1 year ago

First, we should look at what the propensity score is used for and when it applies. Propensity score analysis is a class of statistical methods developed for estimating treatment effects with nonexperimental or observational data. Specifically, it offers an approach to program evaluation when randomized trials are infeasible or unethical, or when researchers need to assess treatment effects from survey data, census data, administrative data, medical records data, or other types of data “collected through the observation of systems as they operate in normal practice without any interventions implemented by randomized assignment rules” (Rubin, 1997, p. 757).

In the social and health sciences, researchers often face the fundamental task of drawing conditioned causal inferences from quasi-experimental studies. The analytical challenges in making causal inferences can be addressed by a variety of statistical methods, including a range of new approaches emerging in the field of propensity score analysis. Although it is a new and rapidly growing class of evaluation methods, propensity score analysis is by no means regarded as the best alternative to randomized experiments. In empirical research, it is still unknown under what circumstances the approach reduces selection bias and under what circumstances the conventional regression approach (i.e., use of statistical controls) remains adequate. However, prominent researchers also agree that the propensity score approach has reached a mature level.

To define the propensity score, we introduce the following notation: let X=(X1,..,Xn) represent the confounders that are measured prior to intervention initiation (referred to as “baseline confounders” below), so that Xi=(X1i,..,Xni) is the vector of values of the n confounders for the ith subject. Let T represent the available interventions, with T=1 indicating that the subject is in the treated group and T=0 that the subject is in the control group. For the ith subject, the propensity score is the conditional probability of being in the treated group given their measured baseline confounders: p(Xi) = Prob(Ti = 1 | Xi)

Intuitively, subjects with the same propensity score have the same chance of receiving treatment. Thus, the propensity score is a tool to mimic randomization when randomization is not available. In other words, a propensity score is the probability of a unit (e.g., person, classroom, school) being assigned to a particular treatment given a set of observed covariates. Propensity scores are used to reduce selection bias by equating groups based on these covariates. The same definition also appears in a different notation: suppose that we have a binary treatment indicator Z, a response variable r, and observed background covariates X. The propensity score is then defined as the conditional probability of treatment given the background variables: e(x) = Prob(Z = 1 | X = x)
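To make the definition concrete, here is a minimal sketch that estimates e(x) = Prob(Z = 1 | X = x) for a single discrete covariate as the fraction of treated subjects within each covariate value. The function name `estimate_propensity`, the "senior"/"junior" covariate, and the toy data are all made up for illustration; with many or continuous covariates, a model such as logistic regression would typically be fitted instead.

```python
from collections import defaultdict


def estimate_propensity(covariates, treatments):
    """Estimate e(x) = P(Z = 1 | X = x) as the treated fraction in each stratum of x."""
    counts = defaultdict(lambda: [0, 0])  # x -> [number treated, number of subjects]
    for x, z in zip(covariates, treatments):
        counts[x][0] += z
        counts[x][1] += 1
    return {x: treated / total for x, (treated, total) in counts.items()}


# Hypothetical toy data: one covariate value and one treatment flag per subject.
covariates = ["senior", "senior", "senior", "senior",
              "junior", "junior", "junior", "junior"]
treatments = [1, 1, 1, 0, 1, 0, 0, 0]  # 1 = treated, 0 = control

propensity = estimate_propensity(covariates, treatments)
for x, e in propensity.items():
    print(f"e({x}) = {e:.2f}")
```

In this toy data, three of the four "senior" subjects are treated but only one of the four "junior" subjects, so the two groups have different propensity scores. Comparing treated and control subjects that share the same score is what mimics randomization.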


References:
[1] Guo, Shenyang, and Mark W. Fraser. Propensity Score Analysis: Statistical Methods and Applications. Vol. 11. SAGE Publications, 2014.
[2] Faries, Douglas, et al. Real World Health Care Data Analysis: Causal Methods and Implementation Using SAS. SAS Institute, 2020.
[3] Van der Laan, Mark J., and Sherri Rose. Targeted Learning: Causal Inference for Observational and Experimental Data. Vol. 10. New York: Springer, 2011.

hosseinfani commented 1 year ago

@Hamedloghmani Thanks for the explanation, but I am not sure I could understand the propensity in its simplest intuitive way :( Did you?

Hamedloghmani commented 1 year ago

@hosseinfani Thank you so much for your comment. I thought about it a lot, and this is the most condensed definition that I could come up with: Imagine we have a group of members and we run a binary treatment experiment, meaning that each member gets either the control or the treatment (1 indicates treatment). Then, we have a set of variables X=(X1, … , Xn) that affect both the dependent and the independent variables in our experiment. We measure this set before the intervention (before the start of our trial). Then, for a member i, the propensity score is the conditional probability of getting the treatment (instead of the control) given Xi (the values of the variables in X for the ith member): P(Ti=1 | Xi)

hosseinfani commented 1 year ago

@Hamedloghmani Still I am confused. Can you give me a real example?

Hamedloghmani commented 1 year ago

@hosseinfani Sure. Propensity can also be read as the tendency or likelihood of an event happening. In the case of our example in recommender systems, items with commonly observed interactions have a higher propensity, meaning that an interaction with them is more likely to be observed than an interaction with a rare item. Our goal is to down-weight the commonly observed interactions while up-weighting the rare ones.
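As a rough sketch of that re-weighting, assume that an item's propensity is simply its share of all observed interactions and that each interaction is weighted by the inverse of that propensity. Both the popularity-based estimate and the function name `inverse_propensity_weights` are assumptions made for illustration, not necessarily what Adila implements.

```python
from collections import Counter


def inverse_propensity_weights(interactions):
    """Return (propensity, weight) per item, where weight = 1 / propensity.

    Assumption: an item's propensity is its share of all observed interactions.
    """
    counts = Counter(interactions)
    total = sum(counts.values())
    propensity = {item: count / total for item, count in counts.items()}
    weights = {item: 1.0 / p for item, p in propensity.items()}
    return propensity, weights


# Hypothetical interaction log: one entry per observed user-item interaction.
interactions = ["popular_item"] * 8 + ["rare_item"] * 2

propensity, weights = inverse_propensity_weights(interactions)
for item in propensity:
    print(f"{item}: propensity={propensity[item]:.2f}, weight={weights[item]:.2f}")
```

Under this assumption, the popular item's propensity is four times that of the rare item, so each of its interactions counts four times less. Every item then contributes the same total weight, however often it was observed.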