zyxue closed 2 years ago
Q1: Yup, although it should be the average over many baselines drawn from some set of samples $D$.
Q2: Yeah, I think $S\subseteq C$ should be $S\subseteq N\setminus \{i\}$.
Q3: It should become $\beta_i(x_i^f - \mu_i)$, where $\mu_i = \frac{1}{|D|}\sum_{x^b\in D} x_i^b$ is the mean of feature $i$ over the baselines. But I assume all the features have zero mean, so the $\mu_i$ term vanishes. This is stated in footnote 8: "Note that we assume features have zero mean in the following calculations."
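In case a numerical check is useful, here is a minimal sketch (not code from the paper) that brute-forces the Shapley sum for a linear model $f(x)=\beta^\top x$ and averages it over baselines. The names `intervene` and `shapley_linear_interventional` are hypothetical helpers, and the data is random; the only assumption is that $do(x_S)$ means "overwrite the features in $S$ of a baseline with the explicand's values".

```python
import itertools
import math

import numpy as np


def intervene(x_b, x_f, S):
    """Copy baseline x_b and overwrite the features indexed by S with x_f's values."""
    x = x_b.copy()
    idx = np.array(S, dtype=int)  # explicit int dtype so the empty set indexes cleanly
    x[idx] = x_f[idx]
    return x


def shapley_linear_interventional(beta, x_f, D):
    """Brute-force Shapley values for f(x) = beta @ x, averaged over baselines in D."""
    n = len(beta)
    phi = np.zeros(n)
    for x_b in D:
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for k in range(n):
                # Shapley weight for subsets S of N \ {i} with |S| = k
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                for S in itertools.combinations(others, k):
                    with_i = beta @ intervene(x_b, x_f, S + (i,))
                    without_i = beta @ intervene(x_b, x_f, S)
                    phi[i] += w * (with_i - without_i)
    return phi / len(D)


# Illustrative random data (not from the paper)
rng = np.random.default_rng(0)
D = rng.normal(size=(5, 3))          # 5 baselines, 3 features
beta = np.array([1.5, -2.0, 0.5])    # linear model coefficients
x_f = np.array([0.3, 1.0, -0.7])     # explicand

phi = shapley_linear_interventional(beta, x_f, D)
closed_form = beta * (x_f - D.mean(axis=0))
print(np.allclose(phi, closed_form))
```

For a linear model the marginal contribution of feature $i$ is $\beta_i(x_i^f - x_i^b)$ for every subset $S$, so the subset weights sum to one and only the average over baselines remains. If the columns of `D` are centered first, the closed form reduces to $\beta_i x_i^f$.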
Hope that helps!
> "Note that we assume features have zero mean in the following calculations."
I see! I missed this. Thank you!
I'm looking at this derivation in Section 1.5.
Q1. Do I understand $E_D[x \mid do(x_S)]$ correctly? For example, if a sample from $D$ is $(x_1, x_2, x_3)$ and $x_S = (y_2)$, then $x \mid do(x_S)$ is $(x_1, y_2, x_3)$?
Q2. Also, is $C$ the same as $N\setminus \{i\}$, which is the notation you used at the top?
Q3. In the result, where does the $x_i^b$ term go? I thought the derivation would be very similar to that in Section 1.4 (shown below)