Propose diff formulation for marginal prob

nicolasshu commented 3 years ago

@RaviSoji I just sent you an email with the derivation of what I think the computation for the marginal probability should be. LaTeX doesn't render here, so I'll paste the source below. This pull request is the proposed fix to that calculation. Let me know what your thoughts are =]

(I also switched log_exponent_1 with log_exponent_2 to match the order shown in Ioffe 2006)

Current:

\begin{align}
P(\boldsymbol{u}^{1\cdots n}) &= \prod_{t=1}^d C \exp (E_1 + E_2) \\
\log (C) &= -\frac{n}{2} \log(2\pi) - \frac{1}{2} \log(n\Sigma^{\mathrm{prior}} + I) \\
\log (e^{E_1}) &= E_1 = \frac{n^2 \Sigma^{\mathrm{prior}} \bar{\boldsymbol{u}}^2}{2 (n\Sigma^{\mathrm{prior}} + I)} \\
\log (e^{E_2}) &= E_2 = -\frac{1}{2} \sum_{\boldsymbol{u} \in U_{\mathrm{model}}} \boldsymbol{u}^2 \\
\log(P(\boldsymbol{u}^{1\cdots n})) &= \sum_{t=1}^d \left( \log(C) + E_1 + E_2 \right)
\end{align}

Proposed:

\begin{align}
P(\boldsymbol{u}^{1\cdots n}) = P(\boldsymbol{u}^1, \boldsymbol{u}^2, \cdots, \boldsymbol{u}^n) &= \prod_{t=1}^d \underbrace{\frac{1}{\sqrt{(2\pi)^n (\psi_t + \frac{1}{n})}} }_{C} \exp \left( \underbrace{- \frac{\bar{u}_t^2}{2(\psi_t + \frac{1}{n})}}_{E_1} \underbrace{- \frac{\sum_{i=1}^n (u_t^i - \bar{u}_t)^2}{2} }_{E_2} \right) \\
&= \prod_{t=1}^d C \exp (E_1 + E_2) \\ 
\log (P(\boldsymbol{u}^{1\cdots n})) &= \sum_{t=1}^d \color{red}{\log (C)} +\color{cyan}{ \log(e^{E_1})} + \color{magenta}{\log(e^{E_2})} \\
&= \sum_{t=1}^d \color{red}{\log (C)} + \color{cyan}{E_1} + \color{magenta}{E_2} \\
&= \sum_{t=1}^d \color{red}{-\frac{n}{2} \log (2\pi) - \frac{1}{2} \log \left(\psi_t + \frac{1}{n} \right)} \color{cyan}{ - \frac{\bar{u}_t^2}{2(\psi_t + \frac{1}{n})} } \color{magenta}{- \frac{\sum_{i=1}^n (u_t^i - \bar{u}_t)^2}{2} } \\
&= \sum_{t=1}^d \color{red}{-\frac{n}{2} \log (2\pi) - \frac{1}{2} \log \left(n\psi_t + 1 \right) + \frac{1}{2} \log(n)} \color{cyan}{ - \frac{n \bar{u}_t^2}{2(n\psi_t + 1)} } \color{magenta}{- \frac{\sum_{i=1}^n (u_t^i - \bar{u}_t)^2}{2} }
\end{align}
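
For anyone who wants to plug numbers into the two formulations above, here is a minimal pure-Python sketch. It assumes a diagonal prior covariance, so each dimension t has a scalar prior variance (written psi[t], standing in for the diagonal of \Sigma^{prior}). The function names and the data are made up for illustration and are not taken from the plda code.

```python
import math

def log_marginal_current(U, psi):
    """'Current' form. U: list of n samples, each a list of d values.
    psi: list of d prior variances (diagonal of the prior covariance)."""
    n, d = len(U), len(psi)
    total = 0.0
    for t in range(d):
        col = [u[t] for u in U]
        mean = sum(col) / n
        log_c = -0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(n * psi[t] + 1)
        e1 = (n ** 2 * psi[t] * mean ** 2) / (2 * (n * psi[t] + 1))
        e2 = -0.5 * sum(x ** 2 for x in col)
        total += log_c + e1 + e2
    return total

def log_marginal_proposed(U, psi):
    """'Proposed' form, same argument conventions as above."""
    n, d = len(U), len(psi)
    total = 0.0
    for t in range(d):
        col = [u[t] for u in U]
        mean = sum(col) / n
        log_c = -0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(psi[t] + 1 / n)
        e1 = -mean ** 2 / (2 * (psi[t] + 1 / n))
        e2 = -0.5 * sum((x - mean) ** 2 for x in col)
        total += log_c + e1 + e2
    return total

# Made-up data: n = 3 samples, d = 2 dimensions, arbitrary prior variances.
U = [[0.3, -1.2], [1.1, 0.4], [-0.5, 2.0]]
psi = [2.0, 0.5]

gap = log_marginal_proposed(U, psi) - log_marginal_current(U, psi)
print(gap, 0.5 * len(psi) * math.log(len(U)))
```

Expanding the squares shows that E_1 + E_2 is the same in both versions, so the printed gap comes entirely from the constant: (1/2) log(n) per dimension, i.e. (d/2) log(n) in total.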

RaviSoji commented 3 years ago

Hey, thanks a ton for putting so much time into checking this! I wrote back a few days ago, but in case it didn't reach you, I've summarized my reply below.

Did you check the marginal likelihood equation in the paper by comparing it to equation (42) in Kevin Murphy's derivations: https://www.cs.ubc.ca/~murphyk/Papers/bayesGauss.pdf? Kevin Murphy's equation is for the 1D case, but since the covariance matrices in the Ioffe paper are all diagonal, I believe you get independence, allowing you to apply Murphy's equation to each dimension separately and take the product.
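
One way to check a single dimension without trusting any closed form is to integrate out the class mean numerically. The sketch below assumes the model implied by the formulas in this thread: within one dimension, the class mean v is drawn from N(0, psi) and each sample u^i given v is drawn from N(v, 1). The closed form it compares against is the "Current" expression from this thread. Function names, the integration range, and the data are made up for illustration.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def marginal_by_integration(us, psi, half_width=12.0, steps=20000):
    """Trapezoid-rule estimate of the integral over v of
    N(v; 0, psi) * prod_i N(u_i; v, 1)."""
    lo = -half_width
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        f = normal_pdf(v, 0.0, psi)
        for u in us:
            f *= normal_pdf(u, v, 1.0)
        # Endpoints get half weight in the trapezoid rule.
        total += (0.5 if k in (0, steps) else 1.0) * f
    return total * h

def log_marginal_closed_form(us, psi):
    """'Current' closed form for one dimension (unit within-class variance)."""
    n = len(us)
    mean = sum(us) / n
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(n * psi + 1)
            + (n ** 2 * psi * mean ** 2) / (2 * (n * psi + 1))
            - 0.5 * sum(u ** 2 for u in us))

# Made-up data for one dimension.
us = [0.3, 1.1, -0.5]
psi = 2.0

numeric = math.log(marginal_by_integration(us, psi))
closed = log_marginal_closed_form(us, psi)
print(numeric, closed)
```

With diagonal covariances, the d-dimensional log marginal is the sum of these per-dimension values, which is the "apply it to each dimension and take the product" step in log space.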

I think when I was writing this code, I found that the marginal likelihood equation in the paper wasn't correct. That is, when I derived the posterior, posterior predictive, and marginal likelihood equations myself, mine matched Kevin Murphy's. Moreover, after plugging in data, I found that the paper's equation and mine actually gave different results.

That said, it was a long time ago, so I could be misremembering or I could have done something incorrectly. I will try to check the equations algebraically and computationally as soon as I get the chance.

Thanks again! Ravi B. Sojitra

nicolasshu commented 3 years ago

Hey! Sorry I didn't get back to you sooner! I saw your email but had to deal with some other work first. I've been double-checking all the math again. I think you have a point, but I'm still in the middle of it, and I didn't want to respond with a flaky answer. Do you mind giving me a few more days for my own sanity? After your email, I started getting skeptical about Ioffe's paper, so I started looking at everything with a fresh eye.

RaviSoji commented 3 years ago

Oh no! I didn't mean to make you doubt the whole paper, haha. In any case, take your time. Let me know if it'd help to hop on a Zoom call or something at some point.

RaviSoji commented 3 years ago

Closing! Based on your e-mail a while back, it sounds like you also think there is a typo in the original paper. Happy to reopen if you end up concluding otherwise.

Thanks again! Ravi B. Sojitra

RaviSoji / plda

Propose diff formulation for marginal prob #54