jakevdp / wpca

Weighted Principal Component Analysis (PCA) in Python
BSD 3-Clause "New" or "Revised" License

comparison with FA? #3

Open agramfort opened 8 years ago

agramfort commented 8 years ago

hey @jakevdp

how does it compare with FactorAnalysis? FactorAnalysis is designed precisely to handle heteroscedastic noise on the features.

jakevdp commented 8 years ago

They're different models though, right? Factor Analysis is explicitly looking for lower-dimensional latent variables, while PCA is just maximizing variance in components. I think the difference between PCA and Factor Analysis in the presence of heteroskedastic errors is the same as the difference in their absence.

agramfort commented 8 years ago

you can see PCA as a generative model with

x = Uz + n, with U in R^{p \times k} and n \sim N(0, \sigma^2 Id)

and the FA model:

x = Uz + n, with U in R^{p \times k} and n \sim N(0, \Sigma) where \Sigma = diag(\sigma_1^2, \dots, \sigma_p^2)

so FA is the same as PCA except that it assumes a feature-specific noise level.
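To make the two noise models concrete, here is a minimal NumPy sketch that samples from both and checks the covariance each one implies. The dimensions, the noise variances, and the choice z \sim N(0, Id) are assumptions made for illustration; they are not stated above.

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, n_samples = 5, 2, 200_000          # illustrative sizes (assumed)

U = rng.normal(size=(p, k))              # loading matrix, U in R^{p x k}
z = rng.normal(size=(n_samples, k))      # latent variables, assumed z ~ N(0, Id)

# PCA-style noise: a single variance sigma^2 shared by all features
sigma2 = 0.5
x_pca = z @ U.T + rng.normal(scale=np.sqrt(sigma2), size=(n_samples, p))

# FA-style noise: one variance per feature, Sigma = diag(sigma_1^2, ..., sigma_p^2)
sigma2_per_feature = np.array([0.1, 0.5, 1.0, 2.0, 4.0])
x_fa = z @ U.T + rng.normal(size=(n_samples, p)) * np.sqrt(sigma2_per_feature)

# Covariance of x implied by each model: U U^T plus the noise covariance
cov_pca = U @ U.T + sigma2 * np.eye(p)
cov_fa = U @ U.T + np.diag(sigma2_per_feature)

# Largest absolute gap between the empirical and the implied covariance
err_pca = np.abs(np.cov(x_pca, rowvar=False) - cov_pca).max()
err_fa = np.abs(np.cov(x_fa, rowvar=False) - cov_fa).max()
print(err_pca, err_fa)
```

Both gaps should shrink as `n_samples` grows. The only difference between the two models is the diagonal term added to U U^T.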

also if you look at how it's implemented in sklearn you'll see that it's basically SVDs computed iteratively.
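A quick way to see the practical difference is to fit both sklearn estimators on data with feature-specific noise and look at the noise they estimate. This is only a rough sketch: the synthetic data, sizes, and noise levels are made up for illustration, and it says nothing about how wpca itself behaves.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(0)
n_samples, p, k = 2000, 6, 2                          # illustrative sizes (assumed)

U = rng.normal(size=(p, k))                           # loadings
z = rng.normal(size=(n_samples, k))                   # latent variables
noise_var = np.array([0.1, 0.2, 0.5, 1.0, 2.0, 4.0])  # per-feature noise variances (assumed)
X = z @ U.T + rng.normal(size=(n_samples, p)) * np.sqrt(noise_var)

pca = PCA(n_components=k).fit(X)
fa = FactorAnalysis(n_components=k).fit(X)

# PCA estimates one shared noise variance; FA estimates one per feature
print("PCA noise variance:", pca.noise_variance_)
print("FA noise variances:", fa.noise_variance_)

# Average per-sample log-likelihood of X under each fitted model
print("PCA score:", pca.score(X))
print("FA score: ", fa.score(X))
```

With noise this uneven, one would expect the per-feature estimates from `FactorAnalysis` to roughly track `noise_var`, while `PCA` can only report a single averaged value.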

jakevdp commented 8 years ago

Alright, then I'll change my answer. The similarities between PCA and FA with heteroskedastic noise are the same as the similarities between PCA and FA with homoskedastic noise :smile:

agramfort commented 8 years ago

:)

now you can add it in the benchmark :) :)