hredestig / pcaMethods

Perform PCA on data with missing values in R
GNU General Public License v2.0

Why PPCA performs better than BPCA #16

Closed: miaomiao6606 closed this issue 3 years ago

miaomiao6606 commented 3 years ago

Hi, thanks for reading this message. I am using both PPCA and BPCA with the same parameters, and the results show that PPCA performs slightly better than BPCA. However, the articles describing these two methods suggest that BPCA should be better, because it uses a more reasonable posterior distribution. I also checked the metabolic data set (the example dataset), and there too PPCA is slightly better. I am very confused. Do you have any thoughts on why this happens?

hredestig commented 3 years ago

I believe this is very dataset dependent. BPCA has frequently been found to provide better missing value imputation than PPCA, but that does not hold for every dataset. It is pretty common for different algorithms to do better on different datasets, which is why it is worth being familiar with several of them.
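One way to check which method suits a particular dataset is to hide some known values, impute them with each method, and compare the error on the hidden entries. The following is only a minimal sketch of that idea, not the comparison run by either poster. It assumes the `metaboliteDataComplete` example matrix shipped with pcaMethods; the 5% hold-out fraction, the seed, `nPcs = 3`, and the helper name `nrmse` are arbitrary choices made for illustration.

```r
library(pcaMethods)

# Complete example matrix shipped with pcaMethods (no missing values).
data(metaboliteDataComplete)
complete <- as.matrix(metaboliteDataComplete)

# Hide a random 5% of the entries so the true values are known.
# The seed and the fraction are illustrative assumptions.
set.seed(1)
holdout <- sample(length(complete), size = round(0.05 * length(complete)))
masked <- complete
masked[holdout] <- NA

# Hypothetical helper: impute with one method and return the RMSE on the
# held-out entries, normalised by the standard deviation of the true values.
nrmse <- function(method, nPcs) {
  fit <- pca(masked, method = method, nPcs = nPcs,
             center = TRUE, scale = "none")
  imputed <- completeObs(fit)  # observed values plus estimates for the NAs
  sqrt(mean((imputed[holdout] - complete[holdout])^2)) / sd(complete[holdout])
}

# Same nPcs for both methods, so only the algorithm differs.
scores <- sapply(c(ppca = "ppca", bpca = "bpca"), nrmse, nPcs = 3)
print(scores)
```

A lower score means a closer reconstruction of the hidden values. Which method comes out ahead can change with the seed, the fraction of hidden entries, and the number of components, so it is probably worth repeating the run over several masks before drawing a conclusion.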