rust-ml / linfa

A Rust machine learning framework.
Apache License 2.0

LDA and PLDA #148

Open xd009642 opened 3 years ago

xd009642 commented 3 years ago

We have linear regression, but from what I can see we don't have linear discriminant analysis (LDA), which is the equivalent algorithm for classification. We even have the iris dataset, which was created to demonstrate LDA. There is also a popular extension of LDA, probabilistic LDA (PLDA), which I've attached a TDS link for.
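For reference, a minimal sketch of what two-class Fisher LDA computes, written in plain Rust with no dependencies. The `Lda2` type and its `fit`/`predict` methods are hypothetical names chosen for illustration and are not part of linfa. The sketch assumes 2-D inputs, equal class priors, and a shared within-class covariance, so the decision threshold is simply the midpoint of the projected class means.

```rust
/// Two-class Fisher LDA for 2-D points. Illustrative only, not linfa's API.
#[derive(Debug, Clone, Copy)]
pub struct Lda2 {
    /// Projection direction w = S_w^{-1} (m1 - m0).
    pub w: [f64; 2],
    /// Threshold on the projected value: midpoint of the projected class means.
    pub threshold: f64,
}

/// Component-wise mean of a non-empty set of 2-D points.
fn mean(points: &[[f64; 2]]) -> [f64; 2] {
    let n = points.len() as f64;
    let mut m = [0.0; 2];
    for p in points {
        m[0] += p[0];
        m[1] += p[1];
    }
    [m[0] / n, m[1] / n]
}

/// Adds the scatter of `points` around their mean `m` to `s`.
fn add_scatter(s: &mut [[f64; 2]; 2], points: &[[f64; 2]], m: [f64; 2]) {
    for p in points {
        let d = [p[0] - m[0], p[1] - m[1]];
        for i in 0..2 {
            for j in 0..2 {
                s[i][j] += d[i] * d[j];
            }
        }
    }
}

impl Lda2 {
    /// Returns `None` if a class is empty or the within-class scatter is singular.
    pub fn fit(class0: &[[f64; 2]], class1: &[[f64; 2]]) -> Option<Self> {
        if class0.is_empty() || class1.is_empty() {
            return None;
        }
        let m0 = mean(class0);
        let m1 = mean(class1);

        // Pooled within-class scatter matrix S_w.
        let mut s = [[0.0; 2]; 2];
        add_scatter(&mut s, class0, m0);
        add_scatter(&mut s, class1, m1);

        let det = s[0][0] * s[1][1] - s[0][1] * s[1][0];
        if det.abs() < 1e-12 {
            return None;
        }

        // w = S_w^{-1} (m1 - m0), using the closed-form 2x2 inverse.
        let d = [m1[0] - m0[0], m1[1] - m0[1]];
        let w = [
            (s[1][1] * d[0] - s[0][1] * d[1]) / det,
            (-s[1][0] * d[0] + s[0][0] * d[1]) / det,
        ];

        let proj = |p: [f64; 2]| w[0] * p[0] + w[1] * p[1];
        let threshold = 0.5 * (proj(m0) + proj(m1));
        Some(Lda2 { w, threshold })
    }

    /// Predicts class 1 if the projection exceeds the threshold, else class 0.
    pub fn predict(&self, p: [f64; 2]) -> usize {
        let score = self.w[0] * p[0] + self.w[1] * p[1];
        if score > self.threshold { 1 } else { 0 }
    }
}
```

A real implementation would need to handle more than two classes, arbitrary dimensions, and class priors, which is where the eigenproblem formulation becomes relevant.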

xd009642 commented 3 years ago

PLDA has some complexities: it seems different people have interpreted the paper differently and made different assumptions about scaling and unit variance. In the speech community, Kaldi's implementation is standard, and it performs differently in benchmarks compared to scikit-learn's interpretation. Modelling these differences and letting users configure which PLDA implementation they want would be useful for reproducing papers in the speech community, which is one of our use cases for the algorithm.

bytesnake commented 3 years ago

In the context of generalized eigenvalue decomposition, the LDA algorithm is a variant of PCA whose eigenvectors are orthogonal in different coordinates. I'm currently planning to add an implementation that covers a spectrum of supervised/Fisher's approaches to discriminant analysis. You can find a reference here (https://arxiv.org/pdf/1910.05437.pdf). The biggest issue at the moment is extending LOBPCG to the general eigenproblem setting and getting that upstream into ndarray-linalg.
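To make the PCA/LDA relationship concrete, here is the standard textbook formulation. The symbols are assumptions of this note and not necessarily the notation of the linked reference: $S_T$ is the total scatter matrix, $S_B$ the between-class scatter, and $S_W$ the within-class scatter, with $S_W$ assumed positive definite.

```latex
\begin{align*}
  \text{PCA:} \quad & S_T \, w = \lambda \, w,
      & w_i^\top w_j &= \delta_{ij} \\
  \text{Fisher LDA:} \quad & S_B \, w = \lambda \, S_W \, w,
      & w_i^\top S_W \, w_j &= \delta_{ij}
\end{align*}
```

PCA is the special case where the right-hand matrix is the identity. In LDA the eigenvectors are orthogonal with respect to $S_W$ instead of the Euclidean inner product, which is why a solver for the generalized problem $A w = \lambda B w$ is needed.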

bytesnake commented 3 years ago

Can we move this issue to #22 and add general probabilistic PCA as well?

xd009642 commented 3 years ago

If you think PLDA is relevant there as well, sure. I'm not too familiar with probabilistic PCA, so :shrug:. There's also a short tutorial paper on the popular implementation in speech: https://arxiv.org/abs/1804.00403