rust-ml / linfa

A Rust machine learning framework.
Apache License 2.0

Improvements for Principal Component Analysis #22

Open bytesnake opened 4 years ago

bytesnake commented 4 years ago

A plain Principal Component Analysis (PCA) algorithm was added in https://github.com/rust-ml/linfa/commit/7b6075e2dc9cc1c56ad7cd956bf996d69ce51d20. The next steps should handle edge cases and add features.
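For readers new to the thread, here is a rough, dependency-free sketch of what "plain" PCA computes: center the data, build the sample covariance matrix, and extract the leading principal direction with power iteration. This is not the code from the commit above and does not use linfa's API; the function names (`column_means`, `covariance`, `first_component`) are hypothetical, and power iteration is swapped in for illustration in place of a full decomposition.

```rust
/// Column means of `data` (rows = samples, columns = features).
/// Assumes `data` is non-empty and all rows have the same length.
fn column_means(data: &[Vec<f64>]) -> Vec<f64> {
    let n = data.len() as f64;
    let mut means = vec![0.0; data[0].len()];
    for row in data {
        for (m, x) in means.iter_mut().zip(row) {
            *m += *x / n;
        }
    }
    means
}

/// Sample covariance matrix (d x d) of the centered data.
/// Assumes at least two samples, since it divides by n - 1.
fn covariance(data: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let means = column_means(data);
    let d = means.len();
    let denom = (data.len() - 1) as f64;
    let mut cov = vec![vec![0.0; d]; d];
    for row in data {
        for i in 0..d {
            for j in 0..d {
                cov[i][j] += (row[i] - means[i]) * (row[j] - means[j]) / denom;
            }
        }
    }
    cov
}

/// Approximates the first principal component (unit length) by
/// power iteration on the covariance matrix.
fn first_component(data: &[Vec<f64>], iterations: usize) -> Vec<f64> {
    let cov = covariance(data);
    let d = cov.len();
    // Start from a uniform unit vector. If it happens to be orthogonal to
    // the leading eigenvector, the iteration will not find that direction.
    let mut v = vec![1.0 / (d as f64).sqrt(); d];
    for _ in 0..iterations {
        // w = cov * v
        let w: Vec<f64> = cov
            .iter()
            .map(|row| row.iter().zip(&v).map(|(a, b)| *a * *b).sum::<f64>())
            .collect();
        let norm = w.iter().map(|x| x * x).sum::<f64>().sqrt();
        if norm == 0.0 {
            // Zero-variance data: there is no direction to recover.
            break;
        }
        v = w.into_iter().map(|x| x / norm).collect();
    }
    v
}
```

Even this small sketch surfaces the kind of edge cases the issue is about: a single sample, zero-variance data, and an unlucky starting vector all need explicit handling in a real implementation.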

bytesnake commented 3 years ago

Sparse PCA depends on #46.

sjaustirni commented 3 years ago

It seems that some Sparse/Robust PCA tests use the Yale Face dataset. It might be too big for linfa-datasets (it is a lot of image data), but I think it is too nice a "real-world" example to let slip away. Maybe we could have a separate repository for this test?

I am not sure about licensing, though: the page I linked above does not mention it, and the link it gives to the original dataset seems broken.

EDIT: Another page about the original dataset says:

NOTE: You are free to use the Yale Face Database B for research purposes. If experimental results are obtained that use images from within the database, all publications of these results should acknowledge the use of the "Yale Face Database B" and reference this paper. Without permission from Yale University, images from within the database cannot be incorporated into a larger database which is then publicly distributed.

bytesnake commented 3 years ago

Perhaps you can take a look at the mnist crate and implement something similar for the face dataset? There is also an open pull request there to replace the downloader: https://github.com/davidMcneil/mnist/pull/8