lindahua closed this issue 10 years ago
Need to read this again, but all of it sounds reasonable on a quick read.
I tagged the current version as v0.1.0, and the new version developed under the new plan (which will probably include breaking changes) will be tagged v0.2.0.
I would say this needs additional features, such as:
Varimax rotation
Most of these have been implemented in MultivariateStats.jl
This package is an important foundation for our efforts towards a machine learning ecosystem. I plan to spend some time working on this package in the near future.
Here is a tentative plan:
[ ] Remove dependence on DataFrames
In particular, this package should provide core dimensionality reduction algorithms that operate on ordinary arrays. The dependence on data frames should be removed.
[ ] Consistent interface
To be consistent with other machine learning packages, this package should use a column-wise data set format (i.e. each column is an observation).
[ ] Improve PCA
Implement multiple PCA algorithms (e.g. based on covariance, SVD, transposed data, etc.), and a high-level pca function that selects an appropriate method based on the input data.
[ ] Independent Component Analysis
It seems that a FastICA algorithm has already been implemented, but tests still need to be added.
[ ] Classic MDS
It is already here, but it may need some testing.
[ ] Move NMF to a separate package
Nonnegative Matrix Factorization in itself is a big field that deserves its own package. I have created a package NMF.jl for this purpose, and implemented a more sophisticated framework there.
These are the basics. I believe we can release v0.1.0 when these are ready.
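To make the PCA item above more concrete, here is a minimal sketch of two of the mentioned approaches (covariance-based and SVD-based) plus a high-level function that picks between them. It assumes the one-observation-per-column layout described in the plan. The names pca_cov, pca_svd and pca_sketch, and the simple "d <= n" selection rule, are hypothetical and only for illustration; they are not this package's API.

```julia
using LinearAlgebra, Statistics

# X is a d-by-n matrix: each column is one observation.

# Hypothetical helper: PCA via eigendecomposition of the d-by-d covariance matrix.
function pca_cov(X::AbstractMatrix{<:Real}, k::Integer)
    mu = vec(mean(X, dims=2))               # per-dimension mean
    Z = X .- mu                             # center every column
    C = (Z * Z') / (size(X, 2) - 1)         # sample covariance
    F = eigen(Symmetric(C))
    idx = sortperm(F.values, rev=true)[1:k] # indices of the k largest eigenvalues
    return mu, F.vectors[:, idx], F.values[idx]
end

# Hypothetical helper: PCA via a thin SVD of the centered data.
function pca_svd(X::AbstractMatrix{<:Real}, k::Integer)
    mu = vec(mean(X, dims=2))
    Z = X .- mu
    F = svd(Z)                              # singular values come sorted, largest first
    vars = F.S[1:k] .^ 2 ./ (size(X, 2) - 1)
    return mu, F.U[:, 1:k], vars
end

# Hypothetical high-level entry point: choose a method from the input shape.
# Assumed rule: covariance when d <= n, SVD otherwise.
function pca_sketch(X::AbstractMatrix{<:Real}, k::Integer)
    d, n = size(X)
    return d <= n ? pca_cov(X, k) : pca_svd(X, k)
end
```

A possible usage, projecting 100 three-dimensional observations onto the top two principal directions:

```julia
X = randn(3, 100)
mu, P, vars = pca_sketch(X, 2)
Y = P' * (X .- mu)    # 2-by-100 matrix of projected observations
```

Both helpers return the mean, the projection matrix (one principal direction per column) and the principal variances, so the caller does not need to know which method was selected.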
cc: @johnmyleswhite