megaman: Manifold Learning for Millions of Points
http://mmp2.github.io/megaman/
BSD 2-Clause "Simplified" License

megaman is a scalable manifold learning package implemented in Python. Its front-end API is designed to be familiar to scikit-learn users, but it harnesses the C++ Fast Library for Approximate Nearest Neighbors (FLANN) and the Sparse Symmetric Positive Definite (SSPD) solver Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method to scale manifold learning algorithms to large data sets. On a personal computer, megaman can embed 1 million data points with hundreds of dimensions in 10 minutes. megaman is designed for researchers and, as such, caches intermediate steps and indices to allow fast re-computation with new parameters.
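
The caching idea can be illustrated with a minimal, pure-Python sketch. It is not megaman's API: the class `CachedGeometry` and its methods are hypothetical names, a brute-force neighbor search stands in for FLANN, and the Gaussian kernel form is an assumption. The point is only that the expensive neighbor distances are computed once per radius and reused when a downstream parameter, here a kernel bandwidth, changes.

```python
import math


class CachedGeometry:
    """Hypothetical illustration of step caching; not megaman's actual API."""

    def __init__(self, points):
        self.points = [tuple(p) for p in points]
        self._distance_cache = {}        # radius -> {(i, j): distance}
        self.distance_computations = 0   # counts how often the slow step runs

    def neighbor_distances(self, radius):
        """Return pairwise distances within `radius`, computed at most once."""
        if radius not in self._distance_cache:
            self.distance_computations += 1
            found = {}
            n = len(self.points)
            # Brute-force search; a real implementation would use an index.
            for i in range(n):
                for j in range(i + 1, n):
                    d = math.dist(self.points[i], self.points[j])
                    if d <= radius:
                        found[(i, j)] = d
            self._distance_cache[radius] = found
        return self._distance_cache[radius]

    def affinities(self, radius, bandwidth):
        """Gaussian affinities built from the cached distances."""
        distances = self.neighbor_distances(radius)
        return {
            pair: math.exp(-(d / bandwidth) ** 2)
            for pair, d in distances.items()
        }


geometry = CachedGeometry([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)])
coarse = geometry.affinities(radius=2.0, bandwidth=1.0)
fine = geometry.affinities(radius=2.0, bandwidth=0.5)  # reuses cached distances
```

Changing only `bandwidth` leaves `distance_computations` at 1, whereas a new `radius` would trigger a fresh neighbor search.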

Package documentation can be found at http://mmp2.github.io/megaman/

If you use our software please cite the following JMLR paper:

McQueen, Meila, VanderPlas, & Zhang, "Megaman: Scalable Manifold Learning in Python", Journal of Machine Learning Research, Vol 17 no. 14, 2016. http://jmlr.org/papers/v17/16-109.html

You can also find our arXiv paper at http://arxiv.org/abs/1603.02763

Examples

Installation and Examples in Google Colab

Below is a tutorial for installing megaman on Google Colab through a Conda environment.

It also includes a tutorial on using megaman to build a spectral embedding of a uniform swiss roll dataset.
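
For orientation, a swiss roll is a two-dimensional sheet rolled up in three dimensions. The sketch below generates one with the standard library only. The function name `make_swiss_roll_points` is hypothetical, and the parametrization is a common textbook one, which may differ from the sampling the tutorial uses.

```python
import math
import random


def make_swiss_roll_points(n_points, height=21.0, seed=0):
    """Sample points on a swiss roll: a 2-D sheet rolled up in 3-D.

    Returns (points, angles); `angles` is the roll coordinate that a good
    2-D embedding should recover as one of its axes.
    """
    rng = random.Random(seed)
    points, angles = [], []
    for _ in range(n_points):
        t = 1.5 * math.pi * (1.0 + 2.0 * rng.random())  # angle along the roll
        y = height * rng.random()                        # position across the sheet
        points.append((t * math.cos(t), y, t * math.sin(t)))
        angles.append(t)
    return points, angles


points, angles = make_swiss_roll_points(1000)
```

Sampling `t` uniformly, as here, places points more densely near the center of the roll; a sample that is uniform on the sheet itself would need to correct for arc length.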

Installation with Conda

Due to an API change, installing with $ conda install -c conda-forge megaman is no longer supported. We are currently working on a fix.

Please see the full install instructions below to build megaman from source.

Installation from source

Installing megaman from source requires the following:

Optional requirements include

These requirements can be installed on Linux and MacOSX using the following conda command:

$ conda create -n manifold_env python=3.5 -y
# can also use python=2.7 or python=3.6

$ source activate manifold_env
$ conda install --channel=conda-forge -y pip nose coverage cython numpy scipy \
                                         scikit-learn pyflann pyamg h5py plotly

Clone this repository and cd into the source directory:

$ cd /tmp/
$ git clone https://github.com/mmp2/megaman.git
$ cd megaman

Finally, within the source repository, run this command to install the megaman package itself:

$ python setup.py install
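
After installation, a quick check that the package and its dependencies are importable can save debugging time. The sketch below uses only the standard library. The helper name `missing_modules` is hypothetical, and the import names are assumed from the conda packages listed earlier (for example, scikit-learn is imported as `sklearn`).

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for some of the conda packages above, plus megaman.
expected = ["numpy", "scipy", "sklearn", "pyflann", "megaman"]
missing = missing_modules(expected)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("megaman and the listed dependencies are importable.")
```

Run it from outside the cloned repository; otherwise the local `megaman` directory is found whether or not the installation succeeded.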

Unit Tests

megaman uses nose for unit tests. With nose installed, type

$ make test

to run the unit tests. megaman is tested on Python versions 2.7, 3.4, and 3.5.

Authors

Other Contributors

Future Work

See the "Future Work" issues list for what we have planned for upcoming releases.