
Stochastic Neighbor and Crowd Kernel (SNaCK) embedding

Quick and dirty visualization of large-scale datasets via concept embeddings

This code (and the companion paper) showcases our work on “SNaCK,” a low-dimensional concept embedding algorithm that combines human expertise with automatic machine similarity kernels. The two parts are complementary: human insight can capture relationships that are not apparent from the objects' visual similarity, and the machine can relieve the human from having to exhaustively specify many constraints.

As input, our SNaCK algorithm takes two sources:

SNaCK then generates an embedding that satisfies both classes of constraints.
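To make "satisfies both classes of constraints" a little more concrete, here is a minimal sketch of how one might measure how well an embedding honors human-provided constraints. It assumes, purely for illustration, that each constraint is a triplet `(a, b, c)` meaning "point `a` should be closer to point `b` than to point `c`", as in crowd-kernel-style methods. The function names are hypothetical and are not part of SNaCK's API. See the Examples notebook for the real interface.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length coordinate tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def triplet_satisfaction(embedding, triplets):
    """Fraction of triplets (a, b, c) for which a is closer to b than to c.

    `embedding` is a sequence of coordinate tuples indexed by point id.
    The triplet format is an assumption made for this sketch.
    """
    if not triplets:
        return 1.0  # no constraints, so none are violated
    satisfied = sum(
        1
        for a, b, c in triplets
        if squared_distance(embedding[a], embedding[b])
        < squared_distance(embedding[a], embedding[c])
    )
    return satisfied / float(len(triplets))


# Three points on a line; the second triplet is violated
# because point 2 is nearer to point 1 than to point 0.
example_embedding = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
example_triplets = [(0, 1, 2), (2, 0, 1)]
score = triplet_satisfaction(example_embedding, example_triplets)
```

A score near 1.0 would suggest the embedding agrees with most of the human judgments. It says nothing about how well the machine similarity kernel is respected.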

Usage

See http://nbviewer.ipython.org/github/cornelltech/snack/blob/master/Examples.ipynb for documentation on SNaCK's parameters and example usage.

Launch a notebook with SNaCK and Food-10k

Binder

Click the above button to launch an IPython Notebook with SNaCK. This environment has a copy of SNaCK and comes pre-loaded with a copy of the Food-10k dataset for you to explore.

This service is made possible by the folks at Binder.

Installation

The following platforms are supported:

Linux and Mac OS X: Install from Conda (Preferred)

Please use Conda. Your life will be easier.

Just run:

$ conda install -c https://conda.anaconda.org/gcr snack

If you insist on compiling from source, read on:

Linux: Install from source with Pip

Just run:

$ pip install snack

Before running this command, install Python 2.7, NumPy, and Cython. You also need a working compiler, CBLAS, and the Python development headers, which are installable from your distribution's package manager.
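If you are unsure whether the Python-level prerequisites are already present, a small check like the following can help. It is only a sketch: the helper name `missing_modules` is made up for this example, and it only confirms that NumPy and Cython can be imported, not that the compiler, CBLAS, or development headers are installed.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Cython's importable module name is capitalized.
    absent = missing_modules(["numpy", "Cython"])
    if absent:
        print("Install these before building SNaCK: " + ", ".join(absent))
    else:
        print("NumPy and Cython are importable.")
```

Run it with the same interpreter (and virtualenv, if any) that you plan to use for `pip install snack`.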

To install SNaCK and its dependencies on a clean Ubuntu Trusty x64 system, run:

$ sudo aptitude install \
  build-essential       \
  python-dev            \
  libblas3              \
  libblas-dev           \
  python-virtualenv
$ virtualenv venv; source venv/bin/activate
$ pip install numpy
$ pip install cython
$ pip install snack

OS X: Install from source with Pip and Homebrew

SNaCK uses OpenMP. This makes compilation tricky on Mac OS X.

If you are on Mac OS X, you must install the real "not-clang" version of gcc because it has OpenMP support. At the time of writing, clang does not support OpenMP, and Apple unhelpfully ships clang as /usr/bin/gcc. That compiler will not work.

Using Apple-provided GCC is NOT supported. If gcc-5 --version contains the string clang anywhere in its output, you do not have the correct version of gcc.
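The check described above can be automated with a short script. This is a sketch rather than part of SNaCK: the function names are hypothetical, and the match is deliberately case-insensitive, which is slightly stricter than the rule as stated.

```python
import subprocess


def looks_like_clang(version_text):
    """Return True if compiler --version output mentions clang."""
    # Case-insensitive on purpose; stricter than a literal "clang" match.
    return "clang" in version_text.lower()


def compiler_version_text(command):
    """Run `<command> --version` and return its combined output as text."""
    raw = subprocess.check_output(
        [command, "--version"], stderr=subprocess.STDOUT
    )
    return raw.decode("utf-8", "replace")


if __name__ == "__main__":
    command = "gcc-5"  # the compiler name this README refers to
    try:
        text = compiler_version_text(command)
    except (OSError, subprocess.CalledProcessError) as error:
        print("Could not run %s: %s" % (command, error))
    else:
        if looks_like_clang(text):
            print("%s is really clang; this is not supported." % command)
        else:
            print("%s does not report clang." % command)
```

If your Homebrew gcc has a different name, change `command` accordingly.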

Using Apple-provided Python is NOT supported.

The recommended installation method on OS X is with Homebrew. The following has been tested on a clean Yosemite installation.

$ brew install gcc
$ brew install python
$ virtualenv venv; source venv/bin/activate
$ pip install numpy
$ pip install cython
$ pip install snack

You may need to edit setup.py and change GCC_VERSION to point to the correct version if you are not using /usr/local/bin/gcc-5.