biocore / qiime

Official QIIME 1 software repository. QIIME 2 (https://qiime2.org) has succeeded QIIME 1 as of January 2018.
GNU General Public License v2.0

new feature - manifold learning for dimensionality reduction #1530

Open jahschwa opened 10 years ago

jahschwa commented 10 years ago

QIIME currently implements only Principal Coordinate Analysis (PCoA) for dimensionality reduction and subsequent viewing with Emperor. While PCoA is a tried-and-true method within the metagenomics community, it is inherently linear and thus misses nonlinear relationships. This can be a significant shortcoming, as bacteria tend to have nonlinear relationships, as explored in [1].
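To make the "inherently linear" point concrete, here is a minimal NumPy sketch of classical PCoA. It is not QIIME's implementation; the function name `pcoa` and the toy distance matrix are made up for illustration. The coordinates come from a single eigendecomposition of the double-centered squared distances, so the embedding can only reproduce straight-line (Euclidean) structure in the input distances.

```python
import numpy as np

def pcoa(dist, n_components=2):
    """Classical PCoA (metric MDS) on a square, symmetric distance matrix.

    Illustrative helper only; not the QIIME implementation.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances (Gower centering).
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order, so pick the largest ones.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each axis by sqrt(eigenvalue); clip tiny negatives from round-off.
    return eigvecs * np.sqrt(np.clip(eigvals, 0, None))

# Toy input: four points on a line, so one axis recovers the distances exactly.
points = np.array([[0.0], [1.0], [2.0], [4.0]])
toy_dist = np.abs(points - points.T)
coords = pcoa(toy_dist, n_components=1)
```

If the samples instead lie along a curved path, the same procedure still measures straight-line distances across the curve, which is the limitation that manifold methods try to address.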

Many nonlinear methods for dimensionality reduction have been developed, a subset of which are known as manifold learning techniques (for example, Isomap, which was used in [1]). Several of these unsupervised techniques have already been implemented in the scikit-learn Python package [2].

Adding them to QIIME would require either another dependency or inclusion of the relevant files from the scikit-learn GitHub repository [3], as well as some simple Python scripts that I have already written as part of school work (these currently assume scikit-learn is already installed).
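As a rough idea of what such a script might look like (this is a hypothetical sketch, not one of the scripts mentioned above), the following runs scikit-learn's `Isomap` on a precomputed distance matrix. It assumes a scikit-learn version recent enough that `Isomap` accepts `metric="precomputed"`. The synthetic spiral and the variable names are stand-ins; in practice the input would be a beta-diversity distance matrix.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Stand-in for real samples: 40 points along a slightly noisy 3-D spiral.
t = np.linspace(0.5, 3 * np.pi, 40)
samples = np.column_stack(
    [t * np.cos(t), t * np.sin(t), rng.normal(scale=0.1, size=t.size)]
)

# Pairwise Euclidean distances stand in for a beta-diversity matrix.
diff = samples[:, None, :] - samples[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Isomap builds a k-nearest-neighbor graph from the distances, replaces them
# with shortest-path (geodesic) distances, then embeds those in 2 dimensions.
embedding = Isomap(n_neighbors=5, n_components=2, metric="precomputed")
isomap_coords = embedding.fit_transform(dist)
```

The resulting `isomap_coords` array has one row per sample and one column per axis, the same shape as a table of PCoA coordinates. The choice of `n_neighbors` matters: too small and the neighbor graph can fall apart into disconnected pieces, too large and the method behaves much like a linear embedding.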

[1] http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=6392684
[2] http://scikit-learn.org/stable/modules/manifold.html
[3] https://github.com/scikit-learn/scikit-learn/tree/master/sklearn

rob-knight commented 10 years ago

I think this is a good idea in principle. We are adding scikit-learn as a dependency anyway. Our experiments with Isomap and LLE a few years ago were not encouraging, but datasets are now much larger and it is worth revisiting. Thanks!

Rob

(Sent by email in reply to Joshua Haas's post of May 4, 2014, 9:29 PM.)