scikit-learn-contrib / sklearn-ann

Integration with (approximate) nearest neighbors libraries for scikit-learn + clustering based on kNN-graphs.
https://sklearn-ann.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

.. -*- mode: rst -*-

|PyPI|_ |ReadTheDocs|_

.. |PyPI| image:: https://img.shields.io/pypi/v/sklearn-ann
.. _PyPI: https://pypi.org/project/sklearn-ann/

.. |ReadTheDocs| image:: https://readthedocs.org/projects/sklearn-ann/badge/?version=latest
.. _ReadTheDocs: https://sklearn-ann.readthedocs.io/en/latest/?badge=latest

sklearn-ann
===========

.. inclusion-marker-do-not-remove

sklearn-ann eases integration of approximate nearest neighbours libraries such as annoy, nmslib and faiss into your sklearn pipelines. It consists of:

Installation
------------

To install the latest release from PyPI, run:

.. code-block:: bash

    pip install sklearn-ann

To install the latest development version from GitHub, run:

.. code-block:: bash

    pip install git+https://github.com/scikit-learn-contrib/sklearn-ann.git#egg=sklearn-ann

Why? When do I want this?
-------------------------

The main scenario in which this is needed is clustering or manifold learning on high dimensional data. The reason is that the only neighbourhood algorithms currently built into scikit-learn are essentially the standard tree approaches to space partitioning: the ball tree and the K-D tree. These do not perform competitively in high dimensional spaces.
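One commonly cited reason tree-based indexes struggle here is distance concentration: as the number of dimensions grows, the nearest and farthest points from a query end up at almost the same distance, so a space-partitioning tree can rule out very few branches. The following is a minimal sketch of that effect using only the Python standard library. It is not part of sklearn-ann; the function name ``nearest_to_farthest_ratio`` and the uniformly random data are assumptions made purely for illustration.

.. code-block:: python

    import math
    import random


    def nearest_to_farthest_ratio(n_dims, n_points=200, seed=0):
        """Ratio of the nearest to the farthest Euclidean distance from a query.

        Hypothetical helper for illustration: points are drawn uniformly
        from the unit hypercube, which is an assumption, not real data.
        """
        rng = random.Random(seed)
        query = [rng.random() for _ in range(n_dims)]
        distances = []
        for _ in range(n_points):
            point = [rng.random() for _ in range(n_dims)]
            distances.append(math.dist(query, point))
        return min(distances) / max(distances)


    # Compare a low, a moderate and a high dimensional setting.
    for n_dims in (2, 20, 500):
        ratio = nearest_to_farthest_ratio(n_dims)
        print(f"{n_dims:>3} dimensions: nearest/farthest = {ratio:.2f}")

The printed ratio should move toward 1 as the dimension grows. When nearly every point is about as far away as the nearest neighbour, exact tree search degrades toward a linear scan, which is where approximate methods become attractive.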

Development
-----------

This project is managed using Hatch_ and pre-commit_. To get started, run ``pre-commit install`` and ``hatch env create``. Run all commands using ``hatch run python <command>``, which ensures the environment is kept up to date. pre-commit_ comes into play on every ``git commit`` after installation.

Consult ``pyproject.toml`` for which dependency groups and extras exist, and the Hatch help or user guide for more information on what they are.

.. _Hatch: https://hatch.pypa.io/
.. _pre-commit: https://pre-commit.com/