hlin117 / mdlp-discretization

An implementation of the minimum description length principle expert binning algorithm by Usama Fayyad
BSD 3-Clause "New" or "Revised" License

Minimum Description Length Binning

This is an implementation of Usama Fayyad's entropy-based expert binning method.

Please read the original paper here for more information.
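To give a feel for what "entropy-based" means here, the following is a minimal sketch of the single-split decision at the heart of the method, as it is usually described: pick the cut that maximizes information gain, then accept it only if the gain exceeds an MDL-derived threshold. This is an illustration only, not this package's Cython implementation. The helper names `entropy` and `best_mdlp_cut` are hypothetical, the sketch handles one feature and one split (the full method recurses on each side of an accepted cut), and for simplicity it evaluates every midpoint between distinct values rather than only class-boundary points.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_mdlp_cut(values, labels):
    """Return the best cut point for one feature, or None if the MDL test rejects it.

    Hypothetical helper for illustration; not part of the mdlp package.
    """
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    ys = [c for _, c in pairs]
    n = len(ys)
    if n < 2:
        return None
    ent_s = entropy(ys)
    k = len(set(ys))  # number of classes in the whole interval

    best = None  # (gain, cut, left_labels, right_labels)
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue  # cannot cut between identical values
        left, right = ys[:i], ys[i:]
        gain = ent_s - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if best is None or gain > best[0]:
            best = (gain, (xs[i - 1] + xs[i]) / 2, left, right)
    if best is None:
        return None

    gain, cut, left, right = best
    k1, k2 = len(set(left)), len(set(right))
    # MDL acceptance test: the gain must outweigh the cost of encoding the split.
    delta = math.log2(3 ** k - 2) - (k * ent_s - k1 * entropy(left) - k2 * entropy(right))
    threshold = (math.log2(n - 1) + delta) / n
    return cut if gain > threshold else None


# Two well-separated classes: the cut between 4 and 10 is accepted.
print(best_mdlp_cut([1, 2, 3, 4, 10, 11, 12, 13], [0, 0, 0, 0, 1, 1, 1, 1]))  # 7.0
# Alternating labels: no cut carries enough information, so none is accepted.
print(best_mdlp_cut([1, 2, 3, 4], [0, 1, 0, 1]))  # None
```

The acceptance test is what lets the method choose the number of bins on its own: splitting stops as soon as no remaining cut pays for its own description cost.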

Installation and Usage

Install using pip

pip install git+https://github.com/hlin117/mdlp-discretization

As with any Python package, it is recommended to create a virtual environment when using this project.

Example

>>> from mdlp.discretization import MDLP
>>> from sklearn.datasets import load_iris
>>> transformer = MDLP()
>>> iris = load_iris()
>>> X, y = iris.data, iris.target
>>> X_disc = transformer.fit_transform(X, y)

Tests

To run the unit tests, clone the repo and install it in development mode:

git clone https://github.com/hlin117/mdlp-discretization
cd mdlp-discretization
pip install -e .

Then run the tests with py.test:

py.test tests

Development

To submit changes to this project, make sure that you have Cython installed. After running the installation locally, submit the compiled *.cpp file along with your changes to the Python code.