cavalab / brush

An interpretable machine learning library
http://cavalab.org/brush/
GNU General Public License v3.0

Brush

Brush is an interpretable machine learning library for training symbolic models. It wraps multiple learning paradigms (gradient descent, decision trees, symbolic regression) into a strongly-typed genetic programming language (Montana, 1995).

This project is under active development. Expect API changes and breakage.

For the user guide and API, see the docs.

Features / Design Goals

Contact

Brush is maintained by William La Cava (@lacava, william.lacava@childrens.harvard.edu) and was initially authored by him and Joseph D. Romano (@JDRomano2).

Special thanks to these contributors:

Acknowledgments

Brush is being developed to improve clinical diagnostics in the Cava Lab at Harvard Medical School. This work is partially funded by grant R00-LM012926 from the National Library of Medicine and a Patient-Centered Outcomes Research Institute (PCORI) Award (ME-2020C1D-19393).

License

GNU GPLv3, see LICENSE

Quickstart

Installation

Installation via Python wheel and pip (recommended)

Important: This method is currently supported only for CPython v3.11 running on the Linux x86_64 platform. Other Python versions and operating systems will be supported in the near future.

To install a prebuilt version of pybrush, download the most recent release of the wheel file on the Releases page (e.g., pybrush-0.1.1-cp311-linux_x86_64.whl; you may need to expand "Assets" to see the file). Then, navigate to the directory containing the wheel file and install it using pip:

pip install pybrush-0.1.1-cp311-linux_x86_64.whl
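Before downloading the wheel, it can help to confirm that your interpreter matches the tags in the file name (cp311, linux_x86_64). The following is a minimal sketch using only the standard library; the helper names `wheel_is_compatible` and `current_interpreter_is_compatible` are hypothetical and not part of pybrush.

```python
import platform
import sys


def wheel_is_compatible(implementation, version, system, machine):
    """Return True if the given interpreter matches the cp311 / linux_x86_64 tags."""
    return (
        implementation == "CPython"
        and tuple(version[:2]) == (3, 11)
        and system == "Linux"
        and machine == "x86_64"
    )


def current_interpreter_is_compatible():
    """Check the running interpreter against the same requirements."""
    return wheel_is_compatible(
        platform.python_implementation(),
        sys.version_info,
        platform.system(),
        platform.machine(),
    )


if __name__ == "__main__":
    if current_interpreter_is_compatible():
        print("This interpreter matches the prebuilt wheel.")
    else:
        print("No matching wheel; use the manual installation instead.")
```

If the check fails, the manual installation described next is the fallback.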

Manual installation

Clone the repo:

git clone https://github.com/cavalab/brush.git

Install the brush environment:

cd brush
conda env create

Install brush from the repo root directory:

pip install .

If you only plan to develop, see Development.

Basic Usage

Brush is designed to be used similarly to any sklearn-style estimator. That means it should be compatible with sklearn pipelines, wrappers, and so forth.
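To make the "sklearn-style" contract concrete, here is a minimal sketch of the methods such an estimator exposes. `MeanRegressor` is a stand-in written for illustration only; it is not part of Brush, and the assumption is that the Brush estimators offer the same `fit`/`predict` surface shown in the examples that follow.

```python
class MeanRegressor:
    """Stand-in estimator (NOT part of Brush): always predicts the training mean."""

    def __init__(self, verbosity=0):
        self.verbosity = verbosity

    def get_params(self, deep=True):
        # sklearn tools read hyperparameters through get_params.
        return {"verbosity": self.verbosity}

    def set_params(self, **params):
        # ...and write them back through set_params.
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self  # returning self allows est.fit(X, y).predict(X)

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_and_predict(estimator, X, y):
    """Works with any object exposing fit and predict, whatever its internals."""
    return estimator.fit(X, y).predict(X)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
print(fit_and_predict(MeanRegressor(), X, y))  # [4.0, 4.0, 4.0, 4.0]
```

Code written against this interface, such as `fit_and_predict`, does not need to know which estimator it receives, which is what makes pipelines and wrappers possible.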

In addition, Brush provides functionality that allows you to feed in more complicated data types than just matrices of floating point values.

Regression

# load data
import pandas as pd

df = pd.read_csv('docs/examples/datasets/d_enc.csv')
X = df.drop(columns='label')
y = df['label']

# import and make a regressor
from pybrush import BrushRegressor

# you can set verbosity=1 to see the progress bar
est = BrushRegressor(verbosity=1)

# use like you would a sklearn regressor
est.fit(X,y)
y_pred = est.predict(X)

print('score:', est.score(X,y))
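For sklearn regressors, `score` returns the coefficient of determination (R²). Assuming `BrushRegressor` follows that convention, which this README does not state explicitly, the value can be reproduced by hand as in the sketch below. The function name `r2` and the sample numbers are illustrative only.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot  # undefined when y_true is constant


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
print("R^2:", r2(y_true, y_pred))
```

A score of 1.0 means perfect predictions, and 0.0 means the model does no better than predicting the mean of `y`.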

Classification

# load data
import pandas as pd

df = pd.read_csv('docs/examples/datasets/d_analcatdata_aids.csv')
X = df.drop(columns='target')
y = df['target']

# import and make a classifier
from pybrush import BrushClassifier
est = BrushClassifier(verbosity=1)

# use like you would a sklearn classifier
est.fit(X,y)

y_pred = est.predict(X)
y_pred_proba = est.predict_proba(X)

print('score:', est.score(X,y))
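In sklearn, `predict_proba` returns one row of class probabilities per sample, and a classifier's `score` is the mean accuracy. Assuming `BrushClassifier` follows both conventions (an assumption, not something this README confirms), the relationship between the two outputs can be sketched in plain Python. The helper names and the sample values are hypothetical.

```python
def labels_from_proba(proba_rows, classes):
    """For each row, pick the class with the highest probability."""
    return [
        classes[max(range(len(row)), key=row.__getitem__)]
        for row in proba_rows
    ]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up probabilities for four samples and two classes (0 and 1).
proba = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
y_true = [0, 1, 1, 1]

y_pred = labels_from_proba(proba, classes=[0, 1])
print(y_pred, accuracy(y_true, y_pred))  # [0, 1, 0, 1] 0.75
```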

Contributing

Please follow the GitHub flow guidelines for contributing to this project.

In general, this is the approach:

Development

python setup.py develop

This gives you an editable install for working on the Python code in the project. Any underlying C++ changes require this command to be re-run.

Package Structure

There are a few different moving parts that can be built in this project:

Pip will install the brush module and call CMake to build the _brush extension.
It will not build the docs or cpp tests.

Tests

Python

The tests are run by calling pytest from the root directory.

pytest 

Cpp

If you are developing the cpp code and want to build the cpp tests, run the following:

./configure
./install tests

Building the docs locally

To build the documentation you will need some additional requirements. Before proceeding, make sure you have the Python wrapper installed, as the documentation has some sample notebooks that run the code.

First go to the docs folder:

cd docs/

Then, install the additional Python packages in the same environment in which brush is installed:

conda activate brush
pip install -r requirements.txt

Now just run:

make html

The static website is located in _build/html