cavalab / brush

An interpretable machine learning library
http://cavalab.org/brush/
GNU General Public License v3.0

(py)Brush v1.0: islands! #55

Closed gAldeia closed 3 months ago

gAldeia commented 4 months ago


What's new

Essentially, this PR implements all the evolutionary steps in C++, conveniently wrapped in a scikit-learn estimator. The C++ implementation uses Taskflow to run several islands in different threads.
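To make the island idea concrete, here is a minimal pure-Python sketch of an island model running on a thread pool. It is not the Brush C++/Taskflow code: the toy objective, the function names (`evolve_island`, `migrate`, `run`), the ring migration scheme, and all default values are assumptions chosen for illustration only.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(x):
    # Toy objective (assumption): squared distance to 3.0, lower is better
    return (x - 3.0) ** 2


def evolve_island(population, rng, steps=20):
    """Evolve one island: mutate each individual, keep the better of parent and child."""
    pop = list(population)
    for _ in range(steps):
        children = [x + rng.gauss(0.0, 0.5) for x in pop]
        pop = [min(p, c, key=fitness) for p, c in zip(pop, children)]
    return pop


def migrate(islands):
    """Ring migration: the best of island i-1 replaces the worst of island i."""
    bests = [min(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        worst = max(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = bests[(i - 1) % len(islands)]
    return islands


def run(n_islands=4, pop_size=10, generations=5, seed=0):
    # One random generator per island keeps results independent of thread scheduling
    rngs = [random.Random(seed + i) for i in range(n_islands)]
    islands = [[rng.uniform(-10, 10) for _ in range(pop_size)] for rng in rngs]
    with ThreadPoolExecutor(max_workers=n_islands) as pool:
        for _ in range(generations):
            # Each island evolves in its own worker thread, then islands exchange individuals
            islands = list(pool.map(evolve_island, islands, rngs))
            islands = migrate(islands)
    return min((x for pop in islands for x in pop), key=fitness)
```

Calling `run()` returns the best individual found across all islands. Migration happens only between batches of parallel work, so the islands never share mutable state while they are evolving.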

There are several new classes, and I have restructured the source code.

I named the C++ implementation Brush, while the Python library is called Pybrush.

How it is designed

The main entry point is what I call the Brush Engine, which runs the whole evolutionary process. The Engine is configured through a struct called Parameters, which contains all hyper-parameters for the EA. The Dataset class handles every operation on the data (splitting train and test partitions, inferring data types, etc.). You can run the Engine with a pre-constructed Dataset instance, or you can call fit(X, y), and it will try to create the dataset for you using the configuration you specify in Parameters.
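The relationship between the three pieces can be sketched in plain Python as below. This is a toy stand-in, not the Pybrush API: the field names (`pop_size`, `max_gens`, `validation_size`, `seed`), the `run` and `predict` methods, the `best_` attribute, and the one-coefficient search problem are all hypothetical and exist only to show the two entry points described above.

```python
import random
from dataclasses import dataclass


@dataclass
class Parameters:
    # Hypothetical hyper-parameter names, for illustration only
    pop_size: int = 20
    max_gens: int = 30
    validation_size: float = 0.25
    seed: int = 0


class Dataset:
    """Owns the data and the train/validation split."""

    def __init__(self, X, y, validation_size=0.0):
        cut = len(X) - int(len(X) * validation_size)
        self.train = (X[:cut], y[:cut])
        self.validation = (X[cut:], y[cut:])


class Engine:
    """Toy search for w in y ~ w * x, configured only through Parameters."""

    def __init__(self, params):
        self.params = params
        self.best_ = None

    def run(self, dataset):
        rng = random.Random(self.params.seed)
        X, y = dataset.train

        def loss(w):
            return sum((w * xi - yi) ** 2 for xi, yi in zip(X, y))

        pop = [rng.uniform(-5, 5) for _ in range(self.params.pop_size)]
        for _ in range(self.params.max_gens):
            children = [w + rng.gauss(0, 0.3) for w in pop]
            # Keep the best pop_size candidates among parents and children
            pop = sorted(pop + children, key=loss)[: self.params.pop_size]
        self.best_ = pop[0]
        return self

    def fit(self, X, y):
        # Convenience path: build the Dataset from Parameters, then run
        return self.run(Dataset(X, y, self.params.validation_size))

    def predict(self, X):
        return [self.best_ * xi for xi in X]
```

In this sketch, `Engine(Parameters()).fit(X, y)` and `Engine(Parameters()).run(Dataset(X, y, 0.25))` take the same path once the Dataset exists.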

Prototyping with Brush

There is still compatibility with DEAP to prototype different evolutionary algorithms. In fact, this compatibility is extended by creating bindings for all classes from Brush: Evaluator, Selector, Population, Individual, etc. Brush implements hashes for the individual and fitness classes in C++, making it possible to use the DEAP toolbox. The old NSGA2 implemented using the DEAP API is still there, and in the future, I hope I can create a notebook explaining how to prototype with Brush.
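As a rough illustration of why hashing matters on the Python side: any routine that uses fitness objects as dictionary keys or puts them in sets needs them to be hashable. The sketch below uses a hypothetical pure-Python `Fitness` class and a hypothetical `group_by_fitness` helper. It does not use the real Brush bindings or DEAP itself, and only mimics the kind of grouping a toolbox routine might perform.

```python
from collections import defaultdict


class Fitness:
    """Hypothetical stand-in for a fitness object exposed from C++."""

    def __init__(self, values):
        self.values = tuple(values)

    def __eq__(self, other):
        return isinstance(other, Fitness) and self.values == other.values

    def __hash__(self):
        # A Python class that defines __eq__ without __hash__ becomes unhashable,
        # so the hash has to be provided explicitly
        return hash(self.values)


def group_by_fitness(fitnesses):
    """Map each distinct fitness to the indices of the individuals that share it."""
    groups = defaultdict(list)
    for idx, fit in enumerate(fitnesses):
        groups[fit].append(idx)
    return dict(groups)
```

Two individuals with equal fitness values land in the same group, which only works because equal objects produce equal hashes.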

Final remarks

While there is much work to do, I think it is already time to move all these significant changes to the master branch. We now have a fully functional evolutionary algorithm implementation in C++ with Taskflow to handle parallelism. There is also a convenient wrapper for this implementation, and we are still compatible with DEAP. There are many TODOs written all over the place, and I intend to work on these in the coming weeks.

gAldeia commented 4 months ago

PR

lacava commented 3 months ago

@gAldeia checking back on this merge

gAldeia commented 3 months ago

@lacava

I finished addressing your comments. There is documentation for many of the new things I implemented in this PR, mostly done with the help of GitHub Copilot (I was exploring its capabilities, and it turns out it can write documentation, sometimes). I also cleaned up a lot of TODOs that I had left in the code. Resolving these TODOs also makes it easier to implement the MABs, which I'll be working on now.

Some new things:

These additions were staged locally on my machine, and while I was cleaning up some TODOs, I decided to include them in the PR as well.