Closed gAldeia closed 5 months ago
@gAldeia checking back on this merge
@lacava
I finished working on your comments. Many of the new things I implemented in this PR are now documented, mostly with the help of GitHub Copilot (I was exploring its capabilities, and it turns out it can write documentation, sometimes). I also cleaned up a lot of TODOs that I had left in the code. Clearing them also makes it easier to implement the MABs, which I'll be working on now.
Some new things:
These additions were staged locally on my machine, and while I was cleaning up some TODOs, I decided to include them in the PR as well.
(py)Brush v1.0: islands!
What's new
Essentially, this PR implements all the evolutionary steps in C++, conveniently wrapped into a scikit-learn estimator. The C++ implementation uses Taskflow to run several islands on different threads.
There are several new classes, and I have restructured the source code.
I named the C++ implementation Brush, while the Python library is called Pybrush.
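For readers new to the island model, here is a minimal Python sketch of the general idea: several sub-populations evolve independently on worker threads and periodically exchange their best individuals. It is a toy under stated assumptions, not Brush's code. The objective, the ring migration policy, and every name in it (`evolve_island`, `migrate`, `run`) are made up for illustration, and it uses Python's standard thread pool rather than Taskflow.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(x):
    # Toy objective (an assumption for this sketch): optimum at x = 3.
    return -(x - 3.0) ** 2


def evolve_island(population, rng, generations=20):
    """Run a simple mutate-and-keep-best loop on one island."""
    pop = list(population)
    for _ in range(generations):
        children = [x + rng.gauss(0.0, 0.5) for x in pop]
        # Elitist survival: keep the best len(pop) of parents + children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:len(pop)]
    return pop


def migrate(islands):
    """Ring migration: each island's best replaces the next island's worst."""
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        incoming = bests[i - 1]  # best of the previous island in the ring
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = incoming
    return islands


def run(n_islands=4, pop_size=10, epochs=5, seed=0):
    # One random generator per island keeps the run reproducible,
    # because no generator is shared between threads.
    rngs = [random.Random(seed + i) for i in range(n_islands)]
    islands = [[r.uniform(-10, 10) for _ in range(pop_size)] for r in rngs]
    with ThreadPoolExecutor(max_workers=n_islands) as pool:
        for _ in range(epochs):
            # Islands evolve independently, one task per island.
            islands = list(pool.map(evolve_island, islands, rngs))
            # Synchronization point: exchange individuals between islands.
            islands = migrate(islands)
    return max((x for pop in islands for x in pop), key=fitness)


best = run()
```

The point of the structure is that islands only interact at the migration step, so the evolution step can be scheduled on separate threads without locking. In CPython the threads mostly show the layout rather than a real speed-up.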
How it is designed
The main entry point is what I call the Brush Engine, which will do all the work to get the job done. The Engine is configured through a struct called Parameters, which contains all hyper-parameters for the EA. The Dataset class handles every operation regarding the data (splitting train and test partitions, inferring data types, etc.). You can run the Engine using a pre-constructed Dataset instance, or you can conveniently call fit(X, y), and it will try to create the dataset for you using some configurations you may specify in the Parameters class.
Prototyping with Brush
There is still compatibility with DEAP to prototype different evolutionary algorithms. In fact, this compatibility is extended by creating bindings to all classes from Brush: Evaluator, Selector, Population, Individual, etc. Brush implements hashes for the individual and fitness classes in C++, making it possible to use the DEAP toolbox. The old NSGA2 implemented using the DEAP API is still there, and in the future, I hope I can create a notebook explaining how to prototype with Brush.
Final remarks
While there is much work to do, I think it is already time to move all these significant changes to the master branch. We now have a fully functional evolutionary algorithm implementation in C++ with Taskflow to handle parallelism. There is also a convenient wrapper for this implementation, and we are still compatible with DEAP. There are many TODOs written all over the place, and I intend to work on these in the coming weeks.
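To make the Engine / Parameters / Dataset layout from the design section easier to picture, here is a minimal pure-Python sketch of that shape. Everything in it is an assumption made for illustration: the field names (`pop_size`, `generations`, `validation_size`, `seed`), the `best_` attribute, the `predict` method, and the toy search over a single slope. It does not reproduce the real Brush or Pybrush API. It only shows the two entry points described above: running on a pre-constructed Dataset, or calling `fit(X, y)` and letting the engine build the dataset from the parameters.

```python
import random
from dataclasses import dataclass


@dataclass
class Parameters:
    # Hypothetical hyper-parameter names, chosen for this sketch only.
    pop_size: int = 20
    generations: int = 30
    validation_size: float = 0.25
    seed: int = 0


class Dataset:
    """Owns the data and the train/validation split."""

    def __init__(self, X, y, validation_size=0.25):
        n_val = int(len(X) * validation_size)
        split = len(X) - n_val
        self.train = (X[:split], y[:split])
        self.validation = (X[split:], y[split:])


class Engine:
    def __init__(self, params):
        self.params = params
        self.best_ = None

    def run(self, dataset):
        """Evolve a slope w for the toy model y = w * x on the training split."""
        rng = random.Random(self.params.seed)
        X, y = dataset.train

        def mse(w):
            return sum((w * a - b) ** 2 for a, b in zip(X, y)) / len(X)

        pop = [rng.uniform(-5, 5) for _ in range(self.params.pop_size)]
        for _ in range(self.params.generations):
            children = [w + rng.gauss(0.0, 0.3) for w in pop]
            # Keep the pop_size candidates with the lowest training error.
            pop = sorted(pop + children, key=mse)[:self.params.pop_size]
        self.best_ = pop[0]
        return self

    def fit(self, X, y):
        """Convenience entry point: build the Dataset from Parameters, then run."""
        dataset = Dataset(X, y, self.params.validation_size)
        return self.run(dataset)

    def predict(self, X):
        return [self.best_ * a for a in X]


X = [float(i) for i in range(1, 21)]
y = [2.0 * a for a in X]

# Entry point 1: let fit() create the dataset from the parameters.
engine_a = Engine(Parameters()).fit(X, y)
# Entry point 2: hand the engine a pre-constructed dataset.
engine_b = Engine(Parameters()).run(Dataset(X, y, 0.25))
```

Keeping the split logic inside Dataset means `fit` is only a thin convenience layer over `run`, so both paths go through the same search code and give the same result for the same seed.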