This PR implements multi-armed bandits to learn sampling probabilities for terminals, operators, and variation operations.
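For readers unfamiliar with the idea, here is a minimal sketch of how a bandit can turn feedback into sampling probabilities. It is not the implementation in this PR: the Beta-Bernoulli Thompson sampling strategy, the `ThompsonBandit` class, the `simulate` helper, and the arm names are all assumptions made for illustration.

```python
import random


class ThompsonBandit:
    """Hypothetical Beta-Bernoulli Thompson sampling bandit over named arms."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per arm, starting from a uniform Beta(1, 1) prior.
        self.params = {arm: [1.0, 1.0] for arm in arms}

    def choose(self):
        # Draw one sample from each arm's posterior and pick the largest draw.
        draws = {arm: self.rng.betavariate(a, b) for arm, (a, b) in self.params.items()}
        return max(draws, key=draws.get)

    def update(self, arm, reward):
        # reward is 1 if the choice helped (e.g. the offspring improved), else 0.
        a, b = self.params[arm]
        self.params[arm] = [a + reward, b + (1 - reward)]

    def probabilities(self):
        # Posterior mean of each arm, normalized so the values sum to 1.
        means = {arm: a / (a + b) for arm, (a, b) in self.params.items()}
        total = sum(means.values())
        return {arm: m / total for arm, m in means.items()}


def simulate(true_rates, rounds=2000, seed=0):
    """Run the bandit against made-up success rates and return the learned probabilities."""
    bandit = ThompsonBandit(list(true_rates), seed=seed)
    env = random.Random(seed + 1)  # separate RNG standing in for the search process
    for _ in range(rounds):
        arm = bandit.choose()
        reward = 1 if env.random() < true_rates[arm] else 0
        bandit.update(arm, reward)
    return bandit.probabilities()
```

Calling `simulate({"point": 0.1, "insert": 0.2, "delete": 0.8})` with these invented success rates should leave the arm that succeeds most often with the largest sampling probability. The same pattern would apply to terminals and operators, with one arm per candidate.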
This PR also implements an archive (Issue #58).
A new classification metric, average_precision_score, is also implemented (partially addressing Issue #57). This was hard to do, mainly because I needed it to work with lexicase selection, which relies on single-test evaluations. Currently, lexicase selection is still based on log loss, while survival, the archive, and the choice of the final individual use the average precision score.
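The difficulty comes from the shape of the two metrics: log loss yields one value per sample, so each sample can serve as a separate test case, while average precision is a ranking metric computed over the whole set. The sketch below illustrates this in plain Python. It is not the code in this PR; the function names are hypothetical, and the average precision helper assumes binary labels and no tied scores.

```python
import math


def per_sample_log_loss(y_true, y_prob, eps=1e-15):
    """Return one loss value per sample, usable as single-test evaluations."""
    losses = []
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return losses


def average_precision(y_true, y_score):
    """Return a single score: the mean of precision values at each positive sample,
    visiting samples in order of descending score (assumes no tied scores)."""
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0
```

For `y_true = [0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, `per_sample_log_loss` returns four values, one per sample, whereas `average_precision` returns the single value 5/6 (about 0.83). Removing or changing one sample shifts the ranking-based score for the whole set, so it has no natural per-sample breakdown.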
This PR also implements two complexity measures: complexity and linear complexity.
I added support for respecting the feature types of pandas data frames on the Python side (Issue #56). However, this change made me notice some undesired type conversions. As a workaround, I sometimes pass X.values so that the pandas datatypes are not carried over.
There is now a progress bar (Issue #20), along with different verbosity levels.
I fixed the mean label so it is no longer weighted, and there have also been many improvements in the logistic nodes for classification problems.
Some new unit tests (mainly in C++) were added.
Performance is not my main concern right now. I am aware of several new TODOs and possible improvements, and I will address them later.