ianhbell opened this issue 7 years ago
Sounds interesting @ianhbell ... Got a citation in mind?
This should be a good point to start reading: https://www.iitk.ac.in/kangal/Deb_NSGA-II.pdf
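The heart of NSGA-II is repeatedly peeling off non-dominated fronts from the population. A minimal, library-free sketch of that sorting step (objective tuples are assumed to be minimized; nothing here is gplearn-specific):

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points):
    """Sort objective vectors into successive Pareto fronts, NSGA-II style."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        # the next front is everything not dominated by any other remaining point
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# e.g. (error, tree_size) pairs, both to be minimized
print(non_dominated_sort([(0.10, 12), (0.05, 30), (0.20, 5), (0.12, 14)]))
# [[0, 1, 2], [3]] — the first three points form the Pareto front
```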
Hi, @ianhbell
Just out of curiosity: suppose I define a complexity measure (the number of nodes in the tree representation of an expression) and use it inside my custom fitness, a bit like so:
from sklearn.metrics import r2_score

def my_custom_fitness(expr, X, y_true):
    # make_prediction and complexity are user-defined helpers (see the sketch below)
    y_pred = make_prediction(expr, X)
    # reward accuracy, penalize tree size
    return r2_score(y_true, y_pred) - (complexity(expr) / 1000)
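make_prediction and complexity above are placeholders. For the node-count idea specifically, one possible complexity helper, assuming expressions are nested tuples of the form (operator, *children) with strings or numbers as leaves, could look like this (in gplearn itself a program is, as far as I can tell, stored as a flat list of functions and terminals, so its size is essentially the length of that list):

```python
def complexity(expr):
    """Count nodes in a nested-tuple expression tree,
    e.g. ('add', 'x0', ('mul', 'x1', 2.0)) has 5 nodes."""
    if isinstance(expr, tuple):
        # one node for the operator, plus every node in each child subtree
        return 1 + sum(complexity(child) for child in expr[1:])
    return 1  # a leaf (feature name or constant) is a single node
```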
Therefore:

- between two expressions with the same r2_score, my_custom_fitness would favour the simplest one
- between two expressions of the same complexity, my_custom_fitness would favour the one that yields the best r2_score

Given these properties, the expression found at the end of fit would be on the Pareto front (at least the one drawn considering all evaluated expressions).
Am I missing something?
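To make the non-dominance part concrete: since my_custom_fitness increases with r2_score and decreases with complexity, its argmax over all evaluated expressions cannot be dominated by any of them. A toy check with made-up candidate scores:

```python
candidates = {  # expression: (r2_score, complexity) — made-up numbers
    "x0": (0.80, 1),
    "x0 + x1": (0.90, 3),
    "x0 + x1*x1": (0.92, 5),
    "big bloated tree": (0.921, 40),
}

def scalarized(r2, size):
    return r2 - size / 1000

def dominated_by(a, b):
    """True if b is at least as accurate and as simple as a, strictly better in one."""
    (ra, ca), (rb, cb) = a, b
    return rb >= ra and cb <= ca and (rb > ra or cb < ca)

best = max(candidates, key=lambda name: scalarized(*candidates[name]))
assert not any(dominated_by(candidates[best], candidates[other])
               for other in candidates if other != best)
print(best)  # 'x0 + x1*x1' — the bloated tree's tiny r2 gain doesn't pay for its size
```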
Answering my own question, with a reference:
Pareto-Front Exploitation in Symbolic Regression (Smits and Kotanchek)
From page 294:
There is, however, a significant difference between using a Pareto front as a post-run analysis tool vs. actively optimizing the Pareto front during a GP-run. In the latter case the Pareto front becomes the objective that is being optimized instead of the fitness (accuracy) of the “best” model.
So yes, I was missing something big
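For contrast, "actively optimizing the Pareto front during a GP-run" usually means making selection itself dominance-aware rather than comparing a single scalar fitness. A rough sketch of one such scheme, a Pareto tournament (this is not gplearn's API, and not necessarily what the chapter itself proposes):

```python
import random

def dominates(a, b):
    """a and b are (error, size) tuples; smaller is better on both objectives."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_tournament(population, objectives, rng=random):
    """Pick two random individuals and return the dominant one (random tie-break)."""
    i, j = rng.sample(range(len(population)), 2)
    if dominates(objectives[i], objectives[j]):
        return population[i]
    if dominates(objectives[j], objectives[i]):
        return population[j]
    return population[rng.choice((i, j))]
```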
Has there been any thought given to Pareto front optimization? There's always a tradeoff between tree size and model fidelity, which I gather you handle with parsimony. But the alternative is to keep any model that is not dominated by another, i.e. everything on the Pareto front. I couldn't see any clear way of hacking that into gplearn.
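In case it helps anyone prototyping this: "keep any model that is non-dominated" can be done outside the library by maintaining a small archive over every evaluated model, along these lines (plain Python, independent of gplearn internals; error and size are whatever you choose to measure):

```python
class ParetoArchive:
    """Keep every (model, error, size) entry that no other kept entry dominates."""

    def __init__(self):
        self.entries = []  # list of (model, error, size); error and size minimized

    @staticmethod
    def _dominates(a, b):
        # a and b are (error, size) pairs
        return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

    def add(self, model, error, size):
        candidate = (error, size)
        # ignore the newcomer if something already kept dominates it
        if any(self._dominates((e, s), candidate) for _, e, s in self.entries):
            return False
        # evict anything the newcomer dominates, then keep it
        self.entries = [entry for entry in self.entries
                        if not self._dominates(candidate, entry[1:])]
        self.entries.append((model, error, size))
        return True
```

Feeding such an archive every program evaluated during fit would yield exactly the non-dominated set described above.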