FFX creates a Pareto front of good models, trading off numBases against accuracy. For some applications we would like to be able to put up just one model as the champion, for example when FFX is being benchmarked against other techniques for symbolic regression, eg https://github.com/EpistasisLab/regression-benchmark.
Our current option (see api.py/FFXRegressor) is just to choose the model of highest accuracy/highest numBases. But there are at least two other options:
Try to find an "elbow" in Pareto front (idea: we are willing to give up a little accuracy for simplicity)
reserve some training data to use as a validation set, and choose the model with best accuracy on the validation set (idea: the more complex models may be overfitting and our goal is to avoid that).
FFX creates a Pareto front of good models, trading off numBases against accuracy. For some applications we would like to be able to put up just one model as the champion, for example when FFX is being benchmarked against other techniques for symbolic regression, eg https://github.com/EpistasisLab/regression-benchmark.
Our current option (see api.py/FFXRegressor) is just to choose the model of highest accuracy/highest numBases. But there are at least two other options: