Symbolic Identification of Non-linear Dynamics. The method generalizes the SINDy algorithm by combining sparse regression with genetic-programming-based symbolic regression.
To reduce not only the number of linearly combined expressions but also the complexity of each expression, the algorithm could use as its fitness not only the score coming from SINDy, but also an additional term that measures how sparse the solution is at the deeper level, looking into f_0, f_1...
One way to do this is to count the number of nodes of each sub-individual associated with a non-zero linear parameter coming from SINDy, then sum these counts to get a sparsity weight in the combinatorial space. The value returned to the GP part of the algorithm would then be the sum of model.score and a weighted value of the total number of nodes (the sign of the weighted term still needs to be checked: it must act as a penalty in whichever direction the GP optimizes). The weight is to be determined, considering the behaviour of the SINDy score.
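The penalized fitness described above could be sketched as follows. This is only an illustration under stated assumptions: sub-individuals are encoded as nested tuples (a hypothetical encoding, not the one used by any particular GP library), there is one linear coefficient per sub-individual, and the score is assumed to be "higher is better", so the node count is subtracted. The names count_nodes, structural_size, penalized_fitness, weight and tol are all made up for this example.

```python
from typing import Sequence, Union

# Hypothetical expression-tree encoding: a leaf is a string such as "x0",
# an internal node is a tuple (operator, child, child, ...).
Tree = Union[str, tuple]


def count_nodes(tree: Tree) -> int:
    """Return the number of nodes (operators plus leaves) in a tree."""
    if isinstance(tree, tuple):
        return 1 + sum(count_nodes(child) for child in tree[1:])
    return 1


def structural_size(
    subindividuals: Sequence[Tree],
    coefficients: Sequence[float],
    tol: float = 1e-10,
) -> int:
    """Sum the node counts of the sub-individuals whose coefficient is non-zero."""
    return sum(
        count_nodes(tree)
        for tree, coef in zip(subindividuals, coefficients)
        if abs(coef) > tol
    )


def penalized_fitness(
    score: float,
    subindividuals: Sequence[Tree],
    coefficients: Sequence[float],
    weight: float,
) -> float:
    """Assumes a higher score is better, so the size term is subtracted."""
    return score - weight * structural_size(subindividuals, coefficients)


# f_0 = x0, f_1 = x0 * x1, f_2 = sin(x0 + x1); the sparse fit dropped f_1.
subindividuals = ["x0", ("mul", "x0", "x1"), ("sin", ("add", "x0", "x1"))]
coefficients = [1.5, 0.0, -0.3]
size = structural_size(subindividuals, coefficients)  # 1 node + 4 nodes
fitness = penalized_fitness(0.98, subindividuals, coefficients, weight=0.01)
```

If the GP minimizes its fitness instead, the same size term would be added to an error measure rather than subtracted from a score.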
Alternative solution: treat the problem as a multi-objective optimization problem and produce a Pareto front of best-fitting vs. sparse solutions, so that the user can choose the desired level of sparsity.
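A minimal sketch of the Pareto-front idea, assuming each candidate model is summarized by a pair (score, size) where the score is to be maximized and the size (for example the total node count) is to be minimized. The names Candidate, dominates and pareto_front are hypothetical, and the brute-force filter below is just the simplest way to illustrate non-dominated selection; a real GP run would more likely rely on a multi-objective selection scheme from the GP library in use.

```python
from typing import List, Tuple

# (fit score, higher is better; structural size, lower is better)
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the non-dominated candidates, sorted by increasing size."""
    front = {
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    }
    return sorted(front, key=lambda c: c[1])


# Made-up (score, size) pairs, for illustration only.
candidates = [(0.99, 12), (0.97, 5), (0.90, 5), (0.80, 2), (0.95, 9)]
front = pareto_front(candidates)
# (0.90, 5) and (0.95, 9) are both dominated by (0.97, 5) and are dropped.
```

The user can then scan the front from the smallest model upwards and stop at the first one whose score is acceptable.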