Smantii closed this issue 2 weeks ago
There are several separate issues that lead Operon to produce this expression, which is longer than 20 symbols. The main reason is that the internal representation of expressions differs from the representation chosen for the output.
1. [...] `length`. The solutions also have a `model_complexity`, which also includes the coefficients and the corresponding multiplication symbols. Note that neither measure counts the linear scaling terms (see 2.). https://github.com/heal-research/pyoperon/blob/7130a875b5565a28403df1746822107412d455ee/pyoperon/sklearn.py#L565-L573
2. [...] `length` and `model_complexity`. Linear scaling can be turned off; alternatively, reduce the length limit by four nodes.

The expression:
(0.0145737370 + (0.9973683357 * (((2.1865003109 * X3) / (sqrt(1 + (2.9363973141 * X2) ^ 2))) - ((sin(((0.7525053024 * X2) + ((-0.2822726369) * X4))) * ((-0.1257296056) * X4)) - ((((4.4178400040 * X2) + ((-2.0412197113) * X1)) + (2.8924779892 * X4)) * ((0.0185590051 * X1) - ((-0.0078712795) * X3)))))))
is internally represented as:
offset + scale * (aq(c1_x3, c2_x2) - (sin(c3_x2 + c4_x4) * c5_x4 - ((c6_x2 + c7_x1 + c8_x4) * (c9_x1 - c10_x3))))
Ignoring the 4 symbols for offset and scale (the two constants plus one addition and one multiplication), this has exactly 20 symbols.
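To make the count easy to check, here is a minimal sketch that encodes the internal form as a tree and counts its nodes. The nested-tuple encoding and the `count_nodes` helper are hypothetical and only for illustration; they are not part of Operon or pyoperon. It assumes each weighted variable such as `c1_x3` is a single node and that additions are binary.

```python
# Hypothetical encoding (not Operon's API): an inner node is a tuple
# (operator, child, ...); a leaf is a string such as "c1_x3", i.e. a
# variable together with its coefficient, counted as ONE node.
def count_nodes(tree):
    if isinstance(tree, str):
        return 1
    _op, *children = tree
    return 1 + sum(count_nodes(child) for child in children)

# Structure taken from the printed expression above.
model = (
    "sub",
    ("aq", "c1_x3", "c2_x2"),
    ("sub",
        ("mul", ("sin", ("add", "c3_x2", "c4_x4")), "c5_x4"),
        ("mul",
            ("add", ("add", "c6_x2", "c7_x1"), "c8_x4"),
            ("sub", "c9_x1", "c10_x3"))),
)

# Linear scaling wraps the model as: offset + scale * model
scaled = ("add", "offset", ("mul", "scale", model))

print(count_nodes(model))   # 20
print(count_nodes(scaled))  # 24
```

The four extra nodes in `scaled` are the ones that come from linear scaling, which is why lowering the length limit by four would keep the printed expression within 20 symbols.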
Thank you so much @gkronber, your answer clarifies everything
Hi,
I am sorry to ask questions (again) through an issue. I take this opportunity to thank the whole Operon team, because this library is great! I have a doubt about the definition used for `length` and `max_length`. In the Operon paper, the length of an individual is defined as the number of nodes in its tree representation. But when I ran the following example
I got the following output
and the final expression has a length higher than 20. I am sure that I am missing something; can you help me understand what you mean by `model_length`?