marcovirgolin / GP-GOMEA

Genetic Programming version of GOMEA. Also includes standard tree-based GP, and Semantic Backpropagation-based GP
Apache License 2.0

constant in symbolic #13

Closed: omidr1370 closed this issue 3 years ago

omidr1370 commented 3 years ago

Hello, thank you for your SR method.

Is there any way to not get the constant in the result? For example, I received this equation: 0.011223+-0.012943*(((x3+(x8+(x7+x2)))-((x2+(x8+x1))+((x3+x4)-(x3-x3)))))

How can I change the settings so that I do not get the first term (0.011223)? In other words, I want every constant to be multiplied by an input feature.

Thanks.

marcovirgolin commented 3 years ago

Hi!

Yes, you can evolve solutions that have no constants.

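1) The two constants you refer to (in this case, 0.011223 and -0.012943) come from an affine transformation known as "linear scaling": the first is an intercept and the second is a slope applied to the evolved expression. To disable it, set "linearscaling=False".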
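To make the role of those two constants concrete, here is a minimal sketch of how an intercept `a` and a slope `b` can be fitted to the raw output of an evolved expression by ordinary least squares. The function name `linear_scaling` and the toy data are assumptions made for illustration; this is the textbook formula, not a copy of the GP-GOMEA implementation.

```python
def linear_scaling(outputs, targets):
    """Return (a, b) minimizing sum((a + b*f - y)^2) over paired values.

    outputs: raw predictions f of the evolved expression (hypothetical input)
    targets: true values y
    """
    n = len(outputs)
    mean_f = sum(outputs) / n
    mean_y = sum(targets) / n
    var_f = sum((f - mean_f) ** 2 for f in outputs)
    if var_f == 0.0:
        # A constant expression cannot be scaled: fall back to the target mean.
        return mean_y, 0.0
    cov = sum((f - mean_f) * (y - mean_y) for f, y in zip(outputs, targets))
    b = cov / var_f           # slope
    a = mean_y - b * mean_f   # intercept
    return a, b


# Toy data: the targets are exactly 3 + 2 * raw.
raw = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 7.0, 9.0, 11.0]
a, b = linear_scaling(raw, y)
print(a, b)
```

In the equation quoted in the question, 0.011223 plays the role of `a` and -0.012943 the role of `b`.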

2) Constants can also appear within the expression if ephemeral random constants (ERCs) are enabled. To disable them, set "erc=False".

omidr1370 commented 3 years ago

Thanks for the swift reply.

When I disable "linear scaling," I can't see any constants. I want constants, but I don't want them to appear alone. I want my equation to be in the form: C0X0+C1X1+C2X2+... I don't want to receive the equation in this form: C0+C1X1+C2X2+... or X0+X1+... (I receive such an equation when I disable the linear scaling.)

C0,C1,C2,... are constants and X0,X1,X2,... are input features.

Is there any way to receive the equation in my desired format?

marcovirgolin commented 3 years ago

It is not possible with the current code version to obtain an equation that is exactly as you ask, i.e.,

C0X0+C1X1+....+CnXn.

But please note: if you mean to have exactly this equation form, i.e., a linear combination of the input features, each weighted by some constant, then you should use a linear regression algorithm instead of a symbolic regression one!
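For reference, the form C0X0+C1X1+....+CnXn with no standalone constant is what least-squares linear regression without an intercept produces. The following is a rough, dependency-free sketch that solves the normal equations directly; the function name `fit_no_intercept` and the toy data are illustrative assumptions and are unrelated to the GP-GOMEA code base.

```python
def fit_no_intercept(X, y):
    """Least-squares coefficients c for y ~ sum_j c[j] * X[i][j] (no intercept)."""
    n_feat = len(X[0])
    # Normal equations (X^T X) c = X^T y, stored as an augmented matrix.
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n_feat)]
        + [sum(row[i] * t for row, t in zip(X, y))]
        for i in range(n_feat)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_feat):
        pivot = max(range(col, n_feat), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n_feat):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][n_feat] / A[i][i] for i in range(n_feat)]


# Toy data generated from y = 2*x0 - 3*x1, with no constant term.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -3.0, -1.0, 1.0]
coeffs = fit_no_intercept(X, y)
print(coeffs)
```

In practice a library routine is preferable to hand-written elimination, for example scikit-learn's `LinearRegression(fit_intercept=False)`.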

Linear scaling (option 1 mentioned above) adds an intercept constant, but you can modify the code (starting from src/Fitness/SymbolicRegressionLinearScalingFitness.cpp) so that no intercept is computed.
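As a sketch of what "no intercept" could mean mathematically: if only a slope `b` is fitted, the least-squares solution is b = sum(f*y) / sum(f*f). The Python below illustrates that formula only; the function name `slope_only_scaling` is hypothetical, and this is not the change that would have to be made in the C++ file mentioned above.

```python
def slope_only_scaling(outputs, targets):
    """Return b minimizing sum((b*f - y)^2), i.e. scaling with no intercept."""
    denom = sum(f * f for f in outputs)
    if denom == 0.0:
        # All raw outputs are zero, so no slope can improve the fit.
        return 0.0
    return sum(f * y for f, y in zip(outputs, targets)) / denom


# Toy data: the targets are exactly 2 * raw.
raw = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
b = slope_only_scaling(raw, y)
print(b)
```

With this variant the final model would read b*f(x), so any standalone constant would have to come from the evolved expression f itself.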

Otherwise, if you use ERCs (option 2 mentioned above), then genetic programming will use constants during the search process in an arbitrary way: whether they appear in the final solution, and where, depends on the evolutionary process.

I hope this helps!