Open jmmcd opened 1 month ago
This generator is implemented this way because this is the standard GP growth mutation.
I agree that this unnecessarily couples the number of variables and (expected) tree depth.
Sure, we could use a fixed probability.
I am guessing this must have been noticed before by GP researchers, and I wonder what the 'standard' solution is.
On Sun, 20 Oct 2024 at 20:29, James McDermott wrote:
The Symbolic Regression generator has:
if rnd.random() < len(term_set)/(len(term_set)+len(func_set)):
With this, the probability of choosing a terminal gets larger when we have more variables (which are in term_set). Choosing more terminals makes the tree smaller.
We could rewrite this in a different form, e.g. use a fixed probability such as 0.5, to avoid this effect.
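To make the coupling concrete, here is a rough sketch of the effect, not the actual PTO generator. It assumes a hypothetical function set of four binary operators, variables named x0, x1, ..., and a depth cap of 6; the names terminal_probability, grow and size are made up for illustration. It compares the size-dependent terminal probability from the condition above with a fixed probability of 0.5.

```python
import random

def terminal_probability(term_set, func_set):
    """Probability of picking a terminal under the current condition."""
    return len(term_set) / (len(term_set) + len(func_set))

def grow(rnd, term_set, func_set, p_terminal, depth=0, max_depth=6):
    """Grow a random tree as nested tuples (func, left, right).

    Assumption: every function is binary, and max_depth forces a terminal.
    """
    if depth >= max_depth or rnd.random() < p_terminal:
        return rnd.choice(term_set)
    func = rnd.choice(func_set)
    left = grow(rnd, term_set, func_set, p_terminal, depth + 1, max_depth)
    right = grow(rnd, term_set, func_set, p_terminal, depth + 1, max_depth)
    return (func, left, right)

def size(tree):
    """Count the nodes in a tree built by grow()."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1

def mean_size(p_terminal, term_set, func_set, n_trees=2000, seed=0):
    """Average tree size over n_trees samples with a fixed seed."""
    rnd = random.Random(seed)
    total = sum(size(grow(rnd, term_set, func_set, p_terminal))
                for _ in range(n_trees))
    return total / n_trees

func_set = ["+", "-", "*", "/"]  # hypothetical function set
for n_vars in (1, 4, 16):
    term_set = [f"x{i}" for i in range(n_vars)]
    p = terminal_probability(term_set, func_set)
    print(f"n_vars={n_vars:2d}  p_terminal={p:.2f}  "
          f"mean size (coupled)={mean_size(p, term_set, func_set):.1f}  "
          f"mean size (fixed 0.5)={mean_size(0.5, term_set, func_set):.1f}")
```

With the coupled probability, the terminal probability rises from 0.2 to 0.8 as the number of variables goes from 1 to 16, so the sampled trees should shrink. With the fixed 0.5, the number of variables only affects which terminal is chosen, not when growth stops.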