Program-Trace-Optimisation / PTO

Program Trace Optimisation
GNU General Public License v3.0

Use a fixed threshold in SR generator #2

Open jmmcd opened 1 month ago

jmmcd commented 1 month ago

The Symbolic Regression generator has:

`if rnd.random() < len(term_set)/(len(term_set)+len(func_set)):`

With this, the probability of choosing a terminal grows with the number of variables, because each variable is an entry in `term_set`. A higher terminal probability makes the generated trees smaller on average.
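To put rough numbers on this, here is a back-of-the-envelope branching-process estimate. It is only a sketch under two assumptions that the issue does not state: every function is binary, and there is no depth limit. Write $T$ for `len(term_set)`, $F$ for `len(func_set)`, $p$ for the terminal probability, and $m$ for the expected number of children per node.

$$
p = \frac{T}{T+F}, \qquad m = 2(1-p), \qquad \mathbb{E}[\text{size}] = \sum_{d \ge 0} m^{d} = \frac{1}{1-m} = \frac{1}{2p-1} \quad \left(p > \tfrac{1}{2}\right)
$$

Under these assumptions, with $F = 4$, going from $T = 8$ to $T = 16$ moves $p$ from $2/3$ to $0.8$ and the expected size from $3$ to $5/3$. For $p \le 1/2$ the sum diverges, so in this simplified model a depth limit is what keeps the expected size finite.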

We could change this, e.g. use a fixed probability such as 0.5, so that the terminal probability no longer depends on the number of variables.
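The effect and the proposed fix can be checked with a small simulation. This is a minimal sketch, not the PTO generator itself: `FUNC_SET`, `make_term_set`, `gen_tree`, `mean_size`, the all-binary function set, and the depth limit of 6 are all assumptions made for illustration. Only the proportional threshold expression is taken from the issue.

```python
import random

# Hypothetical primitive sets for illustration; the real PTO sets may differ.
FUNC_SET = ["+", "-", "*", "/"]  # assumed: all functions are binary


def make_term_set(n_vars):
    """Build a terminal set containing only variables x0..x{n_vars-1}."""
    return [f"x{i}" for i in range(n_vars)]


def p_terminal_proportional(term_set, func_set):
    """Threshold from the issue: share of terminals among all primitives."""
    return len(term_set) / (len(term_set) + len(func_set))


def gen_tree(rnd, term_set, func_set, p_term, depth=0, max_depth=6):
    """Grow-style generator: emit a terminal with probability p_term,
    and always emit one once max_depth is reached."""
    if depth >= max_depth or rnd.random() < p_term:
        return rnd.choice(term_set)
    return (
        rnd.choice(func_set),
        gen_tree(rnd, term_set, func_set, p_term, depth + 1, max_depth),
        gen_tree(rnd, term_set, func_set, p_term, depth + 1, max_depth),
    )


def size(tree):
    """Count the nodes in a tree (a str leaf or a (func, left, right) tuple)."""
    return 1 if isinstance(tree, str) else 1 + size(tree[1]) + size(tree[2])


def mean_size(n_vars, p_term=None, n=2000, seed=0):
    """Average tree size over n samples; p_term=None uses the proportional rule."""
    rnd = random.Random(seed)
    terms = make_term_set(n_vars)
    p = p_terminal_proportional(terms, FUNC_SET) if p_term is None else p_term
    return sum(size(gen_tree(rnd, terms, FUNC_SET, p)) for _ in range(n)) / n


# Columns: number of variables, mean size (proportional), mean size (fixed 0.5)
for n_vars in (1, 4, 16):
    print(n_vars, mean_size(n_vars), mean_size(n_vars, p_term=0.5))
```

With the proportional rule the mean size should fall as variables are added, while with `p_term=0.5` it should stay roughly constant across the three settings.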

amoraglio commented 1 week ago

This generator is implemented this way because this is the standard GP growth mutation.

I agree that this unnecessarily couples the number of variables and (expected) tree depth.

Sure, we could use a fixed probability.

I'd guess GP researchers have noticed this before. Is there a 'standard' solution?

