sisl / ExprOptimization.jl

Algorithms for optimization of Julia expressions
Other
44 stars 11 forks source link

Question: Optimal parameters for GE #16

Closed flare9x closed 5 years ago

flare9x commented 5 years ago
Random.seed!(1)
p = GrammaticalEvolution(grammar,:Real,1000,20,10,10,6,0.2,0.4,0.4; select_method=GrammaticalEvolutions.TruncationSelection(300))
results_ge = optimize(p, grammar, :Real, loss)
(results_ge.expr, results_ge.loss)
grammar::Grammar: grammar
typ::Symbol: start symbol
pop_size::Int: population size
iterations::Int: number of iterations
init_gene_length::Int: initial length of genotype integer array
max_gene_length::Int: maximum length of genotype integer array
max_depth::Int: maximum depth of derivation tree
p_reproduction::Float64: probability of reproduction operator
p_crossover::Float64: probability of crossover operator
p_mutation::Float64: probability of mutation operator
select_method::SelectionMethod: selection method (default: tournament selection)
mutate_method::InitializationMethod: mutation method (default: multi-mutate)

Is there any formal process using the Grammatical Evolution example on how the parameters were chosen to suit the example problem?

What is maximum depth and how does that affect the GE process? What are the implications of setting low / high probabilities of mutation / crossover / reproduction? Also init_gene - how does that affect the GE process?

Be nice to have author response but happy to be sent to external source!

Thanks! Andrew

rcnlee commented 5 years ago

Hi Andrew,

These parameters are generally tuned by hand and I am not aware of any formal way to set them. It is useful to understand how the underlying algorithm works since that will give some intuition as to the effects of the parameters.

The maximum depth limits the depth of the derivation tree of expressions. When expanding expressions, non-terminals are replaced using production rules, which themselves can contain non-terminals. The maximum depth limits the number of recursive calls to avoid infinite recursion. For example, (A and A) can be expanded (A and (A and A)) can be expanded (A and (A and (A and A))) by calling itself. Max depth limits the levels of nesting. Crossover/mutation/reproduction probabilities should add up to 1. Setting these is kind of an art. You can look at the diversity of the expressions in your population at each generation. If it seems like the population doesn't have a lot of diversity, you might want to try upping the crossover/mutation rates, which introduces new individuals to the mix. Conversely, there is a benefit to keeping individuals around for at least a few generations so that they get to demonstrate their worth (which the reproduction rate helps). Init_gene_length is the initial length of the integer array that is the intermediate representation in GE. The gene length can grow to max_gene_length. If you derive a big expression and run out of gene length, the expression will fail, so you want to make sure you have enough for the expressions that you're interested in. If your gene length is too long, then it wastes memory and slows computations, but more importantly, it reduces the effectiveness of the genetic operators. To pick a reasonable number, you can draw the derivation tree of a large expression that you might expect to see, and count the number of non-terminals, then add a little buffer.

This handbook has a lot of good info. Tutorials specifically targeting GE are also good. http://www0.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/poli08_fieldguide.pdf

flare9x commented 5 years ago

Many thanks!