Closed — jakobj closed this issue 3 years ago
Sounds good to me. Are you suggesting that we should implement NES in this library or publish it in another library and link to that?
hmm, this is a tough question. my first feeling was to just implement it here, but if we can keep these modules separate that would of course be better in terms of general usability. however it would require maintenance of an additional library.
do you have any preference?
I think that I'd prefer to implement NES in a separate library, or find a library that has already done that.
ok, then let's do that. do you want to take care of this?
Hi, so, I changed my mind and think that we should add the NES algorithm to hal-cgp
rather than maintaining another external library. I am willing to do that, however, I see one issue:
NES requires both a mean and a variance term for parameters, because it operates with search distributions. In our current implementation, the parameterized nodes only have point estimates of parameters. Do you think we should add versions of the parameter nodes with mean and variance terms for each parameter? Or modify the existing nodes?
> Hi, so, I changed my mind and think that we should add the NES algorithm to hal-cgp rather than maintaining another external library.
ok, sounds great! :+1:
> I am willing to do that, however, I see one issue:
> NES requires both a mean and a variance term for parameters, because it operates with search distributions. In our current implementation, the parameterized nodes only have point estimates of parameters. Do you think we should add versions of the parameter nodes with mean and variance terms for each parameter? Or modify the existing nodes?
that is a good point! i would always use the mean as the value for the parameter. however, we need to think about whether we want to store the variance from one local search to the next, such that it is kept over generations instead of being reinitialized every time. would this make sense? consider the case where the expression has changed: would using the previous variance make sense as an initial value for the current local search step? intuitively i would say: sometimes yes, sometimes no, so maybe using the (user defined) initial value each time is an equally good approach with less bookkeeping?
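To make the trade-off concrete, here is one way the "mean as value, user-defined variance reset" option could look. This is purely an illustrative sketch: `DistributionParameterNode`, its attributes, and `reset_variance` are hypothetical names invented for this example, not part of hal-cgp's actual node classes.

```python
from dataclasses import dataclass

# Illustrative sketch only -- hal-cgp's real node classes look different.
# A parameterized node carries a search distribution (mean, variance)
# instead of a point estimate: the mean is used as the node's value
# during graph evaluation, and the variance is reset to a user-defined
# initial value whenever the expression changes (the "less bookkeeping"
# option discussed above).

@dataclass
class DistributionParameterNode:
    mean: float = 1.0
    variance: float = 1.0          # search-distribution width used by NES
    initial_variance: float = 1.0  # user-defined reset value

    @property
    def value(self) -> float:
        # graph evaluation sees only the point estimate, i.e., the mean
        return self.mean

    def reset_variance(self) -> None:
        # called when the surrounding expression has changed, so the old
        # variance no longer reflects a meaningful search scale
        self.variance = self.initial_variance
```

With this layout, carrying the variance across generations would simply mean *not* calling `reset_variance` after a structural mutation, so both policies fit the same data structure.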
a possible external library solution: https://github.com/rasmusbergpalm/evostrat ?
Could differentiable genetic programming be an alternative to NES? If I understand it correctly, the key feature of this approach is that it does not require differentiable objectives (despite the irritating name). There is an implementation of the underlying automatic differentiation here, which might be useful.
it could be an alternative, but i thought it does need differentiable objectives? did i miss something? how would they obtain gradients otherwise?
Ah yes, you are right. I misread the part where the objective function still must use the graph-represented function explicitly. Nevermind, I will argue the use of dCGP in another context ^^
@mschmidt87 fyi i'm working on an implementation here: ~~https://github.com/jakobj/hal-cgp/tree/enh/evolutionary-strategies~~ https://github.com/jakobj/hal-cgp/tree/enh/local-search-es, surprise: it's not as easy as i thought :P
The currently available local search strategy can only be applied to objectives that are differentiable. We should provide an alternative method that does not require this. For simplicity (and legacy ;) ) reasons I would suggest implementing natural evolution strategies (NES).
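To illustrate why NES fits here: it estimates a search gradient purely from fitness evaluations of sampled parameter vectors, so the objective never needs to be differentiated. Below is a minimal sketch of one update step of a simple separable NES on a toy quadratic objective; all names (`nes_step`, the hyperparameter defaults) are invented for this example and are not hal-cgp API.

```python
import numpy as np

def nes_step(f, mu, sigma, pop_size=50, lr_mu=0.1, lr_sigma=0.05, rng=None):
    """One update of a simple separable NES (illustrative sketch).

    Gradients of the expected fitness are estimated purely from
    function evaluations, so f need not be differentiable.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((pop_size, mu.size))  # standard-normal perturbations
    samples = mu + sigma * eps                      # sample around the current mean
    fitness = np.array([f(x) for x in samples])
    # standardize fitness so step sizes are insensitive to its scale
    util = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
    # search-gradient estimates for the mean and the log of the std
    grad_mu = (util[:, None] * eps).mean(axis=0)
    grad_log_sigma = (util[:, None] * (eps ** 2 - 1.0)).mean(axis=0)
    mu = mu + lr_mu * sigma * grad_mu
    sigma = sigma * np.exp(lr_sigma * grad_log_sigma)
    return mu, sigma

# toy usage: maximize f(x) = -||x||^2, whose optimum is at the origin;
# mu should move toward it without any gradient of f being computed
rng = np.random.default_rng(1234)
mu, sigma = np.ones(3), np.ones(3)
for _ in range(200):
    mu, sigma = nes_step(lambda x: -np.sum(x ** 2), mu, sigma, rng=rng)
```

In the hal-cgp setting, `mu` and `sigma` would correspond to the per-parameter mean and variance terms discussed above, and `f` to the (possibly non-differentiable) objective evaluated on the graph with parameters substituted in.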