[Noisy NSGA2] Embedding method gives solutions with too few replications

JusteRaimbault commented 5 years ago

I have come across this several times: when running a noisy NSGA2, some solutions have too few replications. Most of the time, filtering the final population on the number of replications does the trick. It sometimes becomes problematic, however (as we were discussing with @mathieuleclaire this morning), when very few solutions have more than, let's say, 10 replications, and these are relatively bad with regard to the optimization objective. This is due to the way we handle stochasticity: we embed the problem into a higher dimension by adding an objective that is a decreasing function of the number of replications. Work is still in progress (here: https://github.com/openmole/mgo-benchmark) to test this method against other ways of handling stochasticity, such as a fixed number of replications, adaptive sampling, Kalman-based uncertainty computation, the RTEA algorithm, etc. So for the moment I would advise:

- either to go with the fixed replications method: wrap your stochastic model into a Replication task and use a deterministic NSGA2; this should be fine if the computation budget allows it.
- or to add a parameter for a minimal number of replications per point (we only have a max for now); this would be a "patch", but it ensures that no outlier with 1 replication ends up in the final population (10 replications, for example, remains cheap but gives a bit more robustness). This is however against the philosophy of the embedding method (finding compromises between the optimization objective and confidence in the solution). I would be interested to hear your thoughts on that!
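To make the trade-off concrete, here is a minimal Python sketch of the embedding idea and of the post-hoc filter on the number of replications. It is not the mgo implementation: the function names, the use of the median as aggregate, and `1 / replications` as the extra objective are all assumptions chosen for illustration.

```python
from statistics import median


def embedded_objectives(samples):
    """Map noisy fitness samples to (aggregated fitness, 1 / replications).

    Both objectives are minimised. The median and the 1/n form are
    illustrative assumptions, not necessarily what mgo uses.
    """
    return (median(samples), 1.0 / len(samples))


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(population):
    """Keep the individuals whose embedded objectives are non-dominated."""
    objectives = [embedded_objectives(samples) for samples in population]
    return [
        samples
        for samples, obj in zip(population, objectives)
        if not any(dominates(other, obj) for other in objectives)
    ]


def filter_min_replications(population, min_replications):
    """Hypothetical 'patch': drop individuals with too few replications."""
    return [samples for samples in population if len(samples) >= min_replications]


# Each individual is represented only by its list of noisy fitness samples.
population = [
    [0.1],  # one lucky draw
    [1.0, 1.2, 0.9, 1.1, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 1.0],  # 12 replications
    [2.0, 2.1, 1.9],  # 3 replications, worse on both objectives
]

front = pareto_front(population)
robust_front = filter_min_replications(front, min_replications=10)
```

In this toy population the single-draw individual stays in the front because nothing beats it on the fitness objective, and only the filter removes it. That is exactly the tension mentioned above: the filter overrides the compromise that the extra objective is supposed to express.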

mathieuleclaire commented 5 years ago

After discussing this later with @romainreuillon, it seems that a bug was introduced recently by the last code refactoring. Having so few replications for good candidates is not normal behaviour. We have to investigate this first before adding new features.

JusteRaimbault commented 5 years ago

OK, let's see if the bug can be fixed; it's true that even with a reevaluation rate of 0.2 by default these solutions should be quickly removed. However, I do not think this addresses the big picture: with the way we handle it through a Pareto front, I think you can end up in situations where the noise is fat-tailed and such "one-replication" solutions end up in the Pareto front. Once they are reevaluated they are dominated and removed, but they appear again later through mutation/crossover.
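A rough way to check this intuition is a small Monte Carlo sketch in Python. Everything here is an assumption made for illustration (the incumbent value, the Cauchy noise as the fat-tailed case, the scale, the median as aggregate); it does not reproduce the mgo algorithm, only the question of how often a genuinely worse candidate looks better than a well-estimated incumbent.

```python
import math
import random
from statistics import median


def cauchy_draw(rng, location, scale):
    """One fat-tailed sample (Cauchy), via the inverse CDF."""
    return location + scale * math.tan(math.pi * (rng.random() - 0.5))


def gaussian_draw(rng, location, scale):
    """One thin-tailed sample, for comparison."""
    return rng.gauss(location, scale)


def front_entry_rate(draw, replications, incumbent=0.0, location=1.0,
                     scale=0.3, trials=10_000, seed=42):
    """Fraction of trials in which a worse candidate (true location 1.0)
    has a median over `replications` samples below the incumbent (0.0).
    Minimisation is assumed; all default values are hypothetical.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        samples = [draw(rng, location, scale) for _ in range(replications)]
        if median(samples) < incumbent:
            wins += 1
    return wins / trials


heavy_single = front_entry_rate(cauchy_draw, replications=1)
gauss_single = front_entry_rate(gaussian_draw, replications=1)
heavy_ten = front_entry_rate(cauchy_draw, replications=10)
```

With these assumed parameters, the analytic probability that a single Cauchy draw falls below the incumbent is roughly 9%, against well under 0.1% for Gaussian noise of the same scale, and the rate collapses once the candidate has 10 replications. So reevaluation does remove such points, but each new offspring evaluated once gets the same chance of entering the front again.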

romainreuillon commented 5 years ago

I think that I found an explanation for this bug. It is not trivial to fix though.

openmole / mgo

[Noisy NSGA2] Embedding method gives solutions with too few replications #8