Closed bjuergens closed 3 years ago
Consider these two experiments. They are identical except that the latter used fixed random seeds and the former didn't.
In the former, each individual was evaluated on a different seed. As a result, some individuals gained a good score simply because they received a lucky seed with a simple environment, so the maximum score in the former experiment is much higher.
In the latter, each individual in a generation was evaluated on the same seed. Individuals within a generation are therefore more comparable to each other, as their fitness relies less on luck and more on ability. This experiment thus produced better individuals, which results in a higher validation curve.
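To make the difference between the two setups concrete, here is a minimal sketch in Python. It is not the code from this repository: `evaluate`, `evaluate_generation`, and the flag `fixed_seed_per_generation` are hypothetical names, and the "environment" is a toy in which fitness is just a true ability plus a seed-dependent luck term.

```python
import random


def evaluate(ability, seed):
    """Toy fitness (assumption): true ability plus seed-dependent luck."""
    luck = random.Random(seed).uniform(-1.0, 1.0)
    return ability + luck


def evaluate_generation(abilities, generation, fixed_seed_per_generation):
    """Evaluate all individuals of one generation.

    With fixed_seed_per_generation=True every individual sees the same seed,
    so the luck term is identical for all of them and cancels out in the
    ranking. Otherwise each individual draws its own seed.
    """
    rng = random.Random(generation)  # seeds derived from the generation index
    if fixed_seed_per_generation:
        shared_seed = rng.randrange(2**31)
        seeds = [shared_seed] * len(abilities)
    else:
        seeds = [rng.randrange(2**31) for _ in abilities]
    return [evaluate(a, s) for a, s in zip(abilities, seeds)]


abilities = [0.1, 0.2, 0.3, 0.4]
shared = evaluate_generation(abilities, generation=0, fixed_seed_per_generation=True)
mixed = evaluate_generation(abilities, generation=0, fixed_seed_per_generation=False)
print("same seed:      ", shared)
print("different seeds:", mixed)
```

With a shared seed, the ranking inside a generation always follows ability in this toy model. With per-individual seeds, a weak individual can outrank a strong one whenever its luck term is large enough.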
Here is also a version with MU_ES and fixed seeds:
It's largely the same as the CMA_ES version, except that it's more erratic and grows a little more steeply in the beginning. Interestingly, the val_fit curve plateaus at the same fitness as in the CMA_ES version, but the average fitness is much higher in the MU_ES version than with CMA_ES.
I conclude that the graphs for CMA_ES and MU_ES are very different on every line except val_fit. But since val_fit is about the same, their actual training quality is the same.
I also conclude that fixing seeds is indeed beneficial.
Additionally, when seeds are not fixed, the validation curve even drops towards the end, which I suppose is because the population optimizes for exploiting lucky seeds instead of for generally good performance.
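The gap between a lucky training score and validation can be illustrated with another small Python sketch. It assumes that validation means averaging over a fixed set of held-out seeds, which may differ from how val_fit is actually computed here; `toy_episode`, `VALIDATION_SEEDS`, and `validation_fitness` are hypothetical names.

```python
import random


def toy_episode(ability, seed):
    """Toy episode score (assumption): true ability plus seed-dependent luck."""
    return ability + random.Random(seed).uniform(-1.0, 1.0)


# Hypothetical held-out seeds, identical for every individual.
VALIDATION_SEEDS = list(range(1000, 1020))


def validation_fitness(ability, seeds=VALIDATION_SEEDS):
    """Average score over the same held-out seeds for every individual."""
    scores = [toy_episode(ability, s) for s in seeds]
    return sum(scores) / len(scores)


# A weaker individual may win a single episode on a lucky seed ...
print("single episodes:", toy_episode(0.4, seed=1), toy_episode(0.5, seed=2))
# ... but on shared validation seeds the luck term is the same for both.
print("validation:     ", validation_fitness(0.4), validation_fitness(0.5))
```

Because every individual is scored on the same validation seeds, an individual selected for a lucky training seed gains nothing here, which is consistent with the falling validation curve described above.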
will be easy to do once https://github.com/neuroevolution-ai/NeuroEvolution-CTRNN_new/issues/39 is implemented