pkalivas / radiate

A genetic programming engine which evolves solutions through asynchronous speciation.
MIT License

Documentation for Config fields #8

Closed rsdy closed 4 years ago

rsdy commented 4 years ago

I'm playing around with radiate, and I'm a bit confused as to what the population config means. E.g. how does distance impact my performance? Where is it used? What does species_target mean during an optimization?

Thanks for the library, it was really easy to get started and get some results!

pkalivas commented 4 years ago

Hey, I should probably add more docs around those - there's a lot of little knobs you can turn.

The goal of the Config struct is essentially to control genetic diversity within the population. It controls how Niches are created and how broad their scope is. This allows the engine to evolve different types of solutions to the same problem in order to find the optimal one. This article gives a much more in-depth explanation of why this is necessary for genetic algorithms, along with some other solutions, if you are curious.

Each generation, every genome is analyzed according to its genetic make-up (think of people). That genetic make-up is given a score, which is then used to figure out how similar the genomes are. This is how the Niches are created. Each Niche is a collection of similarly structured genomes which evolve together. In order to do this there needs to be some sort of cut-off criterion where we say "these two genomes are too different to be considered part of the same Niche" - this is the distance parameter.

So back to the people analogy: say we are scoring people by hair color. We have three people: one with blonde hair and a score of 0, one with black hair and a score of 5, and one with brown hair and a score of 3. How do we decide where the brown-haired person belongs? Their hair isn't exactly black, but it isn't blonde either. If we set the distance to 1, they would become a brand new Niche of people with brown hair, because abs(0 - 3) = 3 is greater than the distance and abs(5 - 3) = 2 is also greater than the distance. But if we set the distance to 3, the brown-haired person would fall within the black hair Niche (bucket), because abs(5 - 3) = 2 is less than the distance.
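The hair-color example can be sketched in a few lines of Rust. This is only an illustration of the cut-off idea, not radiate's actual implementation: `assign_niche` is a hypothetical helper, each Niche is assumed to be represented by a single score, and "within the distance" is assumed to mean strictly less than it.

```rust
/// Hypothetical helper (not part of radiate): returns the index of the first
/// niche whose representative score is strictly within `distance` of `score`,
/// or `None` if the genome is too different from every existing niche.
fn assign_niche(representatives: &[f64], score: f64, distance: f64) -> Option<usize> {
    representatives
        .iter()
        .position(|rep| (rep - score).abs() < distance)
}

fn main() {
    // Existing niches: blonde hair (score 0) and black hair (score 5).
    let niches = [0.0, 5.0];
    // The brown-haired person has a score of 3.
    let brown = 3.0;

    // distance = 1: both differences (3 and 2) exceed it, so a new niche is needed.
    println!("{:?}", assign_niche(&niches, brown, 1.0)); // None

    // distance = 3: abs(5 - 3) = 2 is below it, so brown joins the black hair niche.
    println!("{:?}", assign_niche(&niches, brown, 3.0)); // Some(1)
}
```

A `None` result corresponds to the "brand new Niche" case; a real engine would then create that niche and add its score to the list of representatives.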

Often, very complex structures are being evolved through these algorithms (NEAT, for instance), and the number of Niches can explode or become stuck at 1 or 2. This is where species_target and dynamic_distance come into play. To keep the number of Niches in check, we define a target species count, say 5. This means we want there to be around 5 Niches. So if all of a sudden we have a genetic explosion where 100 Niches are created, the dynamic distance will tell the population "that's way too many, let's expand the distance so we have fewer Niches". If there are too many Niches, a solution will be hard to find because each Niche isn't permitted to evolve its structure to its optimal point. Similarly, if there are too few, we have a high risk of not finding an optimal solution because favorable mutations will be lost before they are allowed to optimize.
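A rough sketch of that feedback loop in Rust, under stated assumptions: `adjust_distance` and its fixed `step` are hypothetical and only illustrate the direction of the adjustment described above. How radiate actually updates the distance is not covered in this thread. The `species_target` name mirrors the Config field asked about in the original question.

```rust
/// Hypothetical rule (not radiate's actual code): nudge the distance cut-off
/// so that the number of niches drifts toward `species_target`.
fn adjust_distance(distance: f64, niche_count: usize, species_target: usize, step: f64) -> f64 {
    if niche_count > species_target {
        // Too many niches: widen the cut-off so more genomes share a niche.
        distance + step
    } else if niche_count < species_target {
        // Too few niches: narrow the cut-off, but never below one step.
        (distance - step).max(step)
    } else {
        // On target: leave the cut-off alone.
        distance
    }
}

fn main() {
    let species_target = 5;
    let step = 0.5;

    // A "genetic explosion" of 100 niches pushes the distance up.
    println!("{}", adjust_distance(1.0, 100, species_target, step));

    // A population stuck at 1 niche pulls the distance down.
    println!("{}", adjust_distance(1.0, 1, species_target, step));
}
```

Calling this once per generation with the current niche count would, over time, pull the population back toward roughly `species_target` Niches.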

Hopefully that helps, let me know if you have any other questions or I'm not being clear enough. Glad you're finding success!

rsdy commented 4 years ago

That's a really good explanation, thanks! 🙏