Closed mmastrac closed 2 weeks ago
I'll look into this. I see a DiscreteGenotype with a fully enumerated range of possible values would probably not work well in this case, as the mutations would jump all over the place.
Using a discrete allele was my first attempt and it took a moment to understand why I wasn't getting the right behaviour! It could also be possible to have a mutation distance there, I suppose.
My workaround for now is using floats and an explicit casting step, but there's probably a lot of extra work happening with genes that round to the same value.
Apologies for so many reports -- I found some time tonight to collect my thoughts after using it this last few weeks. The library's API is excellent and the code is very understandable.
No problem. It is great to get some feedback. I think your suggestion of a mutation distance would work best. I'll look into it.
I'm thinking; a mutation distance is basically the same as the concept of a neighbouring allele (which is implemented already in the hill-climb strategy). So you basically end up using a hill-climbing strategy instead of an evolution strategy. The initial population will behave like an evolution strategy, but it will quickly collapse to a hill-climbing strategy once the population becomes relatively uniform. Hill climbing is much faster than evolution, but much more likely to end up in a local maximum/minimum. Have you tried using the hill-climbing strategy for your use case?
See: HillClimb
Maybe there is still added value to implementing this as a mutation type, creating a hybrid evolution/hill-climb strategy. The initial evolution strategy would give a better spread of the solution space and you are less likely to end up in a local maximum/minimum and the hill-climb stage at the end would zoom in efficiently to the local solution.
What do you think of this?
Using a discrete allele was my first attempt and it took a moment to understand why I wasn't getting the right behaviour! It could also be possible to have a mutation distance there, I suppose.
I looked into the implementation and both Discrete and Continues mutate completely randomly within respectively the set or range. So I don't immediately see why the discrete approach would behave worse than the "continues bin" approach. Can you elaborate a bit on your findings here as well?
Oh! -- maybe this isn't working as expected. I was expecting that the neighbor ranges of the continuous allele were used as a maximum mutation distance, rather than the set of items chosen.
This might explain some of the results I was seeing.
My expectation from the API would be that a mutation was a distance, rather than an absolute sample. I think I must have been imagining the improvement because I can clearly see how it's not working that way :).
I think that allowing for a mutation-distance-distribution could allow for the best of both worlds -- shorter hops would work like hill-climbing, longer hops more like major mutation events.
I can ponder this a bit. In my case I've got islands of stability, so optimization needs to alternate between randomly throwing itself around the solution space and hill-climbing. It might be worth using evolve to generate candidates for hill-climbing.
I will look at adding ContinuousGenotype::\<u8> (and friends as generic) and allow the concept of distance there. In addition I think we should maybe rename DiscreteGenotype to EnumGenotype (to express there is no distance, so distance would behave as random).
I will rename the MutateOnce to MutateSingleAlleleRandom (or something like that). And align MutateTwice to MutateMultiAlleleRandom with a parameter how many. When this is done we can add MutateSingleAlleleDistance with a distance parameter (some uniform max distance of the type). This way random and distance are neatly separated.
@mmastrac do you think this would be the right way to go?
This would be fantastic! I should try creating a custom mutation for my use case to test a few of these ideas out, but I'm pretty sure that MutateSingleAlleleDistance
would give me what I need.
Also your requested Regular reporting callbacks during evolution #4 would probably by quite useful for seeing how the path to the solution behaves.
@mmastrac I added MutateSingleGeneDistance, only usable by ContinuousGenotype for now. It takes a allele_distance_range which bounds the mutation (instead of the full allele_range in MutateOnce). It is ~5x more efficient in the example/evolve_continuous (but that is a very simple example with a single very directed solution space). You can try this one out. Can you use main, or do you need a published version?
Adding ContinuousGenotype::\<u8> (and friends as generic) is quite complicated, so please stick with ContinuousGenotype with implicit f32 for now. With the proper ranges and mutation distance, this should not be very inefficient with regards to a u8 implementation.
I guess Regular reporting callbacks during evolution #4 should be next for feedback and testing purposes. As I still don't quite understand why Discrete and Continuous genotypes had such a different performance in your use case.
There was an error in the implementation. With fix it is only ~4x more efficient in example/evolve_continuous.
Released v0.9.0 which implements any num type. See CHANGELOG for the many changes which resulted from this generalization.
Notable changes for issue context:
Allele
trait: Allele: Clone + Send + Sync + PartialEq + Debug
Genotype
to Allele
for: Fitness
& EvolveReporter
ContinuousGenotype
to RangeGenotype
RangeGenotype
for numeric types (default still f32, but other float and integer types are now supported)Range
with RangeInclusive
for all ranges in RangeGenotype
in order to handle integer ranges more intuitivelyGenotype
with random, relative or scaled mutations options (relative and scaled only possible for RangeGenotype)
MutateSingleGeneRandom
to MutateSingleGene
as it just calls mutate_chromosome()
on Genotype
MutateSingleGeneDistance
as random, relative or scaled mutations are now handled by Genotype
and not the callerallele_neighbour_range
to allele_mutation_range
in RangeGenoype
to define relative mutationallele_mutation_scaled_range
to RangeGenotype
to define scaled mutationCan't wait to try it out -- hopefully soon! Thanks for all the awesome work on this library.
I've been using this library to tune constants for a connection pool and it's been quite useful. One small nitpick is that there's no concept of a continuous-but-discrete allele for integral values that may be useful in places like this.
Ideally ContinuousGenotype would take a num-like parameter like so: