erp12 / pyshgp

Push Genetic Programming in Python.
http://erp12.github.io/pyshgp
MIT License

Add max_genome_size parameter #160

Open Y1fanHE opened 3 years ago

Y1fanHE commented 3 years ago

I have tried to add a max_genome_size parameter to the variation steps. This parameter can be specified in the estimator.

I created a fix() method in the VariationOperator class, along with a produce_and_fix() method, as follows.

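def fix(self, child: Genome, max_genome_size: int) -> Genome:
    # Truncate genomes that exceed the size limit.
    if len(child) > max_genome_size:
        child = child[:max_genome_size]
    # Return the child in every case, not only when it was truncated.
    return child

def produce_and_fix(self, parents: Sequence[Genome], spawner: GeneSpawner, max_genome_size: int) -> Genome:
    # Produce a child as usual, then apply the size limit.
    return self.fix(self.produce(parents, spawner), max_genome_size)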

Inside the algorithms, you can use produce_and_fix() instead of produce(). I am not sure whether you are happy with this implementation, so please tell me if you have any suggestions.

erp12 commented 3 years ago

Thanks for taking the time to implement this feature! I appreciate you dedicating your time to this project. I think your design is great. I only have a couple of suggestions.

  1. I think it would be helpful to use a more specific name than fix, perhaps limit or truncate. Also, the "fix" method doesn't seem like it should be called outside of the VariationOperator class and its sub-classes, so it should probably be made private (i.e. _fix, _limit, _truncate, or whatever).

  2. I'm not sure there needs to be a separate method for returning fixed genomes. To address this, I would recommend doing the following:

    1. Rename the current produce method to _produce in VariationOperator and all of its sub-classes.
    2. Rename produce_and_fix to produce (that method will only be on the base class).
    3. Change the max_genome_size argument of the new produce to max_genome_size: Optional[int] = None.
    4. In the new produce method, call _produce and truncate the child genome if max_genome_size is not None.

This way, the implementations of GeneticAlgorithm and SimulatedAnnealing can always call produce, regardless of whether a max genome size is provided.
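
The four steps above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code that was merged: Genome is stood in for by a plain list, GeneSpawner is an empty placeholder, and the Concatenate operator and the _truncate name are hypothetical, chosen only to show how the public produce wraps the private _produce hook.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence

# Stand-in for pyshgp's Genome type (assumption: any sliceable sequence of genes).
Genome = List[str]


class GeneSpawner:
    """Placeholder for the real spawner, which generates random genes."""


class VariationOperator(ABC):
    """Base class whose public produce() wraps the sub-class hook _produce()."""

    def produce(
        self,
        parents: Sequence[Genome],
        spawner: GeneSpawner,
        max_genome_size: Optional[int] = None,
    ) -> Genome:
        child = self._produce(parents, spawner)
        # Only truncate when a limit was actually supplied.
        if max_genome_size is not None:
            child = self._truncate(child, max_genome_size)
        return child

    @staticmethod
    def _truncate(child: Genome, max_genome_size: int) -> Genome:
        # Slicing is a no-op when the child is already within the limit.
        return child[:max_genome_size]

    @abstractmethod
    def _produce(self, parents: Sequence[Genome], spawner: GeneSpawner) -> Genome:
        """Sub-classes implement the actual variation here."""


class Concatenate(VariationOperator):
    """Hypothetical operator that joins all parents end to end."""

    def _produce(self, parents: Sequence[Genome], spawner: GeneSpawner) -> Genome:
        return [gene for parent in parents for gene in parent]


op = Concatenate()
parents = [["a", "b", "c"], ["d", "e"]]
unlimited = op.produce(parents, GeneSpawner())                  # no limit given
limited = op.produce(parents, GeneSpawner(), max_genome_size=3)  # capped at 3 genes
```

With this shape, callers never need to know whether a limit is configured: they pass max_genome_size through (possibly as None) and the base class decides whether to truncate.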


It looks like this PR contains the commits from your last PR. Those have already been squashed and merged into master. Can you drop them from this PR and update your branch to the latest version of master?

As an aside, it is often easiest to do the work for each PR on a separate branch. I recommend this guide.

Y1fanHE commented 3 years ago

I have made the updates based on your comments and opened a new PR, #161.