ahmedfgad / GeneticAlgorithmPython

Source code of PyGAD, a Python 3 library for building the genetic algorithm and training machine learning algorithms (Keras & PyTorch).
https://pygad.readthedocs.io
BSD 3-Clause "New" or "Revised" License
1.79k stars 451 forks source link

Additive random mutation with `gene_space` `low` and `high` #229

Open theo-brown opened 10 months ago

theo-brown commented 10 months ago

In the docs, it says that:

If a gene has its static space defined in the gene_space parameter, then mutation works by replacing the gene value by a value randomly selected from the gene space.

However, if a gene_space is defined as a dict with low and high parameters, it should be possible to generate additive mutations, so that new_value = old_value + random_value, rather than new_value = random_value.

I had a brief look in the source code, and this wasn't supported at the moment.

theo-brown commented 10 months ago

My default approach to this would be to use rejection sampling and Gaussian noise, but I'm not sure how well it aligns with the methods currently used for additive random mutations in PyGAD, which seem to rely on uniform sampling from a fixed range.

I suggest:

  1. Add new parameter, mutation_noise_variance
  2. Generate noise ~ N(0, 1)
  3. Set candidate_value = old_value + mutation_noise_variance*noise
  4. Set new_value = candidate_value if low < candidate_value < high else repeat from 2

Let me know if you'd be happy with this and I'll submit a PR.

Example code:

def mutate(offspring: np.ndarray, ga_instance: pygad.GA) -> np.ndarray:
    """Apply random Gaussian noise to a proportion of genes in the offspring.

    Parameters
    ----------
    offspring : np.ndarray
        Array of shape (sol_per_pop, num_genes) containing the offspring.
    ga_instance : pygad.GA
        Instance of the GA class
    """
    # Convert the bounds to numpy arrays
    lower_bounds = np.tile(
        np.array([b["low"] for b in ga_instance.gene_space]), (offspring.shape[0], 1)
    )
    upper_bounds = np.tile(
        np.array([b["high"] for g in ga_instance.gene_space]), (offspring.shape[0], 1)
    )

    # Select the genes to mutate
    # This generates a boolean array of the same shape as offspring, with True for each gene that will be mutated
    genes_to_mutate = rng.random(offspring.shape) < ga_instance.mutation_probability
    print("Total genes to mutate: ", np.sum(genes_to_mutate))

    # Generate random values to add to the selected genes
    # Use rejection sampling to ensure that the mutated genes remain within the bounds
    remaining_genes_to_mutate = genes_to_mutate[:]
    while np.any(remaining_genes_to_mutate):
        print("Remaining genes to mutate: ", np.sum(remaining_genes_to_mutate))
        # Generate candidate mutated values
        # This generates an array of floats with the same shape as offspring
        candidate_values = rng.normal(
            loc=offspring, scale=ga_instance.mutation_noise_variance, size=offspring.shape
        )
        # Accept only the candidates that are within the bounds
        candidate_is_valid = (candidate_values >= lower_bounds) & (
            candidate_values <= upper_bounds
        )
        # Accept only the candidates that are yet to be accepted
        candidate_is_valid &= remaining_genes_to_mutate
        # Update the remaining genes to mutate
        remaining_genes_to_mutate &= ~candidate_is_valid
        # Update the offspring array
        offspring = np.where(candidate_is_valid, candidate_values, offspring)

    return offspring
ahmedfgad commented 5 months ago

You are right. But this should be supported without adding additional parameters. The 2 parameters:

  1. random_mutation_min_val
  2. random_mutation_max_val

should be used to generate the random value.