Meaning of `expected_num_mutations` in segmented rep

SigmaX commented 10 months ago

I've committed a check that raises an exception if a user passes a value for expected_num_mutations that is greater than the length of the genome. This gives users better feedback.
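For readers without the commit at hand, here is a minimal sketch of what such a check might look like. The helper name `mutation_probability` and the error message are assumptions made for illustration. They are not taken from the LEAP source.

```python
def mutation_probability(expected_num_mutations, genome_length):
    """Convert an expected mutation count into a per-gene probability.

    Hypothetical helper for illustration; not LEAP's actual implementation.
    """
    if expected_num_mutations > genome_length:
        # More expected mutations than genes would imply a probability > 1.
        raise ValueError(
            f"expected_num_mutations ({expected_num_mutations}) must not exceed "
            f"the genome length ({genome_length})."
        )
    return expected_num_mutations / genome_length


print(mutation_probability(1, 4))  # 0.25
```

The open question in this issue is which length should be passed as `genome_length` when the genome is segmented.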

But this causes a test failure for segmented rep, where a test passes in a value of 4 for a genome with a flattened length of 4, but 2 segments:

def test_apply_mutation():
    """Applying segment-wise mutation operators with expected_num_mutations=len(genome) should
    result in every gene of every segment being mutated."""
    mutation_op = apply_mutation(mutator=genome_mutate_bitflip,
                                 expected_num_mutations=4)
    original = Individual([np.array([0, 0]), np.array([1, 1])])
    mutated = next(mutation_op(iter([original])))

    assert np.all(mutated.genome[0] == [1, 1]) \
           and np.all(mutated.genome[1] == [0, 0])

The logic of apply_mutation() is such that neither 2 nor 4 makes sense in this test.

SigmaX commented 10 months ago

The deep bug is fixed: we were computing a mutation probability from expected_num_mutations but passing it to the expected_num_mutations argument of the wrapped mutator (instead of its probability argument).
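A toy reconstruction of that mix-up, under stated assumptions: `effective_gene_probability`, `buggy_wrapper`, and `fixed_wrapper` are hypothetical stand-ins, not LEAP functions, and the nested mutator is assumed to accept either an expected count or a probability. It only traces which per-gene probability each variant would end up using.

```python
def effective_gene_probability(segment, expected_num_mutations=None, probability=None):
    """Hypothetical nested mutator: report the per-gene probability it would use."""
    if probability is not None:
        return probability
    return expected_num_mutations / len(segment)


def buggy_wrapper(genome, expected_num_mutations):
    # len(genome) is the number of segments, not the number of genes.
    p = expected_num_mutations / len(genome)
    # Bug: p is already a probability, but it is passed on as an expected count,
    # so the nested mutator divides it by the segment length a second time.
    return [effective_gene_probability(seg, expected_num_mutations=p) for seg in genome]


def fixed_wrapper(genome, expected_num_mutations):
    p = expected_num_mutations / len(genome)
    # Fix: pass p through the probability argument instead.
    return [effective_gene_probability(seg, probability=p) for seg in genome]


genome = [[0, 0], [1, 1]]  # same shape as the unit test: 2 segments of 2 genes
buggy = buggy_wrapper(genome, expected_num_mutations=4)
fixed = fixed_wrapper(genome, expected_num_mutations=4)
print(buggy)  # [1.0, 1.0]
print(fixed)  # [2.0, 2.0]
```

In this toy model the double division happens to yield a per-gene probability of 1.0 for the test's inputs, while the corrected path yields 2.0, which is not a valid probability.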

But we still have a very confusing interface: what does expected_num_mutations mean on an operator (apply_mutation()) that acts at the level of segments?

The docstring on the unit test above is written as if to imply that it calculates a mutation probability of expected_num_mutations/L where L is the length of a flattened genome.
But in actuality apply_mutation() computes p = expected_num_mutations/len(individual.genome), i.e. with L equal to the number of segments.

It is a coincidence that in our unit test both interpretations give L = 2, so the unit test does not disambiguate the meaning of the parameter.
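A small hypothetical genome (not the one from the unit test) shows how far apart the two readings of L can be. The variable names are chosen for illustration only.

```python
# Hypothetical genome: three segments of two genes each.
genome = [[0, 0], [1, 1], [0, 1]]
expected_num_mutations = 3

flattened_length = sum(len(segment) for segment in genome)  # 6 genes in total
num_segments = len(genome)                                  # 3 segments

# Reading implied by the test docstring: L is the flattened genome length.
p_flattened = expected_num_mutations / flattened_length
# Reading described for apply_mutation(): L is the number of segments.
p_segments = expected_num_mutations / num_segments

print(p_flattened)  # 0.5
print(p_segments)   # 1.0
```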

To me, neither interpretation makes sense. We don't generally want to specify mutation rates that are relative to the number of segments. Passing arguments like expected_num_mutations through to the wrapped operators is also mind-rending.

Some history:

- #209 dealt with issues that arise from partial function applications with nested mutation operators.
- #96 proposes removing the pattern of passing parameters into apply_mutate() altogether.

SigmaX commented 10 months ago

Hypothesis: it's as simple as following the suggestion to remove the expected_num_mutations parameter from apply_mutation().

Will this work?

1. Any complications in ensuring that mutation parameters such as expected_num_mutations are applied dynamically to segments of different sizes?
   - A: No. Nested mutators can compute their own mutation rates as a function of their segments' lengths (which is what expected_num_mutations normally does for a mutator).
2. Any complications with the currying or closures that would be required for nested mutators to set their mutation rates?
   - A: Not clear to me. I'll prototype it and see if any problems arise.
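To make the idea concrete, here is a rough standard-library sketch of a nested mutator that fixes its own expected_num_mutations through a closure and derives the rate from each segment's length. `make_bitflip_mutator` and `apply_to_segments` are hypothetical names invented for this sketch. They do not reproduce LEAP's API.

```python
import random


def make_bitflip_mutator(expected_num_mutations):
    """Return a segment mutator with its expected mutation count baked in.

    Hypothetical stand-in for a curried genome mutator.
    """
    def mutate(segment):
        # The rate is computed per segment, so segments of different sizes
        # get different per-gene probabilities from the same expected count.
        p = expected_num_mutations / len(segment)
        if p > 1:
            raise ValueError("expected_num_mutations exceeds the segment length")
        return [1 - gene if random.random() < p else gene for gene in segment]
    return mutate


def apply_to_segments(genome, mutator):
    """Segment-level operator that takes no mutation-rate parameter of its own."""
    return [mutator(segment) for segment in genome]


genome = [[0, 0], [1, 1, 1, 1]]  # segments of different sizes
mutated = apply_to_segments(genome, make_bitflip_mutator(expected_num_mutations=2))
print(mutated[0])  # [1, 1] because p = 2/2 = 1.0 flips every gene
print(mutated[1])  # random: each gene flips with p = 2/4 = 0.5
```

In this arrangement the segment-level operator never needs to know about mutation rates, which is the simplification the hypothesis above is aiming for.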

SigmaX commented 10 months ago

Looking good w.r.t. (2). Our segmented_representations.ipynb notebook already had an example that uses currying to set the std parameter of a nested mutator. Handling mutation rates works the exact same way.

original = Individual(np.array([[0.0,0.0],[1.0,1.0],[-1.0,0.0]]))
print('original:', original)
mutated = next(apply_mutation(iter([original]),
                              mutator=genome_mutate_gaussian(std=1.0, expected_num_mutations=1.5)
                             )
              )
print('mutated:', mutated)

AureumChaos / LEAP

Meaning of `expected_num_mutations` in segmented rep #315
