Akiira / Comp-524-program

A genetic algorithm to create test suites that cover every branch of a program
2 stars 0 forks source link

Fitness scaling #10

Closed ambarket closed 9 years ago

ambarket commented 9 years ago

I guess this still needs some work. I changed things so that scalePopulationsFitness is never called if SCALING is set to NONE. Also I was testing with LINEAR scaling and noticed negative fitness values are possible, which break our selection.

Akiira commented 9 years ago

I fixed the negative values for linear scaling. However, I don't think its a good idea to not call the scalePopulationFitness function if no scaling is being used. Even if scaling is not being used most functions will be calling getScaledFitness to get the fitness value so these need to be set with the original fitness values. This is best done in the scale fitness function. It also makes the code more consistent and easier to read.

I agree though, that we should remove the loop from scaling function when the type is none. I'm not to worried about doing it right away because we are still coding and not focusing on optimization but perhaps we can just pass in the difference of the fitness of the organism being replaced. If scaling is being used this will be ignored, otherwise it will set total fitness += this value.

Akiira commented 9 years ago

Looking over the linear scaling, it can only be negative if the min fitness divided by the population is greater then 1, and this multiplied by the old fitness is greater then the max fitness. I'm surprised this happened with our very large population and small fitness. If you merge in the solution for it into your branch make sure you double check that the negative values are gone, to rule out it being something else.

ambarket commented 9 years ago

Here I was trying to not have to loop through and recalculate total fitness from scratch every time replacement is done if no scaling is being used. Also I was confused why we have a scaled fitness attribute at all. Why not just change the fitness value since the original will never be used. On Mar 18, 2015 11:28 AM, "Akiira" notifications@github.com wrote:

Looking over the linear scaling, it can only be negative if the min fitness divided by the population is greater then 1, and this multiplied by the old fitness is greater then the max fitness. I'm surprised this happened with our very large population and small fitness. If you merge in the solution for it into your branch make sure you double check that the negative values are gone, to rule out it being something else.

— Reply to this email directly or view it on GitHub https://github.com/Akiira/Comp-524-program/issues/10#issuecomment-83021518 .

ambarket commented 9 years ago

The negative issue seems to be fixed now. It occurred as I was testing the different suite size generation with minimum population sizes.

As the algorithm converges, this linear scaling causes the total fitness to drop to zero or very close to it. Which makes the selection occur randomly. Maybe this is what we cant because at that point there's no point in preferring one over the other. With small population size of 100 or so all the organisms get the same fitness pretty much from the start.

Also just found this, it appears the best and worst are switched. In case this doesn't always happen here's the parameters used to get this

Simulation* hiLoSim = new Simulation(1000, 10, 100, 2, .02, 1, 500000);

Generation # 13568 TotalFitness: 205994 Best Fitness: 1619 Best Fitness Scaled: 0 Worst Fitness: 1359 Worst Fitness Scaled: 260 Difference between best and worst: 260 Difference between best and worst Scaled: -260

Also in this case, they are scaling to the same value even though worst is less than best. Here's the parameters that should reproduce it.

Simulation* hiLoSim = new Simulation(10000, 10, 15, 2, .02, 1, 500000);

Generation # 83555 TotalFitness: 2330000 Best Fitness: 233 Best Fitness Scaled: 233 Worst Fitness: 154 Worst Fitness Scaled: 233 Difference between best and worst: 79 Difference between best and worst Scaled: 0

Akiira commented 9 years ago

The original value for fitness will be reused. When using fitness scaling the entire populations fitness needs to be rescaled every time a new organism is added. To do this rescaling we need to save its original fitness value.

Akiira commented 9 years ago

So from what i can gather the algorithm is not taking into account population size. This algorithm was probably meant to be used on different types of problems. When I have more time I will look into it more, I have a midterm at 6. I threw those in there without much research so its good to get rid of em now if they are bad. I will read through that paper Bui posted that discusses fitness scaling and try to implement that. Hopefully it will be more robust when it comes to our population sizes and such.

On the other issue though, I still think scaleFitness should be called no matter the scaling type. It makes the code more consistent and easier to read, and I suggested how to remove the loop in a previous post, though im sure there are many other ways as well.

ambarket commented 9 years ago

I guess I just don't like when a single method handles more than one use case and has arguments that aren't always used and dramatically change how it functions. Nevertheless I do agree that it's cleaner to always call it, and hopefully if we find a good scaling algorithm we will always use it so these alternative cases won't even exist.

Nevertheless I changed scalePopulationFitness to take both a pointer to the newOrganism and the replaced organism's fitness value as arguments. This way it can efficiently update the totalFitness and newOrganism's scaledFitness attribute in the case that fitness scaling is not being used.

Akiira commented 9 years ago

I removed the arguments from the scaling function and renamed it to updatePopulationFitness. It should now satisfy both our requirements of focusing on one thing, no unused arguments, and still have consistent clean code around where it is called.

Akiira commented 9 years ago

Also linear scaling is fixed now. The actual reason why it wasn't working was because it was called before the population array was sorted, this gave it incorrect values for max and min, thus the negative values.

ambarket commented 9 years ago

I think this has been resolved for a while now.