Open dts333 opened 6 years ago
Couple of things to check first.
Are you sure you are doing maximization?
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
Is your evaluation deterministic?
Check and check. I named "FitnessMax" something else, but I assume that doesn't matter. The evaluation is deterministic: toolbox.evaluate(hof[0])
gives 0.45 no matter how many times I run it
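(For anyone following along: a determinism check of this kind can be sketched in a few lines. The names below are hypothetical stand-ins for the thread's toolbox.evaluate and hof[0].)

```python
# Sketch of a determinism check: call the evaluation several times on
# the same individual and confirm every result is identical.
def evaluate(ind):
    # hypothetical deterministic toy fitness: sum of squares, as a 1-tuple
    return (sum(x * x for x in ind),)

best = [0.3, 0.6]  # stands in for hof[0]
results = [evaluate(best) for _ in range(10)]
assert all(r == results[0] for r in results)  # all calls agree
```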
What are your primitives?
def protectedDiv(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return 1
pset = gp.PrimitiveSet("Main", arity=5)
pset.addPrimitive(operator.add, 2)
pset.addPrimitive(operator.sub, 2)
pset.addPrimitive(operator.mul, 2)
pset.addPrimitive(protectedDiv, 2)
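(Side note for readers: one subtlety worth checking with this kind of protected division is that the ZeroDivisionError guard only fires for plain Python numbers. If the operands are numpy float scalars, which is what pandas .iat typically returns from a float column, division by zero produces inf with a RuntimeWarning instead of raising, so the except branch is never taken. A minimal sketch of the difference:)

```python
import math
import numpy as np

# The protected division from the thread, verbatim:
def protectedDiv(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return 1

# With plain Python floats, the guard works as intended:
assert protectedDiv(1.0, 0.0) == 1

# With numpy float scalars, 1.0 / 0.0 yields inf (with a RuntimeWarning)
# rather than raising, so the except branch never fires:
with np.errstate(divide="ignore"):
    result = protectedDiv(np.float64(1.0), np.float64(0.0))
assert math.isinf(result)
```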
Since everything appears to be standard so far, I am afraid that you will have to either share your code or submit a toy version of it that presents the same issue. Otherwise, we currently do not have enough information to correctly pinpoint the source of the problem.
Can you provide a simplified version of your evaluation function?
Here's the evaluation function. Each individual has five genes, each of which contains one tree. It's supposed to look at five data points, each over ten timepoints, for a number of data sets, and then predict which data set will increase the most on the next timepoint.
def evaluate(individual, data):
    func0 = toolbox.compile(expr=gp.PrimitiveTree(individual[0][0]))
    func1 = toolbox.compile(expr=gp.PrimitiveTree(individual[1][0]))
    func2 = toolbox.compile(expr=gp.PrimitiveTree(individual[2][0]))
    func3 = toolbox.compile(expr=gp.PrimitiveTree(individual[3][0]))
    func4 = toolbox.compile(expr=gp.PrimitiveTree(individual[4][0]))
    funcs = [func0, func1, func2, func3, func4]
    def eval(x):
        score = 0
        for i in range(5):
            gene = individual[i]
            score += funcs[i](*(x.iat[5 * gene[j + 1][0] + gene[j + 1][1]] for j in range(5)))
        return score
    scores = pandas.DataFrame(data.apply(eval, axis=1))
    scores.columns = ['scores']
    top_scores = scores.groupby('timestamp')['scores'].transform(lambda x: x == x.max())
    fitness = TrainingResultData.loc(axis=0)[top_scores.values]
    fitness = fitness.product()
    return (fitness,)
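(For context, the symptom described in this thread, a stored fitness that disagrees with a fresh evaluation, classically arises when an operator modifies an individual in place without invalidating its cached fitness. A plain-Python sketch, no DEAP, all names hypothetical:)

```python
# Sketch of a "stale fitness": an in-place change after evaluation
# leaves the stored value out of sync with a fresh evaluation.
def evaluate(ind):
    return (sum(ind),)

ind = [1, 2, 3]
stored = evaluate(ind)   # (6,) -- imagine this cached as ind.fitness
ind[0] = 100             # in-place mutation; the cache is not invalidated
fresh = evaluate(ind)    # (105,)
assert stored != fresh   # the cached fitness is now stale
```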
I am running into a very similar issue: having run the optimization using
pop, logbook = algorithms.eaMuPlusLambda(...)
I am finding a notable discrepancy between the fitness values calculated using toolbox.map and toolbox.evaluate manually iterated over the population:
pop_fitnesses_tbmap = np.array([pf[0] for pf in toolbox.map(toolbox.evaluate, pop)])
pop_fitnesses_tbeval = np.array([toolbox.evaluate(ind)[0] for ind in pop])
pop_fitnesses_direct = np.array([direct_eval(ind) for ind in pop])
The first line gives numbers that are nothing like the second and the third, which are identical to each other, as expected (the direct_eval function is registered via toolbox.register("evaluate", direct_eval)):
print(np.sum(np.abs(pop_fitnesses_tbmap-pop_fitnesses_tbeval)))
15604327.392578125
print(np.sum(np.abs(pop_fitnesses_direct - pop_fitnesses_tbeval)))
0.0
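(One way two passes over the same population can disagree like this is if the evaluation has a side effect on shared state, making the result depend on call order and call count. A toy sketch with hypothetical names, not the poster's actual code:)

```python
# Toy sketch: an evaluation that mutates shared state is order- and
# call-count-dependent, so two passes over the same population
# produce different numbers.
state = {"calls": 0}

def evaluate(ind):
    state["calls"] += 1                 # hidden side effect
    return sum(ind) + state["calls"]

pop = [[1], [2], [3]]
first_pass = [evaluate(ind) for ind in pop]   # [2, 4, 6]
second_pass = [evaluate(ind) for ind in pop]  # [5, 7, 9]
assert first_pass != second_pass
```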
Hi, I'm fairly new with this so apologies in advance.
So I ran
with natural fitness, and the log showed the fitness maxing out at 7.90 in generation 13 before dropping back to 6.37. Afterwards, I ran
evaluate(hof[0], TrainingData)
and got 0.45. I got the same results from toolbox.evaluate(hof[0]). However,
hof[0].fitness.values
still reads 6.37. I've tried this multiple times, so I don't think the issue is any of the pandas code in my evaluate function altering data frames. I also added cloning to each of my mutators in case some of them were altering the individuals after their fitness had been recorded, but since cloning is already in varAnd, all this did, as you might expect, was make it run slower. I've been looking for the cause all day and I'm running out of ideas, so any help would be appreciated. This is a genetic programming project, in case that's relevant.