CodeReclaimers / neat-python

Python implementation of the NEAT neuroevolution algorithm
BSD 3-Clause "New" or "Revised" License

Implementing NEAT to retrain after each prediction #147

Open Kuselokusi opened 5 years ago

Kuselokusi commented 5 years ago

Hi, I would like to understand whether it is possible to set up neat-python, through the configuration file, to retrain after each prediction on the test/unseen set. For instance, as I understand it, the XOR "evolve-minimal" example can be adjusted so that it trains on part of the data (to a particular fitness level, yielding the best genome) and then predicts on the remaining data that was set aside as a test set. See the code below for what I mean:

from __future__ import division, print_function
import neat
import visualize

# Training set: 3-input examples and expected outputs
xor_inputs = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
xor_outputs = [(1.0,), (1.0,), (1.0,), (0.0,), (0.0,)]

# Test set
xor_inputs2 = [(1.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
xor_outputs2 = [(1.0,), (0.0,), (0.0,)]

def eval_genomes(genomes, config):
  for genome_id, genome in genomes:
    # Start at the maximum possible fitness (one point per training example)
    # and subtract the squared error on each example.
    genome.fitness = 5
    net = neat.nn.FeedForwardNetwork.create(genome, config)
    for xi, xo in zip(xor_inputs, xor_outputs):
      output = net.activate(xi)
      genome.fitness -= (output[0] - xo[0]) ** 2

# Load configuration.
config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward')

# Create the population, which is the top-level object for a NEAT run.
p = neat.Population(config)

# Add a stdout reporter to show progress in the terminal.
p.add_reporter(neat.StdOutReporter(True))
stats = neat.StatisticsReporter()
p.add_reporter(stats)

# Run until a solution is found.
winner = p.run(eval_genomes) 

# Display the winning genome.
print('\nBest genome:\n{!s}'.format(winner))

# Show output of the most fit genome against the test data.
print('\nOutput:')
winner_net = neat.nn.FeedForwardNetwork.create(winner, config)
count = 0

# Make predictions on the test set using the best genome.
for xi, xo in zip(xor_inputs2, xor_outputs2):
  prediction = winner_net.activate(xi)
  print("  input {!r}, expected output {!r}, got {!r}".format(
      xi, xo[0], round(prediction[0])))
  # Count correct classifications to compute accuracy.
  if int(xo[0]) == int(round(prediction[0])):
    count = count + 1
accuracy = count / len(xor_outputs2)
print('\nAccuracy: ', accuracy)

node_names = {-1: 'A', -2: 'B', -3: 'C', 0: 'Output'}
visualize.draw_net(config, winner, True, node_names=node_names)
visualize.plot_stats(stats, ylog=False, view=True)
visualize.plot_species(stats, view=True)

The config file is:

#--- parameters for the XOR-2 experiment ---#

[NEAT]
fitness_criterion     = max
fitness_threshold     = 4.8
pop_size              = 150
reset_on_extinction   = True

[DefaultGenome]
# node activation options
activation_default      = sigmoid
activation_mutate_rate  = 0.0
activation_options      = sigmoid

# node aggregation options
aggregation_default     = sum
aggregation_mutate_rate = 0.0
aggregation_options     = sum

# node bias options
bias_init_mean          = 0.0
bias_init_stdev         = 1.0
bias_max_value          = 30.0
bias_min_value          = -30.0
bias_mutate_power       = 0.5
bias_mutate_rate        = 0.7
bias_replace_rate       = 0.1

# genome compatibility options
compatibility_disjoint_coefficient = 1.0
compatibility_weight_coefficient   = 0.5

# connection add/remove rates
conn_add_prob           = 0.5
conn_delete_prob        = 0.5

# connection enable options
enabled_default         = True
enabled_mutate_rate     = 0.01

feed_forward            = True
initial_connection      = full_direct

# node add/remove rates
node_add_prob           = 0.2
node_delete_prob        = 0.2

# network parameters
num_hidden              = 0
num_inputs              = 3
num_outputs             = 1

# node response options
response_init_mean      = 1.0
response_init_stdev     = 0.0
response_max_value      = 30.0
response_min_value      = -30.0
response_mutate_power   = 0.0
response_mutate_rate    = 0.0
response_replace_rate   = 0.0

# connection weight options
weight_init_mean        = 0.0
weight_init_stdev       = 1.0
weight_max_value        = 30
weight_min_value        = -30
weight_mutate_power     = 0.5
weight_mutate_rate      = 0.8
weight_replace_rate     = 0.1

[DefaultSpeciesSet]
compatibility_threshold = 3.0

[DefaultStagnation]
species_fitness_func = max
max_stagnation       = 20
species_elitism      = 2

[DefaultReproduction]
elitism            = 2
survival_threshold = 0.2

However, the issue here is that no retraining takes place after each prediction on the test set. I believe the parameters in the config file are static and cannot change once the training process begins, so if the fitness level is based on the number of correct classifications of the training set (which is what I'm trying to implement, very similar to the one used here), this will be a problem. I would therefore like to understand whether a model that retrains can be implemented by adjusting a setting in the config file, or is there more to it than that?
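
To make the question concrete, here is a rough, self-contained sketch of the kind of retraining loop I have in mind, written by hand since I don't see a config option for it. The classification-count fitness, the make_eval_genomes helper, and the generation limits (300 and 10) are just my own choices for illustration, and I'm not sure that repeatedly calling p.run() on the same population is the intended way to continue evolution:

from __future__ import division, print_function
import neat

# Same data as above: training set...
xor_inputs = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
xor_outputs = [(1.0,), (1.0,), (1.0,), (0.0,), (0.0,)]
# ...and test set.
xor_inputs2 = [(1.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
xor_outputs2 = [(1.0,), (0.0,), (0.0,)]

def make_eval_genomes(inputs, outputs):
  """Build an eval function whose fitness is the number of correct
  classifications on the given data (the fitness I actually want to use)."""
  def eval_genomes(genomes, config):
    for genome_id, genome in genomes:
      net = neat.nn.FeedForwardNetwork.create(genome, config)
      correct = 0
      for xi, xo in zip(inputs, outputs):
        output = net.activate(xi)
        if int(round(output[0])) == int(xo[0]):
          correct += 1
      genome.fitness = correct
  return eval_genomes

config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                     neat.DefaultSpeciesSet, neat.DefaultStagnation,
                     'config-feedforward')
p = neat.Population(config)
p.add_reporter(neat.StdOutReporter(True))

# Growing training set: starts as the original training data.
train_inputs = list(xor_inputs)
train_outputs = list(xor_outputs)

# Initial training run, capped at 300 generations. (With fitness_threshold=4.8
# and fitness_criterion=max in the config above, it also stops once all five
# training examples are classified correctly.)
winner = p.run(make_eval_genomes(train_inputs, train_outputs), 300)

count = 0
for xi, xo in zip(xor_inputs2, xor_outputs2):
  # Predict with the current best genome.
  winner_net = neat.nn.FeedForwardNetwork.create(winner, config)
  prediction = winner_net.activate(xi)
  print("  input {!r}, expected output {!r}, got {!r}".format(
      xi, xo[0], round(prediction[0])))
  if int(xo[0]) == int(round(prediction[0])):
    count += 1

  # "Retrain": add the example just seen to the training data and keep
  # evolving the same population for a few more generations. I'm assuming
  # that calling p.run() again like this is the right way to continue an
  # existing run, and that fitness_threshold would need to be raised (or
  # no_fitness_termination set, if the installed version supports it) so
  # these short runs don't terminate immediately.
  train_inputs.append(xi)
  train_outputs.append(xo)
  winner = p.run(make_eval_genomes(train_inputs, train_outputs), 10)

print('\nAccuracy:', count / len(xor_outputs2))

Even if something like this works, I'd still like to know whether any part of it can be expressed through the config file instead, or whether the config is only read once at startup.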