Closed yamatakeru closed 2 years ago
Hi,
Stanley et al. state: "By convention, a connection is not expressed if the magnitude of its weight, which may be positive or negative, is below a minimal threshold wmin". This threshold is defined as 0.2 (on both the negative and positive side), and 'stronger' weights are multiplied by the maximum magnitude (default 5.0), as specified by Stanley et al. Why do you think the range [-0.2, 0.2] should be included?
Thank you for your reply :-)
Sorry, I didn't explain it well enough. I do not disagree that "By convention, a connection is not expressed if the magnitude of its weight, which may be positive or negative, is below a minimal threshold wmin".
My suggestion is that the expressed connection weights above the threshold should be scaled to lie between zero and a maximum magnitude. In the current implementation, we cannot express a connection whose weight magnitude falls below wmin, so the scaled weights have a dead zone around zero.
I see your point now - and it's valid :) I'll merge your PR if you create one, scaling the weight properly.
Actually, isn't the function returning wrong weights entirely? The absolute value of the weight isn't being used when multiplying by max_weight. I mean, Stanley et al. state "... scaled to be between zero and a maximum magnitude ..."; in our implementation it's scaled to [-max_weight, max_weight] with the dead space in the middle which you found.
I think that "magnitude" in "... scaled to be between zero and a maximum magnitude ..." denotes the distance from 0.0 (the absolute value). In other words, perhaps the correct range of returned weights is [-max_weight, max_weight].
As a practical matter, Stanley's HyperNEAT users page (http://eplex.cs.ucf.edu/hyperNEATpage/) describes the expression of connection weights as follows.
In HyperNEAT it is conventional not to express a connection whose weight magnitude (output by the CPPN) is below some threshold. For example, the threshold might be 0.2, which would mean no connection is expressed with a weight between [-0.2..0.2]. For any connection that is above this magnitude (and therefore expressed), its weight is scaled to a range. For example, a reasonable range is [-3..3]. The question is why this cutting and scaling are done.
I see your point. It's just a weird way to express it. Anyway - I'm up for changing it to a continuous weight distribution from [-magnitude, magnitude]!
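A minimal sketch of such a continuous rescaling (the function name and defaults here are illustrative assumptions, not the actual hyperneat.py code) might look like this:

```python
import math


def scale_weight(w, w_min=0.2, max_weight=5.0):
    """Hypothetical rescaling sketch: CPPN outputs with magnitude below
    w_min express no connection; magnitudes in [w_min, 1.0] are mapped
    continuously onto (0.0, max_weight], preserving sign, so the result
    covers [-max_weight, max_weight] with no gap around the threshold."""
    if abs(w) < w_min:
        return 0.0
    scaled = (abs(w) - w_min) / (1.0 - w_min) * max_weight
    return math.copysign(scaled, w)
```

With these defaults, the division by 1.0 - w_min is the division by 0.8 mentioned later in the thread.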
I have just created a PR :-) Could you merge it if the implementation looks fine?
Thanks! Will the incoming w always be between 0 and 1, since you're dividing by 0.8?
> Will the incoming w always be between 0 and 1, since you're dividing by 0.8?
Yes, that's right!
...But I just realized it relies on the assumption that the output range of the node's activation function doesn't exceed [-1.0, 1.0]. For example, w is not always between 0 and 1 when the node's activation function mutates to identity, abs, and so on.
In addition, the HyperNEAT users page (http://eplex.cs.ucf.edu/hyperNEATpage/) says the following:
However, that leaves the output ranges [0.2..1] and [-0.2..-1] as expressed weights, because the CPPN only outputs numbers between -1 and 1.
Is the condition always met in the current implementation?
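To illustrate the concern (the values and the helper below are assumptions for illustration, not code from the repository): with a bounded activation such as tanh the rescaled weight stays within max_weight, while an unbounded activation such as identity can push it past that bound for the same pre-activation value.

```python
import math

W_MIN = 0.2
MAX_WEIGHT = 5.0


def rescale(w):
    # The rescaling under discussion: map magnitude in [W_MIN, 1.0]
    # onto (0.0, MAX_WEIGHT], i.e. divide by (1.0 - W_MIN) = 0.8.
    return math.copysign((abs(w) - W_MIN) / (1.0 - W_MIN) * MAX_WEIGHT, w)


pre_activation = 2.5
bounded = rescale(math.tanh(pre_activation))  # |tanh(x)| < 1, so the weight stays within 5.0
unbounded = rescale(pre_activation)           # identity output > 1, so the weight exceeds 5.0
```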
Uhm, I wouldn't say so. It outputs between [-5, 5], or whatever max_weight is set to. But your quoted sentence comes just before this: "So we renormalize the range to [-3..3].", where the range [-3..3] is just arbitrarily mentioned as a good range to use. So it's not restricted to outputting between [-1, 1].
Sorry, my explanation was lacking :-( My concern is that the absolute value of a CPPN weight may exceed max_weight when the absolute value of the CPPN output exceeds 1.0, which can happen when the CPPN output node has abs, linear, and so on as its activation function. (Here I am distinguishing between the CPPN output range and the CPPN weight range as in Table 1 on page 36 of http://axon.cs.byu.edu/Dan/778/papers/NeuroEvolution/stanley3**.pdf.)
If you've got an idea on how to make sure that condition holds, please just leave a PR - I'll merge it.
I have two ideas. However, I don't know how to solve the problem using only neat-python's built-in functions, so these ideas may not be the best way. If the solutions below look fine, I can make a PR from either idea in no time.
One of the ideas is to provide a function, based on neat-python's nn.FeedForwardNetwork.create(), to create a CPPN as follows. This function changes the gene of the output node to represent an arbitrary activation function.
```python
import neat
from neat.graphs import feed_forward_layers


def create_cppn(genome, config, output_activation_function="tanh"):
    """ Receives a genome and returns its phenotype (a FeedForwardNetwork). """
    # Gather expressed connections.
    connections = [cg.key for cg in genome.connections.values() if cg.enabled]

    layers = feed_forward_layers(config.genome_config.input_keys,
                                 config.genome_config.output_keys, connections)
    node_evals = []
    for layer in layers:
        for node in layer:
            inputs = []
            node_expr = []  # currently unused
            for conn_key in connections:
                inode, onode = conn_key
                if onode == node:
                    cg = genome.connections[conn_key]
                    inputs.append((inode, cg.weight))
                    node_expr.append("v[{}] * {:.7e}".format(inode, cg.weight))

            ng = genome.nodes[node]
            aggregation_function = config.genome_config.aggregation_function_defs.get(ng.aggregation)

            ### The additional part is here. ###
            # Fix the output node's activation function to the given function.
            if node in config.genome_config.output_keys:
                ng.activation = output_activation_function
            ####################################
            activation_function = config.genome_config.activation_defs.get(ng.activation)
            node_evals.append((node, activation_function, aggregation_function, ng.bias, ng.response, inputs))

    return neat.nn.FeedForwardNetwork(config.genome_config.input_keys, config.genome_config.output_keys, node_evals)
```
And, when we want to let the output node's activation function evolve, we may consider mapping unfavorable functions to favorable ones as follows.
```python
import neat
from neat.graphs import feed_forward_layers

# Define the default mapping of functions.
default_output_func_map = {"sigmoid": "tanh",
                           "tanh": "tanh",
                           "sin": "tanh",
                           "gauss": "tanh",
                           "relu": "tanh",
                           "elu": "tanh",
                           "lelu": "tanh",
                           "selu": "tanh",
                           "softplus": "tanh",
                           "identity": "tanh",
                           "clamped": "tanh",
                           "inv": "tanh",
                           "log": "tanh",
                           "exp": "tanh",
                           "abs": "tanh",
                           "hat": "tanh",
                           "square": "tanh",
                           "cube": "tanh"}


def create_cppn(genome, config, output_func_map=default_output_func_map):
    :
    :
            # Map the output node's activation function to its mapped function.
            if node in config.genome_config.output_keys:
                ng.activation = output_func_map[ng.activation]
            activation_function = config.genome_config.activation_defs.get(ng.activation)
    :
    :
```
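For illustration (a made-up minimal snippet, not part of the PR), the mapping simply redirects whatever activation name the evolved output node carries to a bounded function, so the CPPN output magnitude stays within 1.0:

```python
# Hypothetical illustration of the mapping idea: a reduced map covering
# just three activation names, all redirected to the bounded "tanh".
output_func_map = {"identity": "tanh", "abs": "tanh", "tanh": "tanh"}

evolved_activation = "identity"  # e.g. the result of a mutation
safe_activation = output_func_map[evolved_activation]
```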
This has been merged, right? And is working?
I've checked and think it's working :-)
Hi,
I have a small improvement suggestion about the query_cppn function in hyperneat.py. In lines 85-88, a value below the threshold is replaced with 0.0, so the range [-0.2, 0.2] drops out of the returned values in this implementation.
However, the original paper (http://axon.cs.byu.edu/Dan/778/papers/NeuroEvolution/stanley3**.pdf) says "The magnitude of weights above this threshold are scaled to be between zero and a maximum magnitude in the substrate." on page 8.
Thus, I suggest changing the query_cppn function so that it returns values over the continuous range [-max_val, max_val].
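A minimal sketch of the behavior being criticized (an assumed shape of the thresholding, not the actual query_cppn source):

```python
def query_weight_current(w, w_min=0.2, max_val=5.0):
    # Values inside (-w_min, w_min) are zeroed; everything else is
    # multiplied by max_val, so no expressed weight can have a
    # magnitude below w_min * max_val.
    if abs(w) < w_min:
        return 0.0
    return w * max_val


# With the defaults, expressed weights fall in
# [-5.0, -1.0], {0.0}, or [1.0, 5.0]: magnitudes in (0.0, 1.0) are
# unreachable, which is the dead zone described above.
```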