CodeReclaimers / neat-python

Python implementation of the NEAT neuroevolution algorithm
BSD 3-Clause "New" or "Revised" License
1.4k stars 488 forks

Different kinds of outputs #247

Open erupturatis opened 1 year ago

erupturatis commented 1 year ago

Hey,

The action space of a game or environment can be represented in many ways (for example one-hot encoded values, probabilities, values between 0 and 1, etc.). This is also somewhat related to #184. I think we should be able to specify a more detailed setup in the config files for different types of outputs, such as:

[LanderGenome]
...
num_outputs = 20
Custom_outputs = True
Softmaxed_outputs = [3, 2]
Clamped_outputs = [(3,1,2),(2,0,1)]
one_hot_encoded = [5]
normal_outputs = [5]
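To make the proposed format concrete, here is a minimal sketch of how such options could be read with Python's standard `configparser` and `ast.literal_eval`. The option names are the ones proposed in this issue and do not exist in neat-python today; `read_output_layout` and `SAMPLE` are hypothetical names used only for illustration.

```python
import ast
import configparser

# Sample config text using the option names proposed in this issue.
# These options are NOT part of neat-python; they are an assumption.
SAMPLE = """
[LanderGenome]
num_outputs = 20
Custom_outputs = True
Softmaxed_outputs = [3, 2]
Clamped_outputs = [(3,1,2),(2,0,1)]
one_hot_encoded = [5]
normal_outputs = [5]
"""


def read_output_layout(text, section="LanderGenome"):
    """Parse the proposed output options into plain Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    sec = parser[section]
    return {
        "num_outputs": sec.getint("num_outputs"),
        "custom_outputs": sec.getboolean("Custom_outputs"),
        # literal_eval safely turns "[3, 2]" or "[(3,1,2),(2,0,1)]"
        # into real lists and tuples without executing arbitrary code.
        "softmaxed": ast.literal_eval(sec["Softmaxed_outputs"]),
        "clamped": ast.literal_eval(sec["Clamped_outputs"]),
        "one_hot_encoded": ast.literal_eval(sec["one_hot_encoded"]),
        "normal": ast.literal_eval(sec["normal_outputs"]),
    }


layout = read_output_layout(SAMPLE)
```

Each entry in `clamped` is read as a `(size, low, high)` tuple, which matches the meaning given to `Clamped_outputs` in this proposal.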

Custom_outputs would signal NEAT to return outputs in a form other than the usual one (I was thinking of a dictionary of outputs, depending on the customization).

Softmaxed_outputs = [3, 2] # one group of 3 softmaxed outputs and another group of 2 softmaxed outputs
Clamped_outputs = [(3,1,2),(2,0,1)] # a group of 3 outputs clamped between 1 and 2 and another group of 2 outputs clamped between 0 and 1
one_hot_encoded = [5] # 5 one-hot encoded values
normal_outputs = [5] # the last 5 outputs hold the raw values

The outputs for what I am describing should look like this:

outputs = {
    "softmax": [[0.2, 0.4, 0.4], [0.45, 0.55]],  # the 2 groups of softmaxed values
    "clamped": [[1, 2, 1], [1, 0]],
    "encoded": [[0, 0, 0, 1, 0]],
    "normal": [12.1, 32.1, 43.1, 1.1, 2.3]
}

The sizes of all custom output groups should add up to the total number of outputs (here 3 + 2 + 3 + 2 + 5 + 5 = 20, which matches num_outputs).
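A minimal sketch of the post-processing I have in mind, assuming the raw outputs arrive as a flat list (such as what a network's `activate` call returns) and that groups are consumed in the order softmaxed, clamped, one-hot, normal. All function names here are hypothetical helpers, not part of neat-python.

```python
import math


def softmax(values):
    """Turn raw values into probabilities that sum to 1."""
    peak = max(values)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(values):
    """Put a 1 at the position of the largest value and 0 elsewhere."""
    best = max(range(len(values)), key=lambda i: values[i])
    return [1 if i == best else 0 for i in range(len(values))]


def clamp(values, low, high):
    """Limit every value to the closed interval [low, high]."""
    return [min(max(v, low), high) for v in values]


def split_outputs(raw, softmaxed, clamped, one_hot_encoded, normal):
    """Split a flat output list into the dictionary form proposed above."""
    sizes = (list(softmaxed) + [n for n, _, _ in clamped]
             + list(one_hot_encoded) + list(normal))
    if sum(sizes) != len(raw):
        raise ValueError(
            f"groups cover {sum(sizes)} outputs, network has {len(raw)}")

    remaining = iter(raw)

    def take(n):
        # Consume the next n raw outputs, in order.
        return [next(remaining) for _ in range(n)]

    result = {}
    result["softmax"] = [softmax(take(n)) for n in softmaxed]
    result["clamped"] = [clamp(take(n), low, high) for n, low, high in clamped]
    result["encoded"] = [one_hot(take(n)) for n in one_hot_encoded]
    # Raw values are returned as one flat list, as in the example above.
    result["normal"] = [v for n in normal for v in take(n)]
    return result


raw = [0.1 * i for i in range(20)]  # stand-in for real network outputs
outputs = split_outputs(raw, [3, 2], [(3, 1, 2), (2, 0, 1)], [5], [5])
```

The length check at the top enforces the rule that the group sizes must add up to the number of outputs, so a mismatched config fails early instead of silently dropping values.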

I am also not sure how this would affect the performance of the networks. For example, if a network's outputs are mostly above 10, you can't directly clamp them between 0 and 1, since they would all become 1. We would need to normalize some of the special values first, such as the clamped ones.
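A small sketch of the problem and one possible remedy, assuming a min-max rescale within each group. This is only one option, not something neat-python does, and `rescale` is a hypothetical helper. Note that it makes the values relative to the other outputs in the same group, so the absolute magnitude is lost.

```python
def clamp(values, low, high):
    """Limit every value to the closed interval [low, high]."""
    return [min(max(v, low), high) for v in values]


def rescale(values, low, high):
    """Map the group's own min..max range linearly onto [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # All values are equal, so there is no range to stretch;
        # fall back to the middle of the target interval.
        middle = (low + high) / 2
        return [middle for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) / span * (high - low) for v in values]


raw = [12.1, 32.1, 43.1]

# Direct clamping saturates: every value ends up at the upper bound.
saturated = clamp(raw, 0.0, 1.0)

# Rescaling first keeps the ordering and spread of the raw outputs.
rescaled = rescale(raw, 0.0, 1.0)
```

Squashing with a sigmoid before clamping would be another way to avoid saturation while keeping each output independent of the others in its group.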

Please let me know what you think.