wagenaartje / neataptic

:rocket: Blazing fast neuro-evolution & backpropagation for the browser and Node.js
https://wagenaartje.github.io/neataptic/
Other
1.18k stars 278 forks source link

calling the included NEAT implementation NEAT is misleading #94

Open micimize opened 6 years ago

micimize commented 6 years ago

Speciation via some distance measurement is a fundamental aspect of the NEAT algorithm. I suggest clarifying in the docs that the included implementation diverges from the original specification, and perhaps renaming it to something like NEATSS (NEAT Sans Speciation)

wagenaartje commented 6 years ago

I completely understand. The way it works is quite different to the original NEAT algorithm. I actually call it the 'Instict' algorithm (more about that can be read there). I'm leaving this open until I have time to modify the docs.

pjbaron commented 3 years ago

I've been playing with this over a couple of evenings and haven't devoted much time to it so apologies if I have misunderstood or simply missed something important. I don't see anything in the article about Instinct which will prevent premature convergence. This was always one of the more powerful features of NEAT, and it relies on the use of innovation numbers to create species before cross-over is applied. I see you addressed innovation numbers as being computationally expensive, but I can't find anything in Instinct which replaces their functionality with an equivalent method of maintaining diversity (mutation is not a good alternative, and enabling elitism makes this a particularly urgent problem). Is there a system to avoid/reduce population convergence?

wagenaartje commented 3 years ago

@pjbaron

It's a long time since I have read the paper, but I do not remember how innovation numbers maintain diversity. What I do recall is that innovation numbers are designed for the purpose of easy offspring generation, but also to keep track of how genomes evolve over time. At the time, I did not care much about the latter and was really thinking in terms of optimization. I believe that the (initial version of) the code implemented the same algorithm for cross-over, but without using these innovation numbers. To be more precise, there seemed to be an easy way to see which genes are identical without keeping track of their history.

TL;DR it seemed like innovation numbers were purely for the purpose of bookkeeping, but aren't required for the functionality.

However, I could be wrong. If you have a certain paragraph from the paper which shows how it prevents premature convergence, please post it here and I'll have a better look!

pjbaron commented 3 years ago

Hi @wagenaartje I'm delighted to see that you're still active after all this time. First, I want to thank you for making an implementation in open source, it is wonderful to be able to download full solutions and play with them easily!

Apologies for the wall of text reply. I've carefully reread the paper I cite below and realised that my assumptions are related to a number of separate quotes which tie innovation numbers to historical markings and speciation, then finally link this to the algorithm's exceptional performance.

I refer to http://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf

"Protecting Innovation with Speciation:" ... "Unfortunately, because of the initial loss of fitness caused by the new structure, the innovation is unlikely to survive in the population long enough to be optimized. Thus, it is necessary to somehow protect networks with structural innovations so they have a chance to make use of their new structure."

Innovation numbers: "The innovation numbers are historical markers that identify the original historical ancestor of each gene."

Purpose: "an innovation number, which allows finding corresponding genes".

Meaning: "The innovation numbers ... represent a chronology of the appearance of every gene in the system."

Evolution and history: "whenever these genomes mate, the offspring will inherit the same innovation numbers on each gene; innovation numbers are never changed. Thus, the historical origin of every gene in the system is known throughout evolution."

Implementation and actual innovation: "The system now knows exactly which genes match up with which (Figure 4). When crossing over, the genes in both genomes with the same innovation numbers are lined up. [...] They represent structure that is not present in the other genome."

In the analysis section: "The reason why we do not ablate historical markings directly is that without historical markings the system would be a conventional NE system. Historical markings are the basis of every function in NEAT: Speciation uses a compatibility operator that is based on historical markings, and crossover would not be possible without them." (italics are mine)

I hope I have made clear why I believe that innovation numbers are central to the entire NEAT algorithm, and how they are used within the system to maintain speciation and thereby prevent premature convergence.

wagenaartje commented 3 years ago

Thanks! Again, my knowledge on the library / algorithm is a bit stale, so there might be some incorrect facts in the piece below.

I completely forgot to mention, the neataptic implementation does actually use innovation numbers. However, the difference is that I do not make these numbers explicit. As you may see in the original paper, the innovation numbers are unique to each pair of nodes: the connection from node 1 to 3 has innovation number x, the connection from node 5 to 8 always has innovation number y, etc. Connections that were mutated earlier in the process have a lower innovation number. So an innovation number tells you two things:

So as I said above, I did not really care about the order of appearing genes, as this used nowhere in the algorithm. However, as you say correctly, the former point is used in the crossover algorithm. So instead of storing the innovation number explicitely, I calculate it during every crossover operation because it is implicitely present in the connection arrays that are stored for each genome. Every connection goes from a node a to a node b, so I simply constructed a function that returns a unique number from node indices a and b. (How? Well there are functions that map pairs of numbers to a unique number).

However, the crossover process should be exactly the same: I still determine which genes are common, which are disjoint and which are excess, and then select them based on the same rules mentioned in the paper. You can check this part of the code here. The determination of the type is done using the dynamic innovation ID's, calculated here.


TL;DR I used innovation numbers, but I don't sture them and simply calculate them on the run since they are essentially mappings from pairs of nodes to a unique number.


If you still think some functionality is missing because of how I calculate the IDs, let me know. I was quite young at the time, so maybe I overlooked some other important characteristics.

pjbaron commented 3 years ago

Awesome! Thanks for the comprehensive replies. I will take a look at the implementation of implicit innovation numbers and try to confirm whether it achieves the same goals. I did find the mapping function when I was poking around - I had to check the wikipedia link, it's a very useful technique which I wasn't aware of previously.