weightagnostic / weightagnostic.github.io

repo for interactive article
https://arxiv.org/abs/1906.04358
Creative Commons Attribution 4.0 International

Several Questions #2

Open tutorexchange opened 5 years ago

tutorexchange commented 5 years ago

Hi! The paper is really awesome. Just a few questions:

1. What variant of NEAT was used in the paper? Is it just the original NEAT without optimizing weights? What other modifications to the algorithm were made to suit the task of evolving weight agnostic neural networks?

2. How long did it take to train the network to become sufficient on MNIST? At what point did you decide to stop the network from evolving? Would it potentially have become better?

3. How important, really, are the changing activations to these networks? It seems like the topology is the most important thing here, and also biologically.

4. Would you guys potentially be releasing any code?

Thanks a lot! C

wangii commented 5 years ago

Agreed, the changing of activations is really annoying. It seems like there should be a universal activation function if biology is taken literally.

agaier commented 5 years ago

Thanks!

1. What variant of NEAT was used in the paper? Is it just the original NEAT without optimizing weights? What other modifications to the algorithm were made to suit the task of evolving weight agnostic neural networks?

The original implementation of NEAT was used with a few modifications:

- Individual connection weights are never optimized; instead every connection in a network shares a single weight value, and each topology is evaluated over a range of these shared values (sketched below).
- Each node can use one of several activation functions, and a mutation operator that changes a node's activation was added.
- Networks are ranked on both their performance over the shared weight values and their complexity (number of connections), so simpler networks are favored.
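
For a rough picture of what the shared-weight evaluation looks like, here is a minimal Python sketch (not the released code): a fixed topology is run with a single weight value copied onto every connection, and its fitness is the average score over several such values. The adjacency-matrix encoding, the toy negative-MSE task, and names like `forward` and `wann_fitness` are illustrative assumptions.

```python
import numpy as np

# Illustrative activation subset; each node has its own activation.
ACTIVATIONS = {"linear": lambda x: x, "tanh": np.tanh, "relu": lambda x: np.maximum(0.0, x)}

def forward(adj, acts, x, w):
    """Propagate input x through a feed-forward net in which every connection
    carries the same shared weight w.
    adj: upper-triangular 0/1 adjacency matrix (adj[i, j] = 1 means i -> j),
    acts: activation name per node; nodes are assumed topologically ordered,
    with the first len(x) nodes as inputs and the last node as the output."""
    n = adj.shape[0]
    h = np.zeros(n)
    h[:len(x)] = x
    for j in range(len(x), n):
        pre = w * np.dot(adj[:j, j], h[:j])   # same weight on every active connection
        h[j] = ACTIVATIONS[acts[j]](pre)
    return h[-1]

def wann_fitness(adj, acts, samples, targets,
                 shared_weights=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Score one topology by its mean performance over all shared weight values
    (here a toy regression task scored by negative mean squared error)."""
    scores = []
    for w in shared_weights:
        preds = np.array([forward(adj, acts, x, w) for x in samples])
        scores.append(-np.mean((preds - targets) ** 2))
    return float(np.mean(scores))
```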

2. How long did it take to train the network to become sufficient on MNIST? At what point did you decide to stop the network from evolving? Would it potentially have become better?

The MNIST experiment was more 'for fun' at the end, just to see if it would work. We didn't do much optimization of hyperparameters or setup; we put it on a machine with a huge population and let it run for a few days while we did other things. By the time we ended the run it had more or less converged. If we had spent more time optimizing hyperparameters or putting in some other tricks specifically targeting MNIST, I'm sure we could get better accuracy, but at this point we are still just exploring what is possible, not trying to hit SOTA.

3. How important, really, are the changing activations to these networks? It seems like the topology is the most important thing here, and also biologically.

We didn't do much experimentation here, but my intuition is that the variety of activations is key. That is not to say that all of them are necessary, but I'm not confident this could have been accomplished with only linear activations. As for biological corollaries, I'm not going to claim that a cosine activation is an accurate model of how neurons fire -- but I don't think a feed-forward network of sigmoids would be any more biologically plausible.
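
To make the role of activation diversity a bit more concrete, here is a small illustrative sketch of an activation pool and a mutation operator that reassigns one node's activation, roughly in the spirit of what the paper describes. The exact pool and the `mutate_activation` helper are assumptions for illustration, not the actual implementation.

```python
import random
import numpy as np

# Illustrative pool of node activations, similar in spirit to the set used in
# the paper (linear, step, sin, Gaussian, tanh, sigmoid, |x|, invert, ReLU).
ACTIVATION_POOL = {
    "linear":  lambda x: x,
    "step":    lambda x: float(x > 0),
    "sin":     np.sin,
    "gauss":   lambda x: np.exp(-x ** 2),
    "tanh":    np.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "abs":     np.abs,
    "invert":  lambda x: -x,
    "relu":    lambda x: np.maximum(0.0, x),
}

def mutate_activation(acts, n_inputs, rng=random):
    """Return a copy of the per-node activation list with one hidden/output
    node reassigned to a randomly chosen activation (inputs are left alone)."""
    acts = list(acts)
    idx = rng.randrange(n_inputs, len(acts))          # never mutate an input node
    acts[idx] = rng.choice(list(ACTIVATION_POOL))     # may re-draw the current one
    return acts
```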

4. Would you guys potentially be releasing any code?

Yes! There is a bit of a code review process to get things out of Google, but then I will release all the code used to do these experiments (as well as champion networks, etc.), along with the original NEAT code it was based on. I hope people play with it!