Brain + Genome collaboration

andyleejordan commented 9 years ago

@thom5468 and teammate, if I understood you right, you're programming a neural network and you want to encode actions as traits.

We'll need a fixed list of actions the creature can take, and a mapping from the action's trait value and (I'll call it) trigger value from the NN. Totally doable.

thom5468 commented 9 years ago

The actions are not traits. Actions are the observable abilities the creature posses. i.e. the ability to eat or walk forward. Each index of the Actions array is associated with a different ability. The action array is the product of multiplying the Senses vector by our ANN matrix.

What I do need from the genome is enough values to fill out a 2D array in the ballpark of 10 X 1000. This is the initial size of our ANN. We will use direct mapping of your genes to our network. I have been advised by @tsoule88 that most values will start out at zero and through variation will become positive and negative.

andyleejordan commented 9 years ago

So you want 10,000 genes under a Brain trait? It'd be something like this in Genome:

class Genome {
  ...
  Trait brain = new Trait(10 * 1000);
  ...
}

// returns a FloatList of length 2 (diploid) * 10 * 1000
creature.genome.brain.list()

// test performance
void update() {
  ...
  println(genome.brain.avg());
  ...
}

After testing, this did not seem to have a significantly adverse affect on performance. I imagine computing a value on each update is on the same order of what you'll be doing with the values.

However, if you want a 2D array, I could give you a list of say brain[10] traits, each with 1000 genes, which would map better.

class Genome {
  Trait[] brains = new Trait[10];
  {
    for (int i = 0; i < 10; i++) {
      brains[i] = new Trait(1000);
    }
  }
}

class Creature {
...
  void update() {
  ...
    for (int i = 0; i < 10; i++) {
      println(genome.brains[i].avg());
    }
  ...
  }
}

But the performance of the game suffers significantly (i.e. not playable). But in this case I do not believe my test is anywhere near the order of what you'll be doing.

Note: the performance penalty was about the same when using an ArrayList<Trait> over the C-style array.

Either way, simply adding the genes you required did not cause any problems, so pick one and we'll figure out performance when we know what we'll be doing with all the values.

thom5468 commented 9 years ago

I think the first case is better because it will be easier to expand upon later on (if need be).

I have considered the negative performance incurred when this entire process is implemented. We expect to compute behavior every time unit of the game. If the game starts with 20 creatures then the max number of multiplications is 20 * 1500 * 10,000 = 3E8. This is on top of drawing graphics, calculating the the senses vector and other game processes going on. I think threading is a must.

andyleejordan commented 9 years ago

Out of curiosity, what number of inputs and outputs are you potentially going to have?

It's been a while since I've used neural networks (and I didn't do a whole lot with them then either), so forgive me with being unfamiliar. Is the 1,000 factor the number of nodes in your hidden layer?

andyleejordan commented 9 years ago

Ah ha, you're Team Krang and the school VPN gave me access to the wiki. Okay I'm getting up to speed on your goals.

tsoule88 commented 9 years ago

In the example above where performance was significantly worse, part of the issue was the inclusion of the println instruction. As far as I can tell println (and possible print as well) causes the thread to briefly halt and return control to the OS. Not very slow in itself, but a killer if its done thousands of times per step.

The neural network size may be a little large, but there needs to be an input for every piece of sensed data. if the sensing team wants 10 sensing points, each of which is smelling/tasting, 5 different "chemicals", that's 50 inputs right there. Plus, touch on each segment, current health and energy, age, etc. and a bias input. So, we rounded up to 100 inputs. For outputs there's: force, torque, whether to produce gametes, maybe some different movement options (hop, fly, etc.) and maybe some other outputs. So, again we rounded up to 10 outputs. This requires 1000 (100*10) weights per creature.

andyleejordan commented 9 years ago

Yeah, I figured println would give a decent stress-test for an upper-bound.

Awesome @tsoule88, thank you for the math. My estimations of inputs was much lower, your explanation is very helpful. So one hidden layer with 100 inputs and 10 outputs. Sounds reasonable.

@thom5468 what activation function are you considering? A sigmoid (like tanh) is usually used but from what I understand it's not the best biological model, but I mean that's a deep rabbit hole to go down.

andyleejordan commented 9 years ago

Ack, I'm rusty on ANNs. With input*output number of weights there is no hidden layer. It's just a single-layer feedforward network. My bad.

thom5468 commented 9 years ago

At this time we do not have an activation function and I don't think one will be implemented. I believe the purpose of an activation function is to scale the output or to translate it to binary signals. With some outputs the magnitude will be meaningful (i.e. force and torque) and so an activation function probably won't work well for this model.

@andschwa How do you feel about setting aside an int in your genome class that I can update when the output vector changes in size and then doing the same for sensory systems? You can allocate your genes for the brain based on those two values. That way we don't have to update each other each time a minor change is made.

andyleejordan commented 9 years ago

To have an ANN you'll need some sort of activation function; I don't think it's doable without. You're right in that you may need to have a few depending on the particular output.

@thom5468 It's already there as static final int BRAIN_OUTPUTS, and BRAIN_INPUTS for the sensory systems. See here.

thom5468 commented 9 years ago

Great! When changes are made to the size of the output vector, I will just go there.

I wouldn't mind if you elaborated on why an activation function is required. Activation functions were one of the first topics I encountered during my research for this project and I have not given them much consideration since the first couple weeks. Please advise.

andyleejordan commented 9 years ago

Full disclosure: I am really rusty on ANNs and have just done a cursory catch-up via Wikipedia and an old textbook.

From what I understand, the way an ANN works is that you have a set of nodes (our inputs and outputs), each connected by directed links. A link from node i to node j propagates the activation a_i from i to j, and the link has a weight w_i,j. It's the weights that we're encoding in the genome.

So when the ANN is evaluated, each unit j (an output in our case) computes the weighted sum of its inputs (the dot product of the weights and the inputs), and applies an activation function to this weighted sum to calculate the final output.

In multi-layered networks, the general case is that the inputs to one layer are the values of the activation functions of the previous layer; but with one layer, you use activation function only once (well, once per output), to get the value of the output nodes.

There are a lot of choices of activation functions. If a hard threshold is used, the neurons become known as perceptrons, or if a logistic function is used, they're sigmoid perceptrons. I do not know which would be more applicable here, or how you would decide which action to take based on the outputs you do get (perhaps each output is an action, and the action associated with the maximal output value is what is immediately taken).

The other big thing to take into account is that we won't be training the network. I think instead we're doing something akin to neuroevolution.

See Artificial Intelligence: A Modern Approach by Russell and Norvig, section 18.7.

tsoule88 commented 9 years ago

Andrew is pretty much correct. An activation function is basically a rescaling function for the output of an artificial neuron. If a neuron has 100 inputs, each of which is multiplied by some weight w_i, and then summed together, the result could be a very large number (or negative number). The activation function simply rescales it, usually to be within a range from -1 to 1 or 0 to 1, so that you don't produce huge numbers that through off the rest of the neural network's calculation.

tsoule88 / evolvedTD

Brain + Genome collaboration #31