ctn-archive / nengo_theano

ABANDONED; see https://github.com/nengo/nengo instead
MIT License

Subnetworks #4

Closed studywolf closed 11 years ago

studywolf commented 11 years ago

Terry has a good idea for subnetworks: they're just network objects that prepend a prefix to all relative 'get' calls. I think it's a good idea! It keeps the whole network flat, the grouping is just a naming convention for our benefit, and it solves the problem of not being able to connect between things in different subnetworks.

jaberg commented 11 years ago

Can neurons have multiple names?

In other words, can networks overlap? (And should they, if they can't?)

studywolf commented 11 years ago

In Nengo no ensemble has multiple names, and there's no overlap between networks. Is there a case for having multiple names, or for multiple networks claiming the same ensemble? Seems like it might quickly lead to confusing scripts...

tbekolay commented 11 years ago

Yeah, having a neuron be a member of multiple ensembles, and having ensembles be members of multiple networks, could make sense for some things. It's not really a case that we've considered before, though, so it probably warrants some discussion as to the implications.

studywolf commented 11 years ago

Hmm, what are some examples? I can't think of any where it makes sense to share across subnetworks rather than having the shared object up a level and both subnetworks projecting out to it.

tbekolay commented 11 years ago

That's true, I guess with subnetworks and the ability to make hierarchies you wouldn't need or want multiple membership. The fact that we haven't needed this up until now probably indicates that it isn't necessary...

drasmuss commented 11 years ago

There are benefits to having an actual encapsulated network object though, rather than just a naming convention: something to pass around to different functions, something that you can attach methods to, something that you can initialize nicely with a constructor. I'm not sure it's worth getting rid of all that just to solve the problem of connecting between subnetworks (which should actually be easy enough to do anyway if we're redoing the network/connect code; it was only difficult in Nengo because it wasn't designed with that in mind).

studywolf commented 11 years ago

Is there anything we need to initialize in subnetworks though? It's really just a means for us to group things...

tbekolay commented 11 years ago

I think Dan's referring to the reusability of subnetworks here; you should be able to give someone a script that constructs a network, and they should be able to import it and use that imported network in their own network as a subnetwork. Then, you should be able to call a function and give it that subnetwork as a parameter. It might be possible to do those things with this new way of doing subnetworks.

studywolf commented 11 years ago

I see, so the benefit is being able to easily refer to the subnetwork as a single object, rather than scanning all the names of the flat network whenever you want to use it?

jaberg commented 11 years ago

I was just bringing it up because it's a standard pattern: you have a container with things in it, then you want to group them, so it's natural to do it recursively and you get a good old-fashioned directory-structure type of thing. Then you want to group things by a different criterion, and you end up wanting a more general selection mechanism (e.g. "find me all neurons that almost never fire", "find me all neurons with fewer than N downstream connections", "set all really weak synapse strengths to 0", "get me a random sub-population of the following subnetworks and connect them to this new population").

If you can do all that stuff, then being able to group neurons by having some parent group tag is trivial, and also happens to implement the hierarchical organization.

Thinking about it from the perspective of wanting to simulate things, is there currently a data structure that is just a linear array of all the neurons? For example, are all the membrane potentials of all the neurons currently in one big long numpy vector? If not, maybe they should be, because it makes the simulation code simpler and faster, and it makes all of the "advanced" queries I mentioned above take the form of

  1. get a list of integers of all matching neurons
  2. do something at those positions.

I would really suggest trying to think along the lines of associating logical groupings (e.g. subpopulations, neuron types, etc.) with lists of integer neuron positions that can be used to access the numerical buffers... and not putting the numerical buffers into the data structures used for logical groupings.
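For what it's worth, here's a minimal sketch of that idea in plain numpy (all names here are hypothetical, just to make the point concrete): logical groupings are nothing more than integer index arrays into one flat per-neuron buffer.

    import numpy as np

    n_neurons = 1000
    voltage = np.zeros(n_neurons, dtype=np.float32)       # one flat buffer for the whole model
    rates = np.random.rand(n_neurons).astype(np.float32)  # e.g. measured firing rates

    # logical groupings are just integer index arrays into those buffers
    groups = {
        'BG.GPi': np.arange(0, 100),
        'Thalamus': np.arange(100, 300),
    }

    # the "advanced" queries then reduce to: find matching positions, act there
    quiet = np.where(rates < 0.01)[0]        # all neurons that almost never fire
    voltage[groups['BG.GPi']] += 0.1         # do something to one logical group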

drasmuss commented 11 years ago

Like Trevor says, basically the idea is that subnetworks are objects (in the object-oriented programming sense). I would say that most of the reasons it's good to encapsulate a related collection of data/functions within a single object apply to subnetworks as well. Off the top of my head, here are some things that network objects make possible:

- network-specific parameters
- network methods
- passing networks as arguments to other functions
- encapsulation (it may not always be desirable to have every component of the network visible)

It may be possible to recreate these things in a "flat" network with some Python namespace magic, but it seems like it would be a lot more work than just using the built-in capabilities of Python objects.

jaberg commented 11 years ago

... when it comes to running the simulator, you want all the numbers to be neatly lined up; you don't want a little buffer of 100 neurons here and a little buffer of 100 neurons there, with all kinds of little loops updating memory all over the place in little chunks. Better: line up all 2.5 million of Spaun's membrane potentials in one vector of float32s.

drasmuss commented 11 years ago

I think maybe we're talking about two different things here. I'm thinking about how we want things organized from the Python API perspective, whereas I think what you're discussing is how to get things nice and optimized in the Theano backend. I'm not sure which one we're supposed to be deciding on here.

jaberg commented 11 years ago

Right, so I guess the design choice I'm challenging is this: Ensembles currently "own" their own shared variables, but it might be better if they instead "owned" some pointers into a global brain model where their neurons reside. This would bring two advantages:

  1. the simulator code would generally be faster
  2. ensembles could overlap not at all, partially, or completely (as in a hierarchy)

To make this happen, a new data structure would be necessary to actually own the memory associated with all of the neurons. There would be one of these objects per neuron type, each in charge of updating all of the (e.g.) LIF neurons in an entire model. Assuming an ensemble has only a single neuron type, an ensemble would then have a pointer to the object containing its neurons, plus some indexing information (e.g. a range or specific indexes, whatever numpy indexing can handle).

So the API of the current ensembles is all fine; it's just that when it comes time to generate the update equations for the entire model, there is a lot of opportunity for simplification by vectorization, because all of the neuron dynamics are handled by the one LIF neuron container instead of by all of the ensembles of LIF neurons. In other words, the Theano graph would be a lot smaller, and the individual computation nodes would be able to take full advantage of the GPU's thousands of cores.
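To make that concrete, here's a rough sketch of the ownership structure being described (class and attribute names are hypothetical, not the current API): one container per neuron type owns the memory, and an ensemble just keeps indices into it.

    import numpy as np

    class LIFContainer(object):
        """Owns the state of every LIF neuron in the model, in one flat buffer."""
        def __init__(self):
            self.voltage = np.zeros(0, dtype=np.float32)

        def allocate(self, n):
            """Extend the buffer by n neurons; return their indices."""
            start = self.voltage.size
            self.voltage = np.concatenate(
                [self.voltage, np.zeros(n, dtype=np.float32)])
            return np.arange(start, start + n)

        def step(self, current, dt=0.001):
            # one vectorized update over every LIF neuron in the model
            # (simplified leaky integration, no spiking; just for illustration)
            self.voltage += dt * (current - self.voltage)

    class Ensemble(object):
        """Owns no neuron memory; just points into a container at some indices."""
        def __init__(self, container, indices):
            self.container = container
            self.indices = indices

        @property
        def voltage(self):
            return self.container.voltage[self.indices]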

jaberg commented 11 years ago

Carrying on this thought - the "new data structure" I mentioned can actually be an Ensemble. I'm just suggesting a slightly different style of putting them together:

  1. make it possible for an Ensemble to index into one or more other Ensembles instead of owning its own memory
  2. scripts pre-allocate one Ensemble for each neuron type in a model
  3. Network.make extends one of the pre-allocated ensembles, and then creates a new Ensemble aliased to those neurons (a "view" on them)
  4. Connections are still made between these aliases.

This change also brings things more in line with PyNN's way of doing things, because PyNN's Population objects can be views on each other.
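Continuing the sketch from the previous comment (same hypothetical names), Network.make under this scheme might look roughly like the following: it extends the shared container for the appropriate neuron type and hands back a view on the new neurons.

    class Network(object):
        def __init__(self, name):
            self.name = name
            self.lif = LIFContainer()   # one pre-allocated container per neuron type
            self.ensembles = {}

        def make(self, name, n_neurons, dimensions):
            # extend the shared LIF container, then alias the new neurons as a view
            indices = self.lif.allocate(n_neurons)
            view = Ensemble(self.lif, indices)
            self.ensembles[name] = view
            return view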

drasmuss commented 11 years ago

That makes a lot of sense to me; that's always been one of the downsides of the Java implementation, that there are so many duplicate copies of various bits of memory sitting around in different places. And it seems like that would still be compatible with having network objects (since they're just a shell around the ensembles, they don't really have any data of their own other than some parameters), so I like that.

tcstewar commented 11 years ago

Sounds like we're all pretty much agreed on this. We need something like subnetworks at the API level, because they're very useful for design. For example, I might have a function that builds a basal ganglia given a network, and I'd like to be able to do something like:

net = nef.Network('My main network')
sub = net.make_subnetwork('BG')
create_basal_ganglia(sub)

This sort of idea works right now in the Nengo API (http://www.nengo.ca/docs/html/nef.Network.html#nef.Network.make_subnetwork), and it's definitely something that should be added to the Theano version.

But there's also the separate question of how to implement this efficiently in Theano. I'm pretty sure I know a really simple way to do it, too, which I'll take a stab at today. Instead of having this complex nested structure of networks, we can actually just store it as one giant network, and use proxy objects for the subnetworks. So when I do

sub = net.make_subnetwork('BG')

it would basically return exactly the same network as 'net', but modified so that it will automatically prepend 'BG.' in front of all make(), make_input(), connect(), etc. calls. I believe that gives us exactly the functionality we have on the Nengo side (for example, being able to do net.connect('BG.GPi', 'Thalamus')), without any mucking about with exposing origins and terminations.
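Something like the following hypothetical proxy would do it (a sketch only; the real make_subnetwork would wrap whatever the actual Network class ends up being):

    class SubNetworkProxy(object):
        """Delegates to the parent network, prepending 'prefix.' to all names."""
        def __init__(self, network, prefix):
            self.network = network
            self.prefix = prefix

        def make(self, name, *args, **kwargs):
            return self.network.make('%s.%s' % (self.prefix, name), *args, **kwargs)

        def connect(self, pre, post, **kwargs):
            return self.network.connect('%s.%s' % (self.prefix, pre),
                                        '%s.%s' % (self.prefix, post), **kwargs)

    # net.make_subnetwork('BG') would then just return SubNetworkProxy(net, 'BG'), so
    # sub.make('GPi', 100, 1) creates an ensemble named 'BG.GPi' in the one flat network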

As to James' more extended point, though: right now we do keep all ensembles separate (well, more technically, every ensemble is actually a NetworkArray, so at least any NetworkArrays get entirely shared memory, with matrices all nicely lined up), but we're not doing any sharing across ensembles. So, for instance, if we do:

net.make('A', 100, 1)
net.make('B', 100, 1)

then we might be able to get a bit of a speedup by at least having the encoders/decoders/voltage/etc. for those ensembles all be in the same matrix (even though they have different encoders, so it's not like we're reducing memory usage or anything).

Right now, if I just do

net.make_array('AB', 100, 2)

then I do get a network array that has that shared matrix capability, except that right now those sub-ensembles all get exactly the same encoders, decoders, alphas, and biases. That's on the list of things to get fixed.

Right now, my feeling is not to worry too much about optimizing by combining ensembles together internally when appropriate, but it'd be worth finding out what sort of speedups we might be able to get (I guess we can quickly compare building something with an array and without an array and see how much of a difference there is).
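A quick sketch of that comparison, using the net.make / net.make_array calls from above (the timing loop and the net.run() call are assumptions about the current API, so treat this as pseudocode rather than a real benchmark):

    import time

    def build_and_time(net, use_array, runtime=1.0):
        if use_array:
            net.make_array('pop', 100, 10)        # 10 sub-ensembles sharing one matrix
        else:
            for i in range(10):
                net.make('pop%d' % i, 100, 1)     # 10 separate ensembles
        start = time.time()
        net.run(runtime)                          # assumes Network has a run() method
        return time.time() - start

    # e.g., with the networks constructed as in the example above:
    #   print build_and_time(nef.Network('separate'), use_array=False)
    #   print build_and_time(nef.Network('array'), use_array=True)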

hunse commented 11 years ago

The amount of speedup is going to depend on a lot of different factors, including size of the network, number of ensembles, and whether the CPU or GPU is used. The biggest speedup will likely be for large networks with lots of ensembles on the GPU. I think any speedups on the CPU will be modest.

tbekolay commented 11 years ago

My two cents: having the top-level network contain all of the actual data, and having all other things (Ensembles, subnetworks) be views into that network, makes the most sense to me. It also means that we might be able to get a not-so-terribly-slow pure Python implementation through functional abstractions (i.e., map) instead of slow Python for loops.

It might also mean that we can share a lot more between the Theano and non-Theano implementations, by having the ensemble, subnetwork, etc. logic focus just on keeping track of metadata and indexes into the top-level network; then each backend only has to implement the top-level network.

Slight aside: if PyNN uses the Population terminology, should we adopt it as well, over Ensemble?

jaberg commented 11 years ago

I'm thinking this through a little further... It's helpful to look at the computation in terms of communication patterns and sequential barriers. A normal weight-matrix simulation has essentially two computational phases:

  1. neuron dynamics (huge elementwise computation)
  2. synapse updates (can be done independently, but requires lots of reads per write)

A nef-style simulation has three phases

  1. neuron dynamics
  2. decoder
  3. encoder

There is probably enough parallelism within each of these phases to keep a GPU or multicore processor busy, so we should be aiming for a Theano graph that has roughly two or three bottleneck Ops: one per phase.
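For concreteness, here's a toy, purely-numpy version of one timestep with those three phases (shapes, the neuron model, and all names are illustrative only):

    import numpy as np

    n_neurons, dims, dt = 300, 2, 0.001
    encoders = np.random.randn(n_neurons, dims).astype(np.float32)
    decoders = (np.random.randn(dims, n_neurons) / n_neurons).astype(np.float32)
    voltage = np.zeros(n_neurons, dtype=np.float32)
    x = np.array([0.5, -0.3], dtype=np.float32)    # the represented value

    # encoder phase: project represented values into per-neuron input currents
    current = encoders.dot(x)

    # neuron-dynamics phase: one big elementwise update
    # (a stand-in for a real LIF model, just to show the shape of the computation)
    voltage += dt * (current - voltage)
    activity = np.maximum(voltage, 0.0)

    # decoder phase: read the represented value back out of the activity
    x_hat = decoders.dot(activity)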

For neuron dynamics, we're good: Theano already collapses elementwise expressions together (Fusion) so all of the neuron dynamics are already globbed together into a single compound Op.

For the matrix multiplies, Theano currently does not glob together the updates, but it could, and I think for this simulator to really go fast, it should. All of the encoder matrix multiplications can be done in parallel, and all the decoder ones as well. I don't think CUBLAS supports exactly the computation that we want, but rolling our own GPU kernel is not too hard. On the CPU the same is true - all 8 or 16 cores can be put to good use that way.

For a theano optimization to be able to recognize that several e.g. encoder matrices can be applied in parallel, it simply needs to recognize that there are no functional dependencies between any of their clients and their inputs... they don't actually all need to be in the same physical memory block. Still, they might as well be in the same block since that's what makes it easy to parallelize the neuron dynamics.

TL;DR - if profiling the evaluation of an ensemble reveals that matrix multiplication is the bottleneck, then there's room for improvement. If the neural dynamics updates are the bottleneck, then it's probably already running as fast as it can go.

This turned into kind of a musing on design rather than any specific kind of recommendation, but I'm posting it all the same. Hope y'all don't mind.

tcstewar commented 11 years ago

Subnetworks should now work. As discussed above, they do not affect the theano implementation in any way -- they're just a convenience for defining networks.