SpiNNakerManchester / sPyNNaker

The SpiNNaker implementation of the PyNN neural networking language
Apache License 2.0
105 stars 43 forks source link

Change edge representation and change routing in PACMAN to remove dependence on keys #84

Closed rowleya closed 8 years ago

rowleya commented 9 years ago

In a recent discussion, it was decided that routing could be done without keys, and then the keys could be assigned after routing in such a way that makes compression trivial. To do this in PACMAN would require a change to what the routing requires as input, and what the routing produces as output, as well as how the routing is done internally.

An additional requirement is for each edge to become multicast representative; i.e. it should have a source node and a list of destination nodes (instead of just a single destination node). This should make routing easier to process, and removes the need for a "SameKeyAsEdgeConstraint", as the edge can now be assigned a single key even though it goes to multiple destinations.

The inputs for routing would become:

  1. Partitioned Graph
  2. Machine description
  3. Placements on the graph

The output would become: For each edge in the graph, a list of chips through which the edge traverses. This would include the starting chip and the end chips (as it is multicast). The list of chips could be ordered such that the first full path from source chip to one destination chip is listed, followed by a repeat of one of the chips representing a branch point, repeating this until all destination chips have been listed e.g. for an edge that goes from (0, 0) to (1, 2), (2, 3) and (3, 2) the list might be: (0, 0) (1, 1) (2, 2) <== First Branch (1, 2) (2, 2) <== Second Branch (2, 3) (2, 2) <== Third Branch (3, 2) to indicate a route that branches at (2, 2) in three directions (basically, any time a chip is listed a subsequent time for an edge, it means to return to that chip for a branch).

A final step would then have to be added to convert this list in to multicast routing tables, with an assignment of keys to be done at that time as well. This step would then have to take as input:

  1. Partitioned Graph
  2. Routing Output

and would produce as output:

  1. A series of multicast routing tables
  2. A set of key assignments for the edges
AlexRast commented 9 years ago

On 20/05/15 09:44, Andrew Rowley wrote:

In a recent discussion, it was decided that routing could be done without keys, and then the keys could be assigned after routing in such a way that makes compression trivial. To do this in PACMAN would require a change to what the routing requires as input, and what the routing produces as output, as well as how the routing is done internally.

Wholeheartedly agree. Routing SHOULD be done without keys, indeed. That sounds like a significant step forwards.

An additional requirement is for each edge to become multicast representative; i.e. it should have a source node and a list of destination nodes (instead of just a single destination node). This should make routing easier to process, and removes the need for a "SameKeyAsEdgeConstraint", as the edge can now be assigned a single key even though it goes to multiple destinations.

Hmm. I don't like that idea conceptually; an "edge" loses its graph significance. Better to call it a "subgraph" in my view - which in any case comes closer to saying what the router is doing; not routing edge by edge but subgraph by subgraph (much like P&R tools do in modern EDA suites)

The inputs for routing would become:

  1. Partitioned Graph
  2. Machine description
  3. Placements on the graph

The output would become: For each edge in the graph, a list of chips through which the edge traverses. This would include the starting chip and the end chips (as it is multicast). The list of chips could be ordered such that the first full path from source chip to one destination chip is listed, followed by a repeat of one of the chips representing a branch point, repeating this until all destination chips have been listed e.g. for an edge that goes from (0, 0) to (1, 2), (2, 3) and (3, 2) the list might be: (0, 0) (1, 1) (2, 2) <== First Branch (1, 2) (2, 2) <== Second Branch (2, 3) (2, 2) <== Third Branch (3, 2) to indicate a route that branches at (2, 2) in three directions (basically, any time a chip is listed a subsequent time for an edge, it means to return to that chip for a branch).

This representation seems overly cumbersome and prone to confusion. Can not the list of chips be hierarchical (a list of lists) - with each sublist being the tree from a particular branch point (which is the first entry in the sublist)? It also seems to me that even this could possibly be improved, because in fact all you really need is a list of branch points and terminals, not the intermediate chips. (This assumes that having been routed, no rearrangement of routes to aid key allocation is to be performed).

A final step would then have to be added to convert this list in to multicast routing tables, with an assignment of keys to be done at that time as well. This step would then have to take as input:

  1. Partitioned Graph
  2. Routing Output

and would produce as output:

  1. A series of multicast routing tables
  2. A set of key assignments for the edges

You will need a set of key assignments for the vertices, in fact, because these will have to be mapped down to population tables on each core. I also have some doubts about what might happen if a group of different edges ended up in wildly different key spaces within a single core - such that a core might have neurons with ID's , eg., 0x34560000-0x34560009, 0x11221140-0x112211BF, and 0x00001200-0x0000121F. Or some other such strange mixed mapping. This isn't an objection as such - just something to watch for.

— Reply to this email directly or view it on GitHub https://github.com/SpiNNakerManchester/sPyNNaker/issues/84.

rowleya commented 9 years ago

Note that an edge will still be representable as a single source-target object if this is required, but I don’t think we should get all tied to the graph representation.

I am happy to change the representation of the route – this was just a first go at a simple version. Having something more tree-like would be OK if it is easy enough to represent, both as a file structure and as a data structure (we are going to have to read the result from a file in general). Missing out the intermediates is fine, as long as it is obvious where you are going i.e. you can infer quickly the intermediate nodes.

I guess something like:

From: AlexRast [mailto:notifications@github.com] Sent: 20 May 2015 15:05 To: SpiNNakerManchester/sPyNNaker Cc: Andrew Rowley Subject: Re: [sPyNNaker] Change edge representation and change routing in PACMAN to remove dependence on keys (#84)

On 20/05/15 09:44, Andrew Rowley wrote:

In a recent discussion, it was decided that routing could be done without keys, and then the keys could be assigned after routing in such a way that makes compression trivial. To do this in PACMAN would require a change to what the routing requires as input, and what the routing produces as output, as well as how the routing is done internally.

Wholeheartedly agree. Routing SHOULD be done without keys, indeed. That sounds like a significant step forwards.

An additional requirement is for each edge to become multicast representative; i.e. it should have a source node and a list of destination nodes (instead of just a single destination node). This should make routing easier to process, and removes the need for a "SameKeyAsEdgeConstraint", as the edge can now be assigned a single key even though it goes to multiple destinations.

Hmm. I don't like that idea conceptually; an "edge" loses its graph significance. Better to call it a "subgraph" in my view - which in any case comes closer to saying what the router is doing; not routing edge by edge but subgraph by subgraph (much like P&R tools do in modern EDA suites)

The inputs for routing would become:

  1. Partitioned Graph
  2. Machine description
  3. Placements on the graph

The output would become: For each edge in the graph, a list of chips through which the edge traverses. This would include the starting chip and the end chips (as it is multicast). The list of chips could be ordered such that the first full path from source chip to one destination chip is listed, followed by a repeat of one of the chips representing a branch point, repeating this until all destination chips have been listed e.g. for an edge that goes from (0, 0) to (1, 2), (2, 3) and (3, 2) the list might be: (0, 0) (1, 1) (2, 2) <== First Branch (1, 2) (2, 2) <== Second Branch (2, 3) (2, 2) <== Third Branch (3, 2) to indicate a route that branches at (2, 2) in three directions (basically, any time a chip is listed a subsequent time for an edge, it means to return to that chip for a branch).

This representation seems overly cumbersome and prone to confusion. Can not the list of chips be hierarchical (a list of lists) - with each sublist being the tree from a particular branch point (which is the first entry in the sublist)? It also seems to me that even this could possibly be improved, because in fact all you really need is a list of branch points and terminals, not the intermediate chips. (This assumes that having been routed, no rearrangement of routes to aid key allocation is to be performed).

A final step would then have to be added to convert this list in to multicast routing tables, with an assignment of keys to be done at that time as well. This step would then have to take as input:

  1. Partitioned Graph
  2. Routing Output

and would produce as output:

  1. A series of multicast routing tables
  2. A set of key assignments for the edges

You will need a set of key assignments for the vertices, in fact, because these will have to be mapped down to population tables on each core. I also have some doubts about what might happen if a group of different edges ended up in wildly different key spaces within a single core - such that a core might have neurons with ID's , eg., 0x34560000-0x34560009, 0x11221140-0x112211BF, and 0x00001200-0x0000121F. Or some other such strange mixed mapping. This isn't an objection as such - just something to watch for.

— Reply to this email directly or view it on GitHub https://github.com/SpiNNakerManchester/sPyNNaker/issues/84.

— Reply to this email directly or view it on GitHubhttps://github.com/SpiNNakerManchester/sPyNNaker/issues/84#issuecomment-103898810.

alan-stokes commented 9 years ago

Howdi all,

so my thoughts on the matter are as follows:

  1. routing without keys: I concur, though the cavieat to this is instead of returning a list of chips a vertex routed though, how a collection of sets/routing tables where each entry is not a key, but a vertex identifier, you may also want the path as well, but my feeling is that your key/compression algorithums now become enwined, and thus your going to need to know the set of vertices on each routing table to make logical adjustments for, and your key allocator can become rather stimulated anneelingly, in that entries are adjusted adhockly to reduce entries if needed.
  2. edge to become multicast representative: even though i like the idea in principle, i actually think changing the represetnation is the wrong way to look at the problem. Heres the logic behind it:

A. mapping algorithums usually work on graphs, so having us with a strange represnetation of a graph may hinder us in using off the shelf algorithums. B. the interface of PyNN for edges is to build a projection, and we give it the same key constraint from the vertex through the use of a key constraint, this means we have the power to add this multicast data into the graph as metadata aka, when the edge is added to the graph, it adds to anther dict inside the graph with [source_vertex]=dict[KeyAllocatorSameKeysConstraint]=list(dest_vertex's), giving the graph a find_destinations_from_source(vertex) which returns the dictonary. This gives you the equiv of the multicast edges and subgraphs in one call.

I note that here the graph assumes the vertex AbstractProvidesOutgoingEdgeConstraints. But as every vertex of all our applications, currently does, im rather willing to keep that assumption.

C. The only time we've experienced a vertex which has had to have more than 1 set of outgoing key'd edges has been the command sender. Even the reverse_ip_tag_multi-cast_source keeps to one multi-cast edge outwards. Therefore rejigging the entire graph structure just to hold some extra meta-data seems like a lot of work for one special vertex. This means that the common output from this find_destinations_from_source(vertex) would be a dict with one entry, but allows future proofing (i could have made the heat demo use a seperate key for live output).

  1. For each edge in the graph, a list of chips through which the edge traverses.: I really dont like this approach, theres a reason we removed the path object from the router in the first place, it was always unuseful on its own, you always needed the routing tables to make any real use of it, apart from reporting purposes (which we found was problematic as we kept forgetting to update the path object). If we adjust what the "routing table" output of the router is, then i think we may be on a more solid ground, or at least, have the path and the proposed "routing table" structure as output.

    The summary im trying to get, is that we dont want to redo work to generate structures that have to be generated in previous code to work. aka the routing table strucutre would likely be needed to make routing bandwidth/congestion aware etc, therefore throwing it away to be rebuilt later seems wastful, the same can be said of the multi-cast edge, it can be built using our current strucutre with just some extra work on the graphs internal workings.

anyhow thats my thoughts. Alan

mossblaser commented 9 years ago

@rowleya: An additional requirement is for each edge to become multicast representative; i.e. it should have a source node and a list of destination nodes (instead of just a single destination node).

Agreed.

Incidentally, it is often the case in many models that you really have N:M connections (i.e. multiple sources multicast to multiple destinations) when you look at the top-level problem. The catch is that routes in SpiNNaker can't be N:M in the general case. See the top of project-rig/rig#153 for a more detailed description of this problem with pictures. Though N:M nets can always be decomposed into N individual 1:M nets, doing so (in principle) discards information which a placer and router may be able to use to its advantage.

My gut feeling is that though N:M nets are nice-to-have, the added complexity is not worth it in a first-cut of this spec. Does anybody else have any strong thoughts to the contrary?

@AlexRast: Hmm. I don't like that idea conceptually; an "edge" loses its graph significance.

In the chip-design place & route space the type of graph where you have connections from single vertices to multiple other vertices is frequently referred to as a hyper-graph or a netlist in most of the papers I've read. Further, people throw around the terms "edge", "hyper-edge" and "net" interchangeably, all meaning single-source, multiple-sink connections.

As Alex points out, we are at risk of further overloading the word "Edge" here. From looking at codebases for academic place & route tools, "net" appears a popular term to use. This lacks the additional connotations that 'subgraph' has. I'd propose using the term "net" to describe this type of connection.

@AlexRast: This representation seems overly cumbersome and prone to confusion. Can not the list of chips be hierarchical (a list of lists) - with each sublist being the tree from a particular branch point (which is the first entry in the sublist)?

I heartily agree with this. Rig uses such a tree structure and from experience it matches the problem very well. Specifically, when thinking about routes in SpiNNaker, they are (necessarily) a tree; anything else is not a valid route. A trivial tree-traversal algorithm can be used to generate routing table entries from such trees (skipping default routes as you go). Further I've found whipping up quick functions to calculate metrics about route quality has been very easy for this structure. It also works really well when generating place-and-route diagrams.

As an additional point; it is not sufficient to simply have each node in the tree be an (x, y) coordinate since in small SpiNNaker systems it is ambiguous as to which link should be used between a pair of chips. As a concrete example, in a 2x2 system, to get from (0, 0) to (1, 1) you can go either North-East or South-West. To solve this problem you can simply label each branch in the tree with the link to be traversed.

@AlexRast: It also seems to me that even this could possibly be improved, because in fact all you really need is a list of branch points and terminals, not the intermediate chips. @alan-stokes: For each edge in the graph, a list of chips through which the edge traverses.: I really dont like this approach, theres a reason we removed the path object from the router in the first place, it was always unuseful on its own, you always needed the routing tables to make any real use of it, apart from reporting purposes (which we found was problematic as we kept forgetting to update the path object).

Though I certainly see some of the motivation, I'd be in favour of keeping all hops of a route in the representation used rather than removing all but the corners. In particular:

Alan: I'm not convinced by your arguments here I'm afraid.

@alan-stokes: A. mapping algorithums usually work on graphs, so having us with a strange represnetation of a graph may hinder us in using off the shelf algorithums.

This may not be true elsewhere but certainly within chip/circuit place & route algorithms, the concept of single-source multiple-sink connections is universal.

@alan-stokes: this means we have the power to add this multicast data into the graph as metadata

But this information is data, not metadata though! Multiple-sink connections are a fundamental part of the problem description, especially for routers.

The only time we've experienced a vertex which has had to have more than 1 set of outgoing key'd edges has been the command sender.

Perhaps I'm misinterpreting your statement but, as I understand, SpiNNaker is designed around the idea of multicast messages forming the basis of most communication. To give a few concrete examples.

Say you have a neural network with population A being all-to-all connected to neurons in population B:

A ====> B

If A gets "partitioned" into one core worth of work and B into 3, you get a netlist/hypergraph like so:

a0 ---------> b0
      \
       -----> b1
        \
         ---> b2

Here you have a single connection from a0 which goes to b0, b1 and b2 which would logically be represented by a net/hyper-edge with a single source and multiple sinks. Breaking this into three separate 1:1 connections results in up to 3x as much network resource being used for the same connectivity.

As I understood, this was a common pattern in PyNN (?) and it is certainly very common in Nengo.

The summary im trying to get, is that we dont want to redo work to generate structures that have to be generated in previous code to work. aka the routing table strucutre would likely be needed to make routing bandwidth/congestion aware etc, therefore throwing it away to be rebuilt later seems wastful, the same can be said of the multi-cast edge, it can be built using our current strucutre with just some extra work on the graphs internal workings.

I think it is probably worth stating that the objective of the work proposed at the P&R workshop wasn't to redesign PACMAN's internals but rather to define a common (and high-level) method of sharing placement and routing problems and solutions. Some translation will be required by all parties and so internal implementation details should not be an issue here. Certainly, there's no pressure nor expectation that people should rewrite their software internals to match this interchange format. :)

alan-stokes commented 9 years ago

Hi Jonathan.

So a few things to wrap up your points.

@rowleya: An additional requirement is for each edge to become multicast representative; i.e. it >should have a source node and a list of destination nodes (instead of just a single destination >node). Agreed.

Incidentally, it is often the case in many models that you really have N:M connections (i.e. multiple >sources multicast to multiple destinations) when you look at the top-level problem. The catch is >that routes in SpiNNaker can't be N:M in the general case. See the top of project-rig/rig#153 for a >more detailed description of this problem with pictures. Though N:M nets can always be >decomposed into N individual 1:M nets, doing so (in principle) discards information which a placer >and router may be able to use to its advantage.

My gut feeling is that though N:M nets are nice-to-have, the added complexity is not worth it in a >first-cut of this spec. Does anybody else have any strong thoughts to the contrary?

I'm of the mind that having the collection of 1:M (which is what we currently have) is suitable for what we're aiming for, and it should be possible to backwards discover the N:M from the data structures if such an algorithum requires it.

@AlexRast: Hmm. I don't like that idea conceptually; an "edge" loses its graph significance. In the chip-design place & route space the type of graph where you have connections from single >vertices to multiple other vertices is frequently referred to as a hyper-graph or a netlist in most of >the papers I've read. Further, people throw around the terms "edge", "hyper-edge" and "net" >interchangeably, all meaning single-source, multiple-sink connections. As Alex points out, we are at risk of further overloading the word "Edge" here. From looking at >codebases for academic place & route tools, "net" appears a popular term to use. This lacks the >additional connotations that 'subgraph' has. I'd propose using the term "net" to describe this type >of connection.

Myself, and Rowley have done a bunch of searches on graph theory to discover what the offical term is for the equiv of a multi-edge, and we came to the conculsion that it just doesnt exist. This is because the atomic parts of a graph are nodes and edges, which means the multi-edge has to be represeted as a set of edges. This said, the set of outgoing edges is a set, and the multi-edge is actually a partition of said set. Having had a chat with IMG people (who use graphs all the time to solve a mutlitude of problems) the best support for this is a partition of the set into mutal subsets, which ive called out_going_edge_partitions in the tool chain. These changes can be found in the #Mapping_interface_changes branches spread over spynnaker, front end common, pacman, external devices. Using this has cleaned up the code rather well, and it a very usful tool (I accept this whole heartily).

@AlexRast: This representation seems overly cumbersome and prone to confusion. Can not the list of chips be hierarchical (a list of lists) - with each sublist being the tree from a particular branch point (which is the first entry in the sublist)? I heartily agree with this. Rig uses such a tree structure and from experience it matches the >problem very well. Specifically, when thinking about routes in SpiNNaker, they are (necessarily) a >tree; anything else is not a valid route. A trivial tree-traversal algorithm can be used to generate >routing table entries from such trees (skipping default routes as you go). Further I've found >whipping up quick functions to calculate metrics about route quality has been very easy for this >structure. It also works really well when generating place-and-route diagrams.

As an additional point; it is not sufficient to simply have each node in the tree be an (x, y) >coordinate since in small SpiNNaker systems it is ambiguous as to which link should be used >between a pair of chips. As a concrete example, in a 2x2 system, to get from (0, 0) to (1, 1) you >can go either North-East or South-West. To solve this problem you can simply label each branch in >the tree with the link to be traversed.

the routing paths that are within the branches descirbed above still work off basic edges, and so they are not tree's. This is mainly due to our router only working off edges. Maybe theres scope here for changes, but you'd need to discuss it with me in person to sell it to me. Though maybe you can use the new outgoingedge partitions to get your tree?

@AlexRast: It also seems to me that even this could possibly be improved, because in fact all you really need is a list of branch points and terminals, not the intermediate chips. @alan-stokes: For each edge in the graph, a list of chips through which the edge traverses.: I >really dont like this approach, theres a reason we removed the path object from the router in the >first place, it was always unuseful on its own, you always needed the routing tables to make any >real use of it, apart from reporting purposes (which we found was problematic as we kept >forgetting to update the path object).

Though I certainly see some of the motivation, I'd be in favour of keeping all hops of a route in the >representation used rather than removing all but the corners. In particular:

It is only a last-minute step in only some algorithms (certain types of routing table generation) >which benefits from removal of intermediate hops. Other useful tasks e.g. static analysis of network congestion would have to undo this. Congestion awareness in routers may well result in less direct paths in many cases. Concretely: The Southampton routing table minimiser requires every hop have a routing entry >as the starting point. It is very easy to verify the correctness of routes when all steps are present. It is very easy to remove the straight-line segments when actually needed.

having written the reports for the routing paths object from the branch discussed above yesterday, i can tell you that losing the router to router context would make reproting harder, and would have made the key alloc and router table generation much more diffcult, as default routing is not a standard behaviour in routing, and so keeping the hop per hop process will allow much more general algorithums to be applied to this problem. Thus I vote to keep hop to hop, and the current impl still keeps this.

Alan: I'm not convinced by your arguments here I'm afraid.

@alan-stokes: A. mapping algorithums usually work on graphs, so having us with a strange >represnetation of a graph may hinder us in using off the shelf algorithums.

This may not be true elsewhere but certainly within chip/circuit place & route algorithms, the >concept of single-source multiple-sink connections is universal.

As indicated a few times already, chip design is a completely different kettle of fish and decided to go away from formal graph theory represnetations for their own reasons. As we have our own strong reasons for keeping to a graph representation, I'd believe that keeping this assumption will make life easier for us, esp as at least the concepts you've currently asking for are easily avilable as meta data concepts above the atomics of a graph, and we can support these reasonabilly easily (as shown in the forementioned branch).

But this information is data, not metadata though! Multiple-sink connections are a fundamental part >of the problem description, especially for routers.

The only time we've experienced a vertex which has had to have more than 1 set of outgoing >key'd edges has been the command sender.

Perhaps I'm misinterpreting your statement but, as I understand, SpiNNaker is designed around >the idea of multicast messages forming the basis of most communication. To give a few concrete >examples.

Say you have a neural network with population A being all-to-all connected to neurons in >population B:

A ====> B

If A gets "partitioned" into one core worth of work and B into 3, you get a netlist/hypergraph like so:

a0 ---------> b0 \ -----> b1 \ ---> b2

Here you have a single connection from a0 which goes to b0, b1 and b2 which would logically be >represented by a net/hyper-edge with a single source and multiple sinks. Breaking this into three >separate 1:1 connections results in up to 3x as much network resource being used for the same >connectivity.

As I understood, this was a common pattern in PyNN (?) and it is certainly very common in Nengo.

The summary im trying to get, is that we dont want to redo work to generate structures that >have to be generated in previous code to work. aka the routing table strucutre would likely be >needed to make routing bandwidth/congestion aware etc, therefore throwing it away to be rebuilt >later seems wastful, the same can be said of the multi-cast edge, it can be built using our current >strucutre with just some extra work on the graphs internal workings.

So I think your mixing problem space with machine space here. The problem space is described in a graph, which has atomic objects of nodes and edges. This means anything above these concepts is metadata (albeit metadata which can be generated from the basic data struct, but is not apart of the data struct itself). Now the forementioned branch breaks this logic slightly by desciding to push semantics (out-going-edge-partition-id) into the graph objects, but given the power it gives us, myself and Rowley are happy to give this consideration, and as its still within formal graph theory, its not like we're reinventing the wheel.

your example would start as 3 seperate edges, but the router knows that all edges from a soruce can go down all ready travelled paths for its soruce, and so results in merging of router entries down stream. Now given the out-going-edges-partition, you could use that to get your same key based edges, as thats how we're using it currently. Im not arguing that its not a good thing, but that we shouldnt change the atomic objects to suit, we should make the metadata to support these requirements, such as the outgoing-edge-partitions.

I think it is probably worth stating that the objective of the work proposed at the P&R workshop >wasn't to redesign PACMAN's internals but rather to define a common (and high-level) method of >sharing placement and routing problems and solutions. Some translation will be required by all >parties and so internal implementation details should not be an issue here. Certainly, there's no >pressure nor expectation that people should rewrite their software internals to match this >interchange format. :)

The reason we went down the road of removing keys from router and generating them afterwards was becuase we could see the gains we could get rather cheaply by doing so, router compression being the key one. The fact that this coinsided with what people wanted, all the better. The outgoing-edge-parittions were also aimed partly for the support you guys wanted, but also because it cleaned up our own key alloc sigfincately and allowed the graph front end to generate its edges in a much cleaner way (having made that has given us a bunch of insights whcih we just hadnt had before building it). Im about to start looking at the common high-level method of sharing. The result from the workshop was the decision to use xml files as the communication fabric so I can tie our tool chain to start that framework for this now, and give you a xml schema which will then allow you guys to discuss and mull over something concrete and which works already, and move forward. Without any momentum, nothing will get done afterall.

AlexRast commented 9 years ago

Comments inlined with edited history.

On 24/06/15 09:20, Alan Stokes wrote:

Hi Jonathan.

So a few things to wrap up your points.

@rowleya <https://github.com/rowleya>: An additional requirement is for each
edge to become multicast representative;...

My gut feeling is that though N:M nets are nice-to-have, the added
complexity is not worth it in a >first-cut of this spec. Does anybody else
have any strong thoughts to the contrary?

I'm of the mind that having the collection of 1:M (which is what we currently have) is suitable for what we're aiming for, and it should be possible to backwards discover the N:M from the data structures if such an algorithum requires it.

1:M connections is almost certainly good enough. I also favour this method because it makes the descriptions much easier to use with databases.

@AlexRast <https://github.com/AlexRast>: Hmm. I don't like that idea
conceptually; an "edge" loses its graph
significance.

... From looking at >codebases for academic place & route tools, "net" appears a popular term to use. This lacks the >additional connotations that 'subgraph' has. I'd propose using the term "net" to describe this type >of connection.

Myself, and Rowley have done a bunch of searches on graph theory to discover what the offical term is for the equiv of a multi-edge, and we came to the conculsion that it just doesnt exist. This is because the atomic parts of a graph are nodes and edges, which means the multi-edge has to be represeted as a set of edges. This said, the set of outgoing edges is a set, and the multi-edge is actually a partition of said set.

I think this is right; I don't like the idea of tossing away all of graph theory which gives us nice formal models that we can use and going for an ad-hoc approach which has to rely too much on heuristics. In the EDA/P&R space the latter is what they've gone for, for various reasons but the key point there is that they are dealing with fixed wire links rather than routes in a topology. For the EDA world it's awkward to represent a branch point in a route as a node, but for SpiNNaker, by definition each branch point is a node.

@AlexRast <https://github.com/AlexRast>: This representation seems overly
cumbersome and prone to confusion. Can not the
list of chips be hierarchical (a list of lists) - with each sublist being the
tree from a particular branch point (which is the first entry in the sublist)?
I heartily agree with this. Rig uses such a tree structure and from
experience it matches the >problem very well.

As an additional point; it is not sufficient to simply have each node in the
tree be an (x, y) >coordinate since in small SpiNNaker systems it is
ambiguous as to which link should be used >between a pair of chips.

the routing paths that are within the branches descirbed above still work off basic edges, and so they are not tree's. This is mainly due to our router only working off edges. Maybe theres scope here for changes, but you'd need to discuss it with me in person to sell it to me. Though maybe you can use the new outgoingedge partitions to get your tree?

I think the router should work with partitions. Indeed, this works very nicely, because you can then distribute the problem of deciding routing entries: each separate partition can be thought of as an independent process. I definitely think it should work using such partitions, arranged hierarchically, in fact, so that a partition is itself split into sub-partitions, and so on. By using partitions you vastly increase the ability of routers to coalesce common links into a single routing entry and thus save space.

I think it is probably worth stating that the objective of the work proposed
at the P&R workshop >wasn't to redesign PACMAN's internals but rather to
define a common (and high-level) method of >sharing placement and routing
problems and solutions. Some translation will be required by all >parties
and so internal implementation details should not be an issue here.
Certainly, there's no >pressure nor expectation that people should rewrite
their software internals to match this >interchange format. :)

The reason we went down the road of removing keys from router and generating them afterwards was becuase we could see the gains we could get rather cheaply by doing so, router compression being the key one.

It should be noted, that this is the result of a well-known maxim from physics: that the greatest efficiencies (and insights) are usually gained by an astute choice of representation; by clever juggling of the representation you can usually achieve far better compression and processing speed than any other form of problem optimisation.

There is a penalty to be paid, in this case: when keys are abstracted away from a population-specific format (you are essentially doing a dimensional transform to a new series of basis vectors for representing neurons, which may not coincide evenly with populations), it becomes more difficult for external applications which rely on processing specific subsets of neurons to be able to determine which neurons to process. One way to fix this is to have a specific type of population for talking to external processes which creates a projection from any desired subset of neurons to the subset required for the external application (which is, itself, just another address remapping) but we don't have that type yet. I have been working on this intermittently for some time and it does now sound as though that could become increasingly useful. That doesn't solve every problem though: if for instance you want to identify suspicious subgraphs of your network that may be causing the model to fail, either because of hardware limitation (e.g. spike loss) or because of model limitations (e.g. an inaccurate biological model for some key submodule) this problem could become VERY difficult to analyse (because you're not always going to know WHICH subset is acting suspiciously).

— Reply to this email directly or view it on GitHub https://github.com/SpiNNakerManchester/sPyNNaker/issues/84#issuecomment-114778710.

alan-stokes commented 9 years ago

completed and awaiting merging

rowleya commented 8 years ago

This has been merged.