On Routing, Data Specification and Graph Edge Filtering

rowleya commented 6 years ago

We have observed that in neural network applications (specifically, at this point), it might make sense to filter the graph after the data has been generated. This could work as follows:

1) Do partitioning, placement, routing and key allocation without any edge filtering or compression.
2) Do data specification generation and execution.
3) Get any information that might help with the filtering of edges (e.g. if additional data generation is done on the machine, you might have to retrieve a connection count for each edge).
4) Filter edges from the graph, removing them from the routing tables as you go. Note that this might result in vertices which used to have outgoing edges but now don't have any, and so should not transmit any keys.
5) Do routing compression and load the routes.
6) Load the binaries (but leave them paused, i.e. don't do sync0).
7) Send a message to those vertices which shouldn't send keys any more.
8) Start the application as usual.
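As a rough illustration of steps 3, 4 and 7, the sketch below filters edges using connection counts read back after data generation and works out which vertices no longer have any outgoing edges. The data structures are hypothetical stand-ins (plain tuples and dicts), not the SpiNNFrontEndCommon graph or routing table classes, and the function name `filter_empty_edges` is made up for this example.

```python
def filter_empty_edges(edges, connection_counts, routes):
    """Drop edges known to carry no connections.

    edges: list of (source, target) pairs (hypothetical stand-in for machine edges)
    connection_counts: dict of edge -> connection count read back after data generation
    routes: dict of edge -> routing entries (treated as opaque here)

    Returns the kept edges, the kept routes, and the set of source vertices
    that had outgoing edges before filtering but have none afterwards.
    """
    # Assumption: an edge with no recorded count is kept, since we cannot
    # prove it is empty.
    kept_edges = [e for e in edges if connection_counts.get(e) != 0]
    kept_routes = {e: routes[e] for e in kept_edges if e in routes}

    had_outgoing = {source for source, _ in edges}
    has_outgoing = {source for source, _ in kept_edges}
    silenced = had_outgoing - has_outgoing
    return kept_edges, kept_routes, silenced


edges = [("A", "B"), ("A", "C"), ("B", "C")]
counts = {("A", "B"): 12, ("A", "C"): 0, ("B", "C"): 0}
routes = {e: ["route for %s->%s" % e] for e in edges}

kept_edges, kept_routes, silenced = filter_empty_edges(edges, counts, routes)
print(kept_edges)        # [('A', 'B')]
print(sorted(silenced))  # ['B']
```

Here vertex B is the one that would need the "stop sending keys" message in step 7, and that message would have to be resent after any reload of data and binaries.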

Note that there is a contention here between data generation, at which point we tell the application that it has a key and should send it, and filtering, where we might change our minds. The above mechanism is a suggested way in which this can be solved. Note also that this would have to be tested with "reset", where data and binaries are reloaded; the message would have to be resent in this case to re-disable the sending of keys.

alan-stokes commented 6 years ago

This sounds overly complicated, and slow, compared to our current approach. You also fail to explain the use case where ours currently fails. Filtering at the earliest opportunity seems the sensible approach, and I don't get why we'd do otherwise.

rowleya commented 6 years ago

The missing information here is that in some graphs, the final sub-atom connectivity is not determined until after data generation, so this information is unknown at the point where filtering is currently done. Unfortunately, as usual, there is a (currently potential) circular dependency - filtering needs to be done before routing, which needs to be done before routing key generation (ideally, since key generation could take the route into account), which needs to be done before data generation, which needs to be done before filtering!
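The cycle can be made concrete with a small dependency check. This is only a sketch: the stage names are labels chosen for this example, and the "two pass" ordering is one reading of the proposal above (route without filtering first, then filter after data generation and compress). It uses the standard library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def try_order(deps):
    """Return a valid stage order, or None if the dependencies are circular.

    deps maps each stage to the set of stages that must run before it.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# Single pass: filtering wants the results of data generation, but
# everything leading to data generation wants filtering done first.
single_pass = {
    "routing": {"filtering"},
    "key_allocation": {"routing"},
    "data_generation": {"key_allocation"},
    "filtering": {"data_generation"},
}

# Two passes: routing no longer waits for filtering; a late filtering
# stage runs after data generation, followed by route compression.
two_pass = {
    "routing": set(),
    "key_allocation": {"routing"},
    "data_generation": {"key_allocation"},
    "late_filtering": {"data_generation"},
    "route_compression": {"late_filtering"},
}

print(try_order(single_pass))  # None
print(try_order(two_pass))
```

The cost of breaking the cycle this way is that keys have already been handed out by the time edges are removed, which is why some vertices then need to be told not to send.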

A specific use case is the generation of synapses in sPyNNaker. In more detail, when a Projection is given a low probability of neuron connectivity, some of the resulting machine edges end up with no actual synapses. These edges cannot be identified prior to data generation, so a second round of filtering would be ideal.
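To get a feel for how often this might happen, here is a back-of-the-envelope calculation. It assumes each pre/post neuron pair is connected independently with the same probability p (a fixed-probability style of connector); the slice sizes and probability are made-up numbers for illustration, not measurements.

```python
def prob_edge_empty(p, n_pre, n_post):
    """Probability that a machine edge carries no synapses at all.

    Assumes every one of the n_pre * n_post neuron pairs is connected
    independently with probability p.
    """
    return (1.0 - p) ** (n_pre * n_post)


# Hypothetical example: two 100-neuron slices, connection probability 0.0001.
p_empty = prob_edge_empty(0.0001, 100, 100)
print(round(p_empty, 2))
```

With these numbers, roughly 37% of such machine edges would have no synapses, yet each would still occupy routing table entries unless filtered after data generation.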

SpiNNakerManchester / SpiNNFrontEndCommon