More flexible monitoring

thesamovar commented 10 years ago

We should consider whether or not there is a syntax for monitors that covers all the different types in Brian 1 in just a few classes that would work well with standalone, etc.

In particular, we might want to include the functionality of StateSpikeMonitor from Brian 1 into SpikeMonitor (or EventMonitor depending on how we handle multiple NeuronGroup events).

mstimberg commented 10 years ago

The current situation is the following:

Brian1

SpikeMonitor, SpikeCounter, PopulationSpikeCounter, StateMonitor, StateSpikeMonitor, MultiStateMonitor, RecentStateMonitor, AERSpikeMonitor, FileSpikeMonitor, ISIHistogramMonitor, PopulationRateMonitor, VanRossumMetric, CoincidenceCounter, StateHistogramMonitor

Brian2

StateMonitor (includes the functionality of MultiStateMonitor but does not record mean and variance for every cell in the group as in Brian1)
SpikeMonitor (does not currently allow specifying a custom recording function)
PopulationRateMonitor (does not allow for specifying a bin size)

Let's first think about it from the user's perspective and ignore the implementation side (many of those classes do not need much implementation anyway, e.g. a SpikeCounter is basically Synapses(G, G, 'counter:1', pre='counter+=1', connect='i==j')).

Some random observations/remarks:

I don't think there's a need for PopulationSpikeCounter if we already have a SpikeCounter
Maybe we should try to classify the monitors in general categories: SpikeMonitor and StateSpikeMonitor are recording discrete events, separately for each neuron. The data structure is therefore a list of tuples. StateMonitor and most other monitors are recording at every timestep, in the same way for all (recorded) neurons, the data structures are dynamic arrays of the same length. For the latter class, there are monitors that store a 1:1 mapping (e.g. StateMonitor), others that use a N:1 mapping (e.g. PopulationRateMonitor) and others doing N:M mappings (e.g. StateHistogramMonitor). Finally, most monitors look at single time steps in isolation but some need a temporal context (e.g. ISIHistogramMonitor)
We should have some general mechanism for switching the "storage" of the monitored values. The default option would be "in memory" but FileSpikeMonitor and AERSpikeMonitor would be implemented via this mechanism. Maybe we could also make the default storage a device property, so standalone would default to writing to disk? But then, we need specific templates for standalone anyway, so maybe there's not much need for this. The storage option might also be "print to screen" (or some fancy interactive visualization)?
The last point makes me wonder whether we should refactor monitors so that the actual recording is separate from the storage.
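
To make the 1:1, N:1 and N:M distinction concrete, here is a minimal plain-Python sketch of what each kind of monitor would compute from a single time step. It is not Brian code: the function names, the toy values and the bin edges are all made up for illustration.

```python
# Toy illustration of the three mapping types for per-timestep monitors.
# Plain Python, not Brian code; all names here are hypothetical.

def record_one_to_one(values):
    """1:1 (StateMonitor-like): one stored value per recorded neuron."""
    return list(values)

def record_n_to_one(spiked, dt):
    """N:1 (PopulationRateMonitor-like): reduce N neurons to one number."""
    return sum(spiked) / (len(spiked) * dt)

def record_n_to_m(values, edges):
    """N:M (StateHistogramMonitor-like): N neurons mapped onto M bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(counts)):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1
                break
    return counts

v = [0.1, 0.4, 0.45, 0.9]            # state values of 4 neurons
spiked = [False, False, True, True]  # which neurons spiked this step

one_to_one = record_one_to_one(v)
rate = record_n_to_one(spiked, dt=0.5)
hist = record_n_to_m(v, edges=[0.0, 0.5, 1.0])
```

In all three cases the amount of data stored per time step is fixed in advance, which is what distinguishes them from the event-based monitors.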

thesamovar commented 10 years ago

For the classification, I agree with your division into events/continuous, and the subdivision of continuous into different mappings 1:1, N:1, N:M.

Also like your idea of different types of storage. This potentially simplifies and generalises things: ideal.

I don't see how we can separate recording from storage though. Unless you're thinking that recording makes a temporary copy into memory, and then storage takes that representation and puts it in a more permanent data structure (memory/file). We could do that, but I'm not sure how much it gains us? Let's see about that when we have a clearer design from the user perspective maybe.

So do you want to work on a general design? I don't have any strong feelings about how this should work so I'm pretty open to any possibilities. Some thoughts below.

For discrete event monitors, I see basically two types of operations we might want to consider: create and modify variables in response to events (this could be implemented by Synapses, or even simpler if we are doing arbitrary event types anyway then it would be done through this); store tuples of values. So a simple syntax would be:

EventMonitor(source, eqs=None, pre=None, store=None)
# example for SpikeMonitor
EventMonitor(G, store='(i, t)') # simple but no named access to stored variables
# alternative syntax for SpikeMonitor
EventMonitor(G, store={'i': 'i', 't':'t'}) # flexible but ugly
# another one
EventMonitor(G, store=('i', 't')) # less flexible but simpler, I think I'd go with this one
# example for SpikeCounter
EventMonitor(G, eqs='counter:1', pre='counter+=1')

For continuous monitoring it's more complicated. For 1:1 it's straightforward, you can just write expressions in standard codegen syntax. But for other mappings the current codegen syntax doesn't apply so we would need to create a whole new syntax. Do we want to do this? Is it worth creating a complex framework or do we just provide a couple of basic ones like sum? I don't have a good idea here.

For storage, I suggest we just have a storage='string value' argument that gets passed to the template?

mstimberg commented 10 years ago

I like the store keyword in the last variant, this would also allow for the straightforward StateSpikeMonitor:

EventMonitor(G, store=('i', 't', 'v'))

alternatively, we could reduce the use of strings somewhat and use object references instead:

EventMonitor(G, store=(G.i, G.t, G.v))

but I don't see much use for that. Maybe it makes sense to make the syntax a bit more consistent with StateMonitor by using variables instead of store? I wonder whether we need the SpikeCounter at all. According to the docs (but this is actually not implemented), setting record=False should only record the spike counts.

Maybe we should have a more straightforward, simple system:

You don't have to specify i and t for EventMonitor -- this is a "label" that is recorded for every event
Recording of every event (or timestep for StateMonitor) is switched off by setting record to False, in this case only summary statistics are recorded, a count for the labels and the mean (and variance?) for every additional variable.

Some examples for this system:

# SpikeMonitor
EventMonitor(G)
# SpikeCounter
EventMonitor(G, record=False)
# StateSpikeMonitor
EventMonitor(G, variables=['v'])
# StateSpikeMonitor only storing the average membrane potential at the time of the spike
EventMonitor(G, variables=['v'], record=False)
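
As a rough sketch of the semantics proposed above (not Brian code, and not an actual implementation), a toy class in plain Python could look as follows. The class name ToyEventMonitor and its on_event method are hypothetical; only the variables and record keywords come from the proposal. For simplicity the summary statistics are updated regardless of the record setting.

```python
# Toy stand-in for the proposed semantics, in plain Python (not Brian code).
class ToyEventMonitor:
    def __init__(self, n_neurons, variables=(), record=True):
        self.variables = tuple(variables)
        self.record = record
        self.count = [0] * n_neurons   # number of events per neuron
        self.mean = {name: 0.0 for name in self.variables}
        self.n_events = 0
        self.events = []               # (i, t, *values) tuples if record=True

    def on_event(self, i, t, **state):
        values = [state[name] for name in self.variables]
        self.count[i] += 1
        self.n_events += 1
        for name, value in zip(self.variables, values):
            # incremental mean: the individual samples are not kept
            self.mean[name] += (value - self.mean[name]) / self.n_events
        if self.record:
            self.events.append((i, t, *values))

# "StateSpikeMonitor only storing the average membrane potential"
mon = ToyEventMonitor(2, variables=['v'], record=False)
mon.on_event(0, 0.001, v=1.0)
mon.on_event(1, 0.002, v=2.0)
mon.on_event(0, 0.003, v=3.0)
```

With record=False the events list stays empty and only the counts and the running mean are available afterwards.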

All of this is triggered on spikes. If we have additional events, we could add an event keyword to trigger recording on other events. On the other hand, that means we have to store events in the same way we store spikes, which would not be necessary if we have a merged threshold/reset codeobject for general events. Alternatively, EventMonitor could have its own threshold/event keyword which allows specifying a condition for recording. This has the advantage that it allows recording triggered on events that don't change anything for the neuronal dynamics, e.g. you could record the membrane potential whenever it is close to the threshold without having to change the NeuronGroup. The disadvantage of this mechanism is that we'd need two EventMonitor templates, one based on the _spikespace array, one on arbitrary threshold conditions.

Another nice addition would be an interval keyword which would allow recording the membrane potential e.g. 2ms after a spike. We'd need some additional label marking which recorded entries are triggered by the event itself and which are only recorded because of the interval argument. Thinking about it, it would be even nicer to have an interval that can start before the event. The dynamics of the membrane potential, synaptic conductances, etc. just before a spike are of quite some interest -- we don't have any mechanism for that in brian1, do we? We would have to continuously record into a circular array and then copy things over to the storage structure -- not trivial but certainly doable.
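
The pre-event interval idea could be sketched roughly like this in plain Python (not Brian code). The class name PreEventRecorder and its attributes are hypothetical, and the sketch handles a single recorded value per time step: every step writes into a circular array, and on an event the window is copied out in chronological order.

```python
# Sketch of pre-event recording with a circular array (plain Python, not Brian code).
class PreEventRecorder:
    def __init__(self, n_steps):
        self.n_steps = n_steps
        self.buffer = [None] * n_steps   # circular array of recent values
        self.pos = 0                     # index of the next slot to overwrite
        self.filled = 0                  # number of valid entries so far
        self.snippets = []               # permanent storage, one list per event

    def step(self, value, event):
        self.buffer[self.pos] = value
        self.pos = (self.pos + 1) % self.n_steps
        self.filled = min(self.filled + 1, self.n_steps)
        if event:
            # copy the window out in chronological order (oldest first)
            start = (self.pos - self.filled) % self.n_steps
            window = [self.buffer[(start + k) % self.n_steps]
                      for k in range(self.filled)]
            self.snippets.append(window)

rec = PreEventRecorder(n_steps=3)
trace = [0.0, 0.2, 0.5, 0.9, 0.1, 0.3]
for k, v in enumerate(trace):
    rec.step(v, event=(k == 3))   # pretend the spike happens at step 3
```

Here the copied window ends with the value at the event step itself; whether that sample belongs to the window is a design choice the discussion leaves open.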

For storage, I suggest we just have a storage='string value' argument that gets passed to the template?

Maybe we could rather implement it via the Function mechanism? I.e. the template calls a storage function in some standard format that can be implemented in different ways? I think this has a couple of advantages: the template is less complicated, we can add new storage without changing the templates and we get a nice error message if the mechanism is not implemented for a target.
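
A minimal sketch of that idea in plain Python (not Brian's Function mechanism, just the general shape): the recording code only knows the call signature store(i, t), and different storage backends implement it. All names below are hypothetical.

```python
import io

def make_memory_storage():
    """Storage backend that keeps (i, t) pairs in a list."""
    rows = []
    def store(i, t):
        rows.append((i, t))
    return store, rows

def make_text_storage(stream):
    """Storage backend that writes one tab-separated line per event."""
    def store(i, t):
        stream.write(f"{i}\t{t}\n")
    return store

def record_spikes(spikes, store):
    """Stands in for the monitor template: it only knows the call signature."""
    for i, t in spikes:
        store(i, t)

spikes = [(0, 0.001), (2, 0.001), (1, 0.003)]

mem_store, rows = make_memory_storage()
record_spikes(spikes, mem_store)

stream = io.StringIO()   # a real file object would work the same way
record_spikes(spikes, make_text_storage(stream))
text = stream.getvalue()
```

The recording loop is identical for both backends, which is the point of routing storage through a function rather than through a template argument.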

This is already quite long, I'll add some thoughts on the continuous recording later (maybe we should create a separate issue for those monitors?).

thesamovar commented 10 years ago

For the first half of your suggestion I think it more or less boils down to having a more flexible StateSpikeMonitor, right? i.e. we would get rid of the eqs and pre bit and just keep the variables (or store) keyword? We would also add counting and mean/std as standard. This makes the design simpler but at a slight loss of flexibility. On the other hand, maybe you don't need that flexibility because you can always do it by hand if you want to (by adding variables to the group, etc.). I'd be happy with that I think.

For EventMonitor defining its own events we wouldn't need two templates, we'd just add that event type to the NeuronGroup, it would just be syntactic sugar for creating the event in the NeuronGroup. We should also have SpikeMonitor as syntactic sugar for EventMonitor(event='spike').

The interval ideas are interesting and could indeed be quite useful for people doing spike triggered averages for example. I'm not sure how much effort it's worth putting into that. It's a nice feature, but as you say it's potentially a little complicated to implement. We had something a bit similar in Brian 1, the RecentStateMonitor which always has a copy of the last 5ms (or whatever) of a state variable you choose, and the implementation used a circular array (or cylindrical array depending on your point of view). Maybe if it's not too difficult to implement it's worth doing but otherwise not? Implementing RecentStateMonitor would be pretty straightforward since we just create a 2D array and a current time pointer. The question of implementation difficulty is not how difficult it would be to write in one case, but whether or not it overcomplicates the design of everything to try to include it, and how easily it fits with codegen/standalone.

Happy to implement storage via the function mechanism and to create separate issues for discrete and continuous monitor types.

mstimberg commented 10 years ago

For the first half of your suggestion I think it more or less boils down to having a more flexible StateSpikeMonitor, right?

Yes, in this framework I see SpikeMonitor simply as a StateSpikeMonitor that is not recording any state variable.

This makes the design simpler but at a slight loss of flexibility. On the other hand, maybe you don't need that flexibility because you can always do it by hand if you want to (by adding variables to the group, etc.). I'd be happy with that I think.

Yes, we have to decide on some flexibility-simplicity tradeoff. I think with the full eqs and pre syntax, we might be duplicating a bit too much of the NeuronGroup/Synapses functionality and the distinction between having a state variable in the monitor and a state variable in the neuron group will not be very straightforward. I could think of one more functionality (but we can also add this at a later point, of course) that would buy us some more flexibility: instead of fixing the summary statistics to mean/std, we could make this configurable:

EventMonitor(G, variables=['v'], record=False, summary=['mean', 'std', 'min', 'max'])

with summary=['mean', 'std'] (or variance) as the default. And we would use the same keyword/semantics for StateMonitor.

For EventMonitor defining its own events we wouldn't need two templates, we'd just add that event type to the NeuronGroup, it would just be syntactic sugar for creating the event in the NeuronGroup.

This then necessarily means we go for the event implementation that stores events. But I guess performance-wise this is quite negligible anyway so it doesn't hurt. I'm not sure whether it's a good idea for the monitor to add something to the NeuronGroup object, though. I think as a general principle, an object should be responsible for defining all of its own properties. We might also encounter some subtle problems, e.g. what happens if a monitor that added an event is no longer used, etc. The only advantage that I'm seeing for your proposal (and my original two template approach) is that you can add something easily to a NeuronGroup that you can't/don't want to change (e.g. one that is returned from a function). But even that wouldn't really matter if NeuronGroup has an add_event method (which we would need anyway for your approach). My feeling is to use the simplest solution and to only allow connecting to existing events (well, we first need to implement #105 anyway :) ). Syntax-wise, both proposals are reasonably simple:

# EventMonitor adds the event
G = NeuronGroup(..., threshold='v>v_t', reset='v=0')
close_to_threshold_mon = EventMonitor(G, event='v>v_t-0.1 and v<=v_t', variables=['v'])
# Event defined in NeuronGroup
G = NeuronGroup(..., threshold='v>v_t', reset='v=0',
                events={'close_to_threshold': ('v>v_t-0.1 and v<=v_t', '')})
close_to_threshold_mon = EventMonitor(G, event='close_to_threshold', variables=['v'])

We should also have SpikeMonitor as syntactic sugar for EventMonitor(event='spike').

Fully agree.

I agree that the interval thing might be a bit too complicated (at least for now). About RecentStateMonitor: that wasn't really used for monitoring as such but rather for use in network operations, etc., right? For me, this monitor is not really that important but the more event-based interval recording would be a really neat feature, I guess.

thesamovar commented 10 years ago

Looks like we're getting close to agreement. My feeling is that maybe for now we should:

Implement arbitrary events #105, since we might make decisions there that will change what we do here
Implement EventMonitor with the variables and record keywords, but not the summary initially. We can add the summary statistics later, and we might want to think a bit more about them, because they are reductions and we might have better thoughts on this after we've thought about continuous monitoring.
Use only pre-defined events initially (i.e. events defined by the group, as you suggested), that's always something we can come back to.
Do not worry about intervals for the moment, that's also something we can come back to later.

Would you agree with that?

mstimberg commented 10 years ago

Sounds like a very good plan to me.

And I agree that we want to put some more thought into the summary stuff, there's a wide spectrum of possibilities from allowing a small set of explicit reductions (mean, max, etc.) up to some complex string based expressions (e.g. 'sqrt(mean(x**2))') which would have to be transformed into an "online" update statement.
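
As an illustration of what such an "online" update could look like for 'sqrt(mean(x**2))', here is a hand-written plain-Python sketch (not Brian code; the class name OnlineRMS is hypothetical, and no automatic transformation from the string expression is attempted):

```python
import math

class OnlineRMS:
    """Running value of sqrt(mean(x**2)) without storing the samples."""
    def __init__(self):
        self.n = 0
        self.mean_sq = 0.0

    def update(self, x):
        self.n += 1
        # online update statement for mean(x**2)
        self.mean_sq += (x * x - self.mean_sq) / self.n

    @property
    def value(self):
        return math.sqrt(self.mean_sq)

rms = OnlineRMS()
samples = [3.0, 4.0, 0.0, 5.0]
for x in samples:
    rms.update(x)

# reference value computed from all samples at once
batch = math.sqrt(sum(x * x for x in samples) / len(samples))
```

The inner reduction (the mean) is what has to be updated at every step; the outer sqrt only needs to be applied when the value is read.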

thesamovar commented 10 years ago

Great. Should we close this issue and open two new ones for discrete and continuous monitoring?

mstimberg commented 10 years ago

Great. Should we close this issue and open two new ones for discrete and continuous monitoring?

Yes, let's do that.

thesamovar commented 10 years ago

Also meant to say it might be worth looking at Theano, numexpr and pycuda on the subject of reductions, since they all handle them. I think numexpr goes for the simplest solution (fixed set) and pycuda the most general.

mstimberg commented 10 years ago

Maybe let's open another issue on the topic of reductions; this will also be relevant for doing reductions over the population, e.g. to have a PopulationStateMonitor that allows for recording the average/maximum/etc. membrane potential in the population at every timestep.

thesamovar commented 10 years ago

OK you want to open the new issues or shall I? (Don't want to do it simultaneously.)

On the subject of reductions, is there potentially a case for supporting them in non-monitoring code? For example, there was a recent message on the Brian list where someone wanted to include a reduction in the equations. My feeling is no, but worth considering?

mstimberg commented 10 years ago

OK you want to open the new issues or shall I? (Don't want to do it simultaneously.)

Go ahead, I might already be off soon.

There might be some use cases for reductions like the one on the list, e.g. normalizing over all incoming synapses. But I don't really have a good idea how this could be combined with our standard code string syntax. I think for the moment we should only think of simple cases (such as a population monitor) where no confusion can arise whether code is calculated on a per-neuron (or per-synapse) basis or a reduction.

thesamovar commented 10 years ago

OK new issues opened, closing this one.

brian-team / brian2

More flexible monitoring #168