Which all components should react to reset signal?

numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.

http://numenta.org/

GNU Affero General Public License v3.0

6.34k stars 1.56k forks source link

Which all components should react to reset signal? #1954

Open breznak opened 9 years ago

breznak commented 9 years ago

By reset signal i mean:

direct call from model to reset()
reading r in reset "column" when OPF parses data file
- currently only TP reacts to it?

TODO check that:

[ ] encoders with learning react to it?
[x] TP resets
[x] SP is state-less, right?
[ ] Classifier ??
[ ] Anomaly can react
- [ ] MovingAverage should be reset
- [ ] how (much) reset likelihood?

IDEAS:

[ ] should we differentiate signals for 2 resets?
- "soft" - TP end learning current sequence; assume new seq is similar (same source) as prev.
- "hard" - assuming new sequence can be completely different,

breznak commented 9 years ago

even if it was handeled in "reset - likelihood", this creates a problem. Hypothesis: Training on dataset with multiple sentences {2 kinds of sequences: X, Y} separated by sequence reset is iid = ordering of sequences does not affect quality of trained model.

We should validate this for

CLA alone
CLA using likelihood

where X,Y are different sequences of len=100 (ie.) I think training on X,Y,X,Y,.... and X_1, ..., X_1000, Y_1,...,Y_1000 would create different results (due to the "reevaluate distribution on the latest window" function in likelihood) ??

subutai commented 9 years ago

Hmm, interesting question. The likelihood average should probably be reset when there is a sequence reset, and then maybe ignored for a few time steps? In other words the first N steps of each new sequence would not generate meaningful likelihoods.

cogmission commented 9 years ago

Interestingly Subutai, this is a use case for the functional reactive library I was describing before.

If you have a caller that must react to every call by handling some output, (even when you have moving windows which don't output anything useful for some period) - you must handle this in your client code, making the client code really knowledgable of the context of every call to the target methods.

However, if we use a FRP Library such as RxJava (there is a Python and C++ version too), the chain of clients taking subsequent output don't receive anything until something useful is available, then they do their processing without any knowledge of how many items to "skip" or any such "God" perspective observation.

Not only this, but all "objects" along the method call chain are atomic and asynchronously protected.

Imagine this. You have a chain of functions like: x,y --> (x + y), (x / y), ([the 4th x later] + y), (x + [y if it ever becomes available]) where each pair of parenthesis is another function.

Now say you have a thread which is processing the user's mouse movements across the screen.

The usual way to deal with this is to have semaphores and thread blocking (to make sure you get a legitimate value out of x and y).

With FRP you can just get the value of x and y whenever and don't care how much the user's mouse is moving or anything - plus there's no thread contention, deadlocks or race conditions - whatever x and y happen to be at the time is what get applied with each function and each function doesn't care about the previous or next.

Imagine combining Network nodes with this semantic? We could have all sorts of asynchronicity (if that's a word) - and not care...

I think this could be interesting, especially for streaming clients... ;-)

David

On Fri, Mar 27, 2015 at 3:55 PM, Subutai Ahmad notifications@github.com wrote:

Hmm, interesting question. The likelihood average should probably be reset when there is a sequence reset, and then maybe ignored for a few time steps? In other words the first N steps of each new sequence would not generate meaningful likelihoods.

— Reply to this email directly or view it on GitHub https://github.com/numenta/nupic/issues/1954#issuecomment-87089444.

With kind regards,

David Ray Java Solutions Architect

cortical.io http://cortical.io/ Sponsor of: HTM.java https://github.com/numenta/htm.java

d.ray@cortical.io http://cortical.io

breznak commented 9 years ago

@subutai @rhyolight I'm going to undust the issue a bit. Is CLAModel.resetSequenceState() the function that is hooked to the r (reset) signal in OPF? https://github.com/numenta/nupic/blob/master/nupic/frameworks/opf/clamodel.py#L238

Should we implement reset() calls for Anomaly & Likelihood code? And the MovingAverage..does it make sense to reset, or it should carry on over the mark?

rhyolight commented 9 years ago

Is CLAModel.resetSequenceState() the function that is hooked to the r (reset) signal in OPF?

From what I understand, that should be the case.

Should we implement reset() calls for Anomaly & Likelihood code?

I'd rather see an optional parameter resetAnomaly=True where the default value is False. That would keep the sam behavior and allow an override to reset it all.

model.resetSequenceState(resetAnomaly=True)

scottpurdy commented 9 years ago

This is a little long so I will break it into sections:

Ok to take a reset but we should make anomaly implementations subclasses of the base Anomaly so this is more manageable.
We should move the anomaly class to the anomaly region. I'd prefer the get the subclassing right first though. The advantage of a region is that you can just link the reset from the RecordSensor to the AnomalyRegion.
(The following is a general thought, this change isn't very invasive so I'm not too worried about it in particular) I am a little worried that we are spending a lot of time adding functionality that won't ever be used by anyone other than @breznak. In this case, for instance, why would you ever prefer to reset the anomaly likelihood? If the historical anomaly scores are no longer valid, resetting doesn't actually help since the likelihood values will still be useless until you get enough new data. And in most cases, you will probably be better off using the pre-reset values than having no history at all. But bottom line is that you can always handle this in your application code however you want.
I don't think #2267 is the right solution. It introduces a burn in period, which is redundant if you are using the anomaly likelihood and you can do it in your application code. Instead, the Anomaly subclasses should just have a reset() method that they can handle however makes sense for that anomaly mode.

subutai commented 9 years ago

I generally agree with @scottpurdy . A reset is fine to include, but if it is a completely new sequence it should probably restart the whole anomaly likelihood model. This will then require a few hundred samples to re-estimate properly. I suppose you could consider keeping the likelihood model around but the next sequence should be one with almost the same statistics. In either situation the moving average should definitely be reset.

breznak commented 9 years ago

Hey @scottpurdy , thanks! some feedback

Ok to take a reset but we should make anomaly implementations subclasses of the base Anomaly so this is more manageable.

https://github.com/numenta/nupic/issues/2287

We should move the anomaly class to the anomaly region. I'd prefer the get the subclassing right first though. The advantage of a region is that you can just link the reset from the RecordSensor to the AnomalyRegion.

there's work in https://github.com/numenta/nupic/issues/2073 but I got stuck there

(The following is a general thought, this change isn't very invasive so I'm not too worried about it in particular) I am a little worried that we are spending a lot of time adding functionality that won't ever be used by anyone other than @breznak.

It's perfectly OK if some my ideas/needs are not useful generally and won't be accepted upstream (just hope it wont be too much/easy to maintain in parallel )

In this case, for instance, why would you ever prefer to reset the anomaly likelihood? If the historical anomaly scores are no longer valid, resetting doesn't actually help since the likelihood values will still be useless until you get enough new data. And in most cases, you will probably be better off using the pre-reset values than having no history at all. But bottom line is that you can always handle this in your application code however you want.

I think this is another case, maybe not that important, but applicable to everyone.

can I rename this issue to "What HTM components need to react to reset signal?"
I'm not sure where it's meaningful (to reset), that's why I'm asking.

historical anomaly scores are no longer valid, resetting doesn't actually help since the likelihood values will still be useless until you get enough new data.

What if the buffer is bigger than the sequence?

I don't think #2267 is the right solution. It introduces a burn in period, which is redundant if you are using the anomaly likelihood and you can do it in your application code. Instead, the Anomaly subclasses should just have a reset() method that they can handle however makes sense for that anomaly mode.

The code does use anomaly.reset() to handle what needs to be done, or how do you mean it? Do you think isReady() is useful to signal a component should provide reasonable numbers?

breznak commented 9 years ago

Thanks @subutai

generally agree with @scottpurdy . A reset is fine to include, but if it is a completely new sequence it should probably restart the whole anomaly likelihood model. This will then require a few hundred samples to re-estimate properly. I suppose you could consider keeping the likelihood model around but the next sequence should be one with almost the same statistics.

You provide both choices :wink: hard-reset likelihood instance, or keep looking at the old model. And I think this cannot be decided - is the new sequence completely new? Or similar to the seen ones?

a (swarming) parameter for this?
we could quickly evaluate - if the first few predictions are OK - CLA has seen the sequence -> keep history, else start fresh.

In either situation the moving average should definitely be reset.

adding to TODO, thx!

scottpurdy commented 9 years ago

Generally sounds fine, @breznak - thanks for the follow ups. I don't think we should add a new reset but each anomaly implementation could choose how to interpret it (or have a parameter if necessary).

cogmission commented 9 years ago

I don't want to intrude but I was reading over some comments and I just wanted to respectfully interject that in general it is a really bad practice for code to have "personality". What I mean is, configurations that mean one thing in one place and something else in another - or the same state or condition being "interpreted" differently across different implementations within the same body or module of code/code files...

The user in the above case then needs to "acquire" and memorize conditions for each code/module sharing some common (but differently interpreted) state. Requiring devs to have "insider knowledge" violates conceptual encapsulation and also is kind of a pain in the butt... :-) "Secret incantations" generally suck to have to deal with... Just an opinion guys...

I know you guys know this, but it bares restating sometimes; people should know what a given condition means with all Numenta code; we should try to "squash" all surprises as a design ethic, I would say?

scottpurdy commented 9 years ago

@cogmission - I completely agree with what you are saying. But I don't think it differs from my suggestion. A reset is always a user-defined action. Sometimes it means there was a gap in the data, sometimes it means the source of the data changed. We can't anticipate all of the possibilities so we have to instead provide functionality that gives flexibility to the user. As such, how you link a reset up in a network is necessarily tied to what the reset means for that particular application. There is no magic here, the user must explicitly define the network. And so if there are multiple ways to use a reset in the anomaly implementations then I think it is totally fine to have the user explicitly say which one they want, whether that is by choosing which anomaly implementation or by passing a parameter to one of the implementations.

But of course we would only want to add that option if there really are multiple options that will be commonly used and valuable. I'm not really sure if that is the case or not. If the user links the reset up then they probably want at least the moving average reset as @subutai suggests. I'm not sure if they would want the historical distribution reset or not.

@cogmission - Does that differ from what you are saying?

cogmission commented 9 years ago

You're right. I don't think we're saying completely different things necessarily.

I think regardless of the conditions surrounding the call to "reset" or the changes in data a reset may reflect, I think the actual "reset method" should always do the same thing. If it does something similar (but not exactly the same) in one code module than another than it (the method) should have a different name in my opinion? When I as a developer call reset on a body of code, I shouldn't have to consider which algorithm I'm referencing to understand what it does. The algorithm shouldn't have any bearing on whether I call reset or not (as opposed to some other method).

So to summarize: I'm thinking Reset should only do one thing wherever there is a method named "Reset" in our code. If we need slightly different behavior than we should maybe try and come up with a different name to represent that and only have it do that one thing across all code modules? How does sound?

As you know HTM technology is a more configuration and parameter heavy entity and its paradigm less understood and has a very steep learning curve. I'm sure you agree that we should try and make the software as transparent and simple as possible because the concepts surrounding its use are only going to grow in complexity as the software evolves.

Edit: I'm speaking more in general than in reference to any specific code really, Scott. I haven't reviewed any code or anything - I'm just making comments on the approach I think we should gravitate toward.