How to use CLA Classifiers

RDaneelOlivav commented 9 years ago

Hi,

I'm having problems using the Temporal Pooler because I thought that with it I could know which sequence the data is in. Example: if I have a structure made up of two levels, each one with an SP and a TP, with the output of level 1 being the input of level 2, and I feed it the sequences ABCD (sequence 1) and BCAD (sequence 2), how does it work? I mean, the first level will convert each letter into an SDR, feed it into the SP and then into the TP. The SP will give us a compact representation and the TP the prediction of those compact representations. But how can I detect that, if I get ABC, I'm seeing the first sequence and not the second, so that I can feed the second level the following:

The input is A (t=0), B (t=1), C (t=2, current time) and I get Sequence1, which I feed to level 2 in the hierarchy. Then level 2 will learn sequences of those sequences (s1,s2,s2,s1 --> Sequence1LVL2, s1,s1,s1 --> SequenceLVL2) and so on.

I've found https://github.com/numenta/nupic/wiki/Temporal-classification which talks about converting SDRs into a single sequence representation, but looking at the example examples/tp/hello_tp.py, I can't yet see where it outputs the sequence. Any ideas? Do "segments" have something to do with it?

passiweinberger commented 9 years ago

Sounds like a problem for the category prediction function of nupic?

For reference see the example here: https://github.com/numenta/nupic/tree/master/examples/prediction/category_prediction

RDaneelOlivav commented 9 years ago

Hmmm... I've run it, but I fail to see where it has assigned a sequence of words to the same sequence. Is each row in results.csv a different sequence? And what does the number beside it mean? Any clarification?

What I don't understand is: if the TP is able to give us the next pattern, why isn't there a variable that indicates which sequence the current pattern most probably belongs to? Shouldn't it be accessible?

I have found this email, which may give some insight, but I can't tell whether the issue was resolved in the end: http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2014-January/007193.html

arhik commented 9 years ago

An informational video on the CLA Classifier (https://www.youtube.com/watch?v=QZBtaP_gcn0) by Subutai Ahmad will help you, in case you didn't know about it, and this resource (https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CCYQFjAAahUKEwiVxJORyPnIAhWKPCYKHdypCPc&url=http://www.dfki.de/web/forschung/km/publikationen/renameFileForDownload?filename=DiplomaThesis.pdf&file_id=uploads_1055&usg=AFQjCNEkGldczynzaFPU1ASURgQod96ZVw&sig2=ghti5aeCmv-Sfb5hpVI6Hw) will help you in case you are OK with the math behind it.

As far as I know, the NuPIC CLA Classifier predicts which category is likely to occur k steps ahead of the present. Some extra work has to be done on top of it for sequence prediction. This extra work could get a bit complex if the sequence length is not known or predetermined. Maybe that's where you need hierarchy: the probability distribution over lower-order sequences is used higher up in the hierarchy. (I can think of a way to classify a sequence in one level, but only by hard-coded means. I could be wrong, and it could be overkill.)
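To make the "category k steps ahead" idea concrete, here is a minimal toy sketch in plain Python. It is not NuPIC's CLAClassifier and does not use its API; the class name ToyStepClassifier, its compute method and the bit patterns are all made up for illustration. The active bits stand in for whatever the TP outputs at each step. It simply counts which category showed up k steps after each active bit and turns those counts into a distribution.

```python
from collections import defaultdict


class ToyStepClassifier(object):
    """Hypothetical toy model, NOT NuPIC's CLAClassifier."""

    def __init__(self, steps=1):
        self.steps = steps
        self.history = []  # patterns seen so far, oldest first
        # counts[bit][category] = how often `category` followed `bit`
        # exactly `steps` time steps later
        self.counts = defaultdict(lambda: defaultdict(float))

    def compute(self, active_bits, category, learn=True):
        self.history.append(frozenset(active_bits))
        if learn and len(self.history) > self.steps:
            # Associate the pattern from `steps` ago with today's category.
            past = self.history[-1 - self.steps]
            for bit in past:
                self.counts[bit][category] += 1.0
        # Inference: each currently active bit votes for the categories
        # it has historically preceded.
        votes = defaultdict(float)
        for bit in active_bits:
            seen = self.counts.get(bit, {})
            total = sum(seen.values())
            for cat, n in seen.items():
                votes[cat] += n / total
        norm = sum(votes.values())
        if not norm:
            return {}
        return dict((cat, v / norm) for cat, v in votes.items())


# Made-up, non-overlapping bit patterns for the letters in the question.
patterns = {"A": [0, 1], "B": [2, 3], "C": [4, 5], "D": [6, 7]}
clf = ToyStepClassifier(steps=1)
for _ in range(5):
    for letter in "ABCD":
        clf.compute(patterns[letter], letter)

dist = clf.compute(patterns["A"], "A", learn=False)
predicted = max(dist, key=dist.get)
```

Note that the result is a distribution over the next category, not the name of a sequence, which is why extra work is needed on top for sequence classification.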

Outline: classification can be done on one level if the sequence length is predetermined. Otherwise you need hierarchy for variable-order sequences. Maybe I am repeating what you already know; sorry if that's the case. But it's a fairly involved concept and takes time to explain.
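For the fixed-length case, a hard-coded version could look roughly like the sketch below. This is plain Python with no NuPIC involved; the names KNOWN, candidates and label_stream are hypothetical, and the two sequences are the ones from the question. It shows which known sequences are still consistent with what has been seen so far, and how a stream could be cut into fixed-length windows whose labels become the input for a second level.

```python
# Hypothetical table of known fixed-length sequences (from the question).
KNOWN = {"s1": "ABCD", "s2": "BCAD"}


def candidates(prefix, known=KNOWN):
    """Return the names of all known sequences that start with `prefix`."""
    return sorted(name for name, seq in known.items()
                  if seq.startswith(prefix))


def label_stream(stream, length=4, known=KNOWN):
    """Cut `stream` into windows of `length` and label each window.

    Windows that match no known sequence are labeled None.
    """
    lookup = dict((seq, name) for name, seq in known.items())
    return [lookup.get(stream[i:i + length])
            for i in range(0, len(stream) - length + 1, length)]


after_abc = candidates("ABC")              # only sequence 1 starts with ABC
level2_input = label_stream("ABCDBCADBCAD")  # labels to feed a second level
```

This only works because the sequence length is fixed and the sequences are listed by hand; it says nothing about how a hierarchy would learn them.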

Note: I am not reliable. You will have to wait for someone to correct me if I am wrong.

RDaneelOlivav commented 9 years ago

Thanks @arhik and @passiweinberger for the fast responses.

Yeah, I know that the Numenta guys are doing their best, and I have no doubt that they'll give a great response, because they have in the past and it helped a lot, which I really appreciate and hope to repay somehow ;).

I already saw the video you mention, but it only covered the concepts behind it, not the implementation in NuPIC, which I'm more interested in.

As for the CLA classifier, I haven't used it yet. But I'm a bit confused about the difference between "category" and "sequence". What do you mean by category?

I don't intend to handle sequences without a fixed number of elements, but anyway that could be worked around by iteratively checking which sequence each introduced pattern belongs to, depending on the number of elements specified. How would you get the sequence that a pattern belongs to with the tools we have now? As far as I know it could be done by Agglomerative Hierarchical Clustering, which I thought would be implemented in NuPIC's basic hierarchical structure.

I think this discussion is really interesting, because this is the only piece of the puzzle left for me, and surely for a lot of NuPIC users.

Any ideas, people?

arhik commented 8 years ago

On Thu, Nov 5, 2015 at 12:05 PM, RDaneelOlivav notifications@github.com wrote:

As for the CLA classifier, I haven't used it yet. But I'm a bit confused about the difference between "category" and "sequence". What do you mean by category?

Thank you for understanding.

A category is the kind of encoding we assumed you would use: no overlapping bits between values means a category. Each sequence element in your case would need a category encoder. A similar category encoding is inevitable for higher-order sequences (you could just concatenate SDRs, but that could lead to an explosion). I am a bit reluctant to point to the same video when you have seen it already, but it holds many secrets (the internal workings) of NuPIC (of the CLA classifier at least). Maybe you skipped a lot while searching for a specific part. It is an implementation video too (that's how the CLA classifier works). Please don't take anything to heart; I am just interested in helping you.
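As a rough illustration of "no overlapping bits", here is a toy category encoding in plain Python. It is only a sketch and not NuPIC's own category encoder; the function names encode_category and overlap and the width w=3 are assumptions chosen for the example. Each category gets its own block of w bits, so two different categories never share an active bit.

```python
def encode_category(index, n_categories, w=3):
    """Toy encoder: category `index` owns bits [index*w, (index+1)*w)."""
    if not 0 <= index < n_categories:
        raise ValueError("category index out of range")
    bits = [0] * (n_categories * w)
    for i in range(index * w, (index + 1) * w):
        bits[i] = 1
    return bits


def overlap(a, b):
    """Number of positions where both encodings have an active bit."""
    return sum(x & y for x, y in zip(a, b))


letters = ["A", "B", "C", "D"]
encoded = dict((letter, encode_category(i, len(letters)))
               for i, letter in enumerate(letters))

same = overlap(encoded["A"], encoded["A"])       # every bit shared with itself
different = overlap(encoded["A"], encoded["B"])  # nothing shared across categories
```

With this kind of encoding the letters A, B, C and D are treated as unrelated symbols, which is what you want for sequence elements that have no notion of similarity.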

By the way, I had similar questions; at one point I thought NuPIC could do everything. The framework can do anything we can do, but at a basic level. Higher-level objectives have to be constructed on top of the framework. I dug through its mathematical details too. There is a paper by Hawkins and George (I don't remember the title; my rough guess is something like "Mathematics behind cortical micro circuits"). I am pointing this out because some people tend to think in the traditional way, or in terms of something they already know. I have no idea what AHC is, but it's better to think afresh to grasp HTM quickly. Another way to approach it is to think of it as a hierarchical Bayesian network with built-in compression mechanisms, such as reusing the basic structure higher up. Trust me, you don't need the mathematical details either unless you are working on a big project. This will save you time.

I will work on it and let you know if I find a lead. But I can't guarantee that I will get to work on it soon or that I will be successful.

arhik commented 8 years ago

Just took a look at agglomerative hierarchical clustering. I think it's doable. Let me think and comment on it. Now I understand what you are expecting. Sorry if I was completely irrelevant. The 'How to use CLA' subject is irrelevant now, if you are okay with our responses regarding CLA. You can raise a new question with the subject title 'sequence classification' so that the core team can help you from here.

RDaneelOlivav commented 8 years ago

Thanks @arhik. Now you understand, right? Yeah, I'll post a new issue with the title "sequence classification" to dig into the matter. I'm working on it right now and I'll post anything I find, and I encourage anyone else to do so as well, because it's a very interesting matter. Once I finish my project I hope to upload all of it here to thank this great community ;).

rhyolight commented 8 years ago

@scottpurdy / @chetan51 / @subutai ?

chetan51 commented 8 years ago

Conversation moved to https://github.com/numenta/nupic/issues/2742#issuecomment-157202924.

numenta / nupic-legacy

How to use CLA Classifiers #2727