numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0

fieldType List Not supported in swarming? #2545

Closed RDaneelOlivav closed 9 years ago

RDaneelOlivav commented 9 years ago

Are lists still not supported? Does anyone know a workaround?

"includedFields": [
    {
      "fieldName": "fft_R",
      "fieldType": "list"
    }
]

Is this wrong in a .json file?

rhyolight commented 9 years ago

@scottpurdy I'm not familiar with the list input type. What is your input on this?

scottpurdy commented 9 years ago

The list type was a recent addition. It is implemented specifically for category fields. These are used by the classifier and not included in the encoded representation. I could see this being extended for inclusion in encodings. Someone could revert this PR and clean up the code to work with the list type:

https://github.com/numenta/nupic/pull/1481

rhyolight commented 9 years ago

@RDaneelOlivav Before we create a ticket for new work to add back a VectorEncoder, I want to make sure it will solve your problem. Can you describe the data you are trying to use with NuPIC?

RDaneelOlivav commented 9 years ago

Sure ;). The data I am trying to use is the FFT (Fast Fourier Transform) of people speaking with various intonations in different languages, with the goal of detecting the basic intonations (like sad, happy, angry, and neutral). From what I've understood, nupic.critic makes one model per frequency band, doesn't it? It doesn't combine the whole frequency "fingerprint". So I thought that maybe I could use the NuPIC encoders like so: [15, 5, 3] --> 1553 --> ScalarEncoder(1553) --> 010...01, and feed that in as the input data. Or maybe [15, 5, 3] --> SCE(15) join SCE(5) join SCE(3) --> 010111 100111 011110 ... MultiEncoders, maybe?

Any alternatives, or do you think this is the way to proceed?

Sorry for the late response guys.

rhyolight commented 9 years ago

[15, 5, 3] --> 1553 --> ScalarEncoder(1553) --> 010...01

I don't think this is the right approach, because the context of the individual frequency band counts is lost when [15, 5, 3] is squeezed into 1553. For instance, if 3 changed to 30, the number goes from 1553 to 15530, which represents a huge distance from 1553, when actually two of the values didn't change. The nature of SDRs means that your data will be misrepresented here.
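The distance problem described above can be sketched with a toy block encoder (illustrative only, not NuPIC's `ScalarEncoder`; the `w`/`n` parameters are just plausible placeholders). Values that are numerically close share many active bits; concatenating the digits makes a one-component change look like a huge jump:

```python
# Toy block encoder (illustrative only, NOT NuPIC's ScalarEncoder):
# a value maps to a contiguous block of w active bits out of n.
def encode(value, w=21, n=400, minval=0, maxval=20000):
    value = max(minval, min(maxval, value))
    start = int(float(value - minval) / (maxval - minval) * (n - w))
    return {start + i for i in range(w)}

def overlap(a, b):
    """Number of active bits two encodings share."""
    return len(a & b)

# Digits concatenated: [15, 5, 3] -> 1553; changing the 3 to 30 -> 15530.
print(overlap(encode(1553), encode(15530)))  # 0 shared bits: looks totally different
print(overlap(encode(1553), encode(1556)))   # 21 shared bits: looks nearly identical
```

So even though only one band changed slightly, the concatenated representation shares nothing with the original, which is exactly the misrepresentation described above.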

Because I've never used a VectorEncoder, I'm not sure exactly why it was ever used. @chetan51 or @scottpurdy can you enlighten me and add your opinion?

rhyolight commented 9 years ago

I think the typical way to go about this is to create a model with one field for each frequency band. I believe I tried this at first in nupic.critic, but I didn't know which band should be the "predicted field". So I just created one model for each band, using only that band's data and getting anomalies from each model only for the frequency band it was observing.
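The one-model-per-band pattern can be sketched as follows. `AnomalyModel` here is a toy stand-in, not the NuPIC OPF API; with real NuPIC you would create one OPF model per band and read its anomaly score instead:

```python
# Hedged sketch of the one-model-per-band pattern (toy stand-in, NOT NuPIC).
class AnomalyModel:
    """Toy model: scores a large jump from the previous value as anomalous."""
    def __init__(self):
        self.prev = None

    def run(self, value):
        score = 0.0 if self.prev is None else min(1.0, abs(value - self.prev) / 10.0)
        self.prev = value
        return score

bands = ["band_0", "band_1", "band_2"]
models = {b: AnomalyModel() for b in bands}      # one independent model per band

frames = [[15, 5, 3], [15, 5, 4], [15, 5, 30]]   # per-band magnitudes over time
for frame in frames:
    # Each model sees ONLY its own band's value, as in nupic.critic.
    scores = {b: models[b].run(v) for b, v in zip(bands, frame)}
```

Each model's anomaly score reflects only its own band, which is the limitation discussed below: no model ever sees interactions between bands.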

RDaneelOlivav commented 9 years ago

I see your point there, @rhyolight: the value changes too much. This morning I've been experimenting with MultiEncoders; wouldn't that give the same effect? I pass one entry in "fieldEncodings" for each frequency band, and it outputs a single SDR with all of them combined.

{Pseudocode, cleaned up}

    # One ScalarEncoder entry per frequency band
    fieldEncodings = {}
    for freq_name in frequency_names:
        fieldEncodings[freq_name] = dict(fieldname=freq_name, type='ScalarEncoder',
                                         w=3, resolution=1, minval=1, maxval=24,
                                         name=freq_name, forced=True)

    e = MultiEncoder(fieldEncodings)
    output = e.encode(data_input)  # data_input maps each freq_name to its value
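What such a combined encoding buys you can be shown in plain Python (a toy sketch, not NuPIC's encoders): each band keeps its own sub-range of bits, so a small change in one band leaves the rest of the combined SDR intact:

```python
# Plain-Python sketch (not NuPIC): concatenate one small scalar encoding
# per frequency band into a single combined SDR, as a MultiEncoder would.
def encode_band(value, w=3, n=30, minval=1, maxval=24):
    value = max(minval, min(maxval, value))
    start = int(float(value - minval) / (maxval - minval) * (n - w))
    return [1 if start <= i < start + w else 0 for i in range(n)]

def encode_spectrum(bands):
    sdr = []
    for v in bands:
        sdr.extend(encode_band(v))   # each band owns its own sub-range of bits
    return sdr

sdr_a = encode_spectrum([15, 5, 3])
sdr_b = encode_spectrum([15, 5, 4])  # only the last band changed, and only by 1
shared = sum(a & b for a, b in zip(sdr_a, sdr_b))
print(shared)  # most active bits are shared, so the two spectra look similar
```

Unlike the digit-concatenation idea, semantic similarity is preserved per band here, which is what the MultiEncoder approach is after.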

What I don't really understand about doing multiple models is that you only see anomalies in the prediction of each frequency band alone; you won't learn higher-level interactions between the different frequencies. You see, I'm more interested in learning the sound spectrum as a whole, so that we can predict and differentiate different frequency profiles, let's say. Or is there a way of doing so by training separate models and afterwards feeding them into another, global model?

Many questions, but I'm sure many of us share these doubts too ;)

rhyolight commented 9 years ago

I don't know much about the MultiEncoder, so I just asked some questions about it.

From your description of what you want above, you probably need hierarchy, which we are not quite working on yet. The research focus is sensorimotor, feedback, and 3rd generation temporal pooling, which will help enable hierarchy in the future. See here for an explanation, but Jeff Hawkins will also be talking about temporal pooling at the HTM Challenge Onsite. So more info will be online after that event. This is an active research topic.

rhyolight commented 9 years ago

@RDaneelOlivav All that being said, I don't think that this issue definition will actually get you what you want, so I will close it out. Please keep an eye out for future work that will enable a hierarchy. We are all looking forward to it.