htm-community / htm.core

Actively developed Hierarchical Temporal Memory (HTM) community fork (continuation) of NuPIC. Implementation for C++ and Python
http://numenta.org
GNU Affero General Public License v3.0
148 stars 74 forks source link

New Spatial Pooler with Macro columns for faster Local inhibition #86

Open breznak opened 5 years ago

breznak commented 5 years ago

Performance improvements for Spatial Pooler Topology / Local Inhibition. I mentioned this on the numenta.org forum, and here I'm hoping to flesh out the idea more and communicate it with you all.

The spatial pooler with global inhibition works great as is; however local inhibition does not scale well because of the algorithms used. The differences between local and global inhibition happen at a large scale, but within a small (topological) area local and global inhibition do the same thing. Poor mans topology uses global inhibition to approximate local inhibition by making a spatial pooler with global inhibition for each area of local inhibition. In other words: Macro columns can use global inhibition and still have a large scale topology, by simulating a macro column for each topological area.

Pros:

Cons:

Implementation: A new C++ class which will create and maintain a list of SpatialPooler instances, one for each macro column. Macro columns are arranged in a uniform grid over the input space. Macro columns inputs are rectangular slices of the input space.

Example: The MNIST dataset would be a good example. Its fast, easy to solve, widely recognized, and its visual data which is pretty.

API: Similar to SpatialPooler class ...

...


Started a separate issue from https://github.com/htm-community/nupic.cpp/issues/3#issuecomment-437239753 Related #84 Author @ctrl-z-9000-times

breznak commented 5 years ago

Interesting concept, just first thoughts before I think this through:

ctrl-z-9000-times commented 5 years ago

Regarding speed, we have SP in CUDA: htm-community/htm.cuda#2 Would this suffice your need? Is your requirement solely about the speed, or the concept of macro columns?

Speed. The goal of this endeavour is to get an algorithmic performance improvement when using spatial poolers with topology. The current implementation of SP topology is slow because it gives every miniColumn its own unique neighbourhood for inhibition. Calculating the inhibition in each neighbourhood is time-consuming, but if you assume that nearby miniColumns have the same neighbourhood for inhibition then you have many few neighbourhoods to compute. Also you can use faster sorting-based methods to compute these neighbourhoods.

for Implementation, if possible, I'd like to see an SPMacroColumns that doesn't need any special macroXXX methods, just executes a pool of SPs and arranges their inputs,outputs to form the result.

That's the goal. The pool of SPs are vanilla flavored. Method 'compute' will have the same API as for the SP, and method 'initialize' will have only some minor changes.

about the idea: I'm still not sure about the concept, doesn't the reduced SP-field violate the "distributed" feature of SDRs? (as you've mentioned)

The defining characteristic of SP w/ topology is that the output SDR has a fixed sparsity in each topological area. Without topology there is a fixed sparsity across the whole cortical area (which could include many topological areas). So yes, this does add a constraint on the possible SDRs. But within a single topological area activity is still distributed among a large number of mini-columns.

breznak commented 5 years ago

performance improvement when using spatial poolers with topology.

About the topology, I had 2 ideas in my backlog:

About the concept of macro columns: I call each area a macro-column but really it is an approximation of a continuous thing - a sheet of cortical cells.

There was an unfinished PR in nupic python, separating topology (1D, 2D, nD) and Inhibition (Local, Global, InhibitionInterface) into separate classes. If this would help you, we should revive it for cpp. I think you could model your macro-column just by creating a special case of topology&inhibition. This would allow many experiments with SP inhibition much more easily.

But within a single topological area activity is still distributed among a large number of mini-columns.

This is where I suspect the theory (topological areas with sparsity on mini columns) and implementation (to speed things up) might collide. SP also needs a reasonable large number of columns to form SDRs. For speedup, I thought a macro column would comprise of 10s of cols. This could be a problem. I think current SP of 2048 cols would be your macro-column, so this approach would speed things up

TL;DR

ctrl-z-9000-times commented 5 years ago

I think you misunderstand my intentions. My goal with this work is to take an existing technology (SP w/ global inhib) and to scale it up to the point where it can do useful work, such as to model the entirety of cortical area V1. I'm not interested in experimenting with either the inhibition or the code which performs it, since I've already done some tests and I've found that this method works enough to merit further investigation. To be clear, I welcome your experiments w/ inhibition and when you finish them we should compare results!

breznak commented 5 years ago

to scale it up to the point where it can do useful work, such as to model the entirety of cortical area

ok, now I get it. So you want to create a huge SP (~V1) comprised of macro-columns(~"normal" SPs of size approx 2048?)

ctrl-z-9000-times commented 5 years ago

That's correct.

ctrl-z-9000-times commented 5 years ago

Update: i implemented what I described and tested it on the MNIST data set. It scores about 75% whereas my previous implementation of it score 95% with the same parameters. So im going to put this on the back burner for a while. I still hope to fix this but I dont have a plan to fix it in the near future.

breznak commented 5 years ago

It scores about 75% whereas my previous implementation of it score 95% with the same parameters.

Thanks for sharing the results! Do you want to share the branch with the macro-columns SP? Just for futre reference. And what was your "prev implementation"?

ctrl-z-9000-times commented 5 years ago

My previous implementation of this concept is at: https://github.com/ctrl-z-9000-times/sdr_algorithms The macro columns are incorporated directly into the SpatialPooler class. If you manage to get it to build (it uses Cython...), then you can run it as: $ ./func_tests/mnist_sp.py

ctrl-z-9000-times commented 5 years ago

My implementation now scores 80% accuracy. It is a work in progress. You can see it at: https://github.com/ctrl-z-9000-times/nupic.cpp/tree/mnist

Executable: mnist_sp Note: Always run this in release mode. Debug mode takes a lot longer.

ctrl-z-9000-times commented 5 years ago

Update: It now scores 93%-94% accuracy on the MNIST dataset. It's not ready to merge: the code is a mess, SP API changed, SP unit tests fail, is slow.

Before running this you must download and extract the mnist dataset into the directory /build/Release/bin/mnist_data/* . I think that this is something CMAKE can do.

It is located in this repo on branch "mnist".

breznak commented 5 years ago

I read through all the discussion again in order to analyze if/how the new local inhibition is limited? It is something similar we faced with @mirgee in CUDA implementation. In CUDA, we approximated local inhibition by setting a large number of sub-regions that do global inhibition. Is this exactly the same approach? Or can MacroCols move the center (,and diameter?) of the neighborhood? In other worlds, is the segmentation to sub-areas given from start, or is it learned?

ctrl-z-9000-times commented 5 years ago

approximated local inhibition by setting a large number of sub-regions that do global inhibition. Is this exactly the same approach?

I think it is.

is the segmentation to sub-areas given from start, or is it learned?

The areas (macroColumns) are defined from the start.

breznak commented 5 years ago

The areas (macroColumns) are defined from the start.

not in your PR, but it should be eventually possible to make these centroids learned, right? (if we can move the centers, and diameters) we are (better) than the current local inhibition. This approach is a bit similar to convolution in CNNs.

How do we evaluate the quality of the inhibition -> SDRs? MNIST, hotgym and the Metrics?

ctrl-z-9000-times commented 5 years ago

The center & diameter of macro-columns are only used to determine potential pool, nothing else. In order to change the center or diameter of a macro-columns territory, you'd need to change the potential pool, and then learn new weights for new connections. I'm not sure what benefit it would have. Remember that the cortical circuit is already supposed to form a hierarchy which covers every input area at every scale, that part just isn't implemented yet.

You could make more like a CNN by sweeping the input sensor across the input space, similar to how an eye moves across a page of text.

ctrl-z-9000-times commented 5 years ago

How do we evaluate the quality of the inhibition -> SDRs? MNIST, hotgym and the Metrics?

Yes. Measure SDRs in all of the same ways as without this special inhibition code. Measure statistics about the SDR and check its performance on real world classification & anomaly tasks.

breznak commented 5 years ago

The center & diameter of macro-columns are only used to determine potential pool, nothing else. In order to change the center or diameter of a macro-columns territory, you'd need to change the potential pool, and then learn new weights for new connections.

I thoght a macro-col in context of this PR is a set of mini columns that share (fixed) same neighborhood (are all neighbours to each other). Shifting would mean adding/removing some new cols from the neighborhood. For example, if starting macro-columns (with losers (-) and winners (w) and better winners (W) (num win counts W >> w >> -) ) looked like this:

<-------------W-><-----------------w-->

we might eventually shift and resize the hood areas as:

<----------W-----------><----w---->

This would only affect what columns compete with which for the winner-takes-all in sub-area global inhibition. Not the weights of the column (that would have to be re-learned)

ctrl-z-9000-times commented 5 years ago

That diagram should look like:

<     -     ><     -     >
      W            -
      -            w
      -            -
      -            -
      -            -
      -            -

Within each macro-column the mini-columns are all located at the center and have the same territory within the input space. They're not evenly distributed across the input space, but rather they're clumped together. Because they're all stacked on top of a single point, they have the same potential inputs and so they compete on an even playing field. If they were spread out then the inhibition would be much more complicated. The inhib in this PR is simple, which is an advantage.

breznak commented 5 years ago

See https://github.com/htm-community/nupic.cpp/pull/242#issuecomment-467812798 for recent problems with local inh performance on MNIST