htm-community / htm.core

Actively developed Hierarchical Temporal Memory (HTM) community fork (continuation) of NuPIC. Implementation for C++ and Python
http://numenta.org
GNU Affero General Public License v3.0

Real-life benchmark: Hotgym example using C++ algorithms #30

Open breznak opened 6 years ago

breznak commented 6 years ago

Implement a pipeline running full real-world HTM task.

The pipeline is currently implemented with the raw HTM classes (TM, SP, ...). It does not use NetworkAPI (which needs TMRegion/SPRegion), and it is not written as Python code using the C++ bindings (though that would be possible).

Pipeline:


We are looking for a real-life benchmark to use as a baseline for our performance optimizations (#3). In Python there is a "Hotgym anomaly example" that stresses the encoder, SP, TM, and Anomaly. Implement a similar example in C++ and add it to the integration tests with timing.

> I have ported SPRegion and TMRegion and they run under Windows but I was waiting until after PyBind was implemented to merge them in with a later PR.

Waiting for #54 SPRegion & TMRegion in C++
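The benchmark described above (an encoder, SP, TM, and Anomaly pipeline with timing) could be sketched roughly as follows. This is a minimal illustration, not code from this repository: `run_benchmark` is a hypothetical helper, and the four stages are trivial stand-in functions, not the real HTM classes. The only point is the shape of a per-stage timing harness.

```python
import time
from collections import defaultdict


def run_benchmark(records, stages):
    """Push every record through the stages in order.

    Returns the final output per record and the accumulated
    wall-clock time (in seconds) spent in each stage.
    """
    elapsed = defaultdict(float)
    outputs = []
    for record in records:
        value = record
        for name, fn in stages:
            start = time.perf_counter()
            value = fn(value)
            elapsed[name] += time.perf_counter() - start
        outputs.append(value)
    return outputs, dict(elapsed)


# Hypothetical stand-in stages (assumptions for illustration only).
# A real benchmark would call the actual encoder, SP, TM and anomaly code here.
stages = [
    ("encoder", lambda x: {int(x) % 64}),        # value -> set of "active bits"
    ("sp", lambda bits: {b * 2 for b in bits}),  # bits -> "active columns"
    ("tm", lambda cols: cols),                   # pass-through placeholder
    ("anomaly", lambda cols: 0.0 if cols else 1.0),
]

outputs, timings = run_benchmark([1.5, 20.0, 33.3], stages)
```

Timing each stage separately, instead of only the whole run, makes it easier to see which component a given optimization actually affects.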

breznak commented 5 years ago

@ctrl-z-9000-times @dkeeney Please hold off on merging PRs until we implement this (should be soon, by Monday).

Are any of you good with NetworkAPI? I need a "hello world" example where we create a Network with all the basic parts (encoder, SP, TM, Anomaly), run it on the hotgym dataset, and measure the time. It should be relatively simple. Is there an existing example that constructs a network?
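For reference, the Anomaly stage mentioned above is commonly defined in HTM as the fraction of currently active columns that were not predicted at the previous step. The sketch below assumes that definition; it is plain Python for illustration, not this repository's implementation, and `raw_anomaly_score` is a hypothetical name.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were not predicted.

    Assumed definition: 1 - |active & predicted| / |active|,
    with a score of 0.0 when no columns are active.
    """
    active = set(active_columns)
    if not active:
        return 0.0
    overlap = len(active & set(predicted_columns))
    return 1.0 - overlap / len(active)


# Two of the four active columns (3 and 4) were predicted.
score = raw_anomaly_score([1, 2, 3, 4], [3, 4, 5])
```

A score of 0.0 means the input was fully predicted, and 1.0 means none of the active columns were predicted.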

Then we can proceed with merging the SDR, Random, and Eigen PRs. Thanks! :+1:

dkeeney commented 5 years ago

CppRegionTests.cpp has some of the parts, but SP and TM cannot be run in C++ alone until we complete the SPRegion and TMRegion classes in C++. Currently there is Python code that fills the gap. I have ported SPRegion and TMRegion and they run under Windows, but I was waiting until after PyBind was implemented to merge them in a later PR.

I think a hotgym example/test would be a good addition to unit_tests.

ctrl-z-9000-times commented 5 years ago

Sorry, I can't help with the network API. I don't use the network API code. It doesn't compile for me, so I comment it out of CMakeLists.txt (boost is playing hide and seek with cmake).

To benchmark the SpatialPooler and SDR-Classifier, MNIST is a good hello-world type dataset. It should take 10-20 minutes to run through the whole dataset, assuming you compiled in release mode. Debug mode takes approximately 10 times longer. The repo "Numenta/nupic.vision" has a working MNIST example, but it will need to be updated to work with this fork. I wrote my own solution for the MNIST dataset, but I think there are still bugs in it.

dkeeney commented 5 years ago

Our whole objective is to get the network API to build (with boost until we can use C++17) so that it is usable. It is the framework in which all of the algorithms can be coordinated.

The Hotgym example is Python code. To port it to a C++ example/test that we can use as a performance test, we will need to port not only the Hotgym example itself but also complete the C++ code set by porting the Python SPRegion and TMRegion modules to C++, so that the SP and TM can be executed as C++ plugins rather than as Python plugins.

breznak commented 5 years ago

> I have ported SPRegion and TMRegion and they run under Windows but I was waiting until after PyBind was implemented to merge them in with a later PR.

Thanks David, so I have the options:

> To benchmark the SpatialPooler and SDR-Classifier, MNIST is a good hello-world type dataset. It should take 10-20 minutes to run through the whole dataset

That would be a good example, thanks! We can even add TM (it would be useless for this task, but we want to stress it). I will use MNIST in addition to hotgym.

(please resolve the boost build issue, but in a new thread)

> need to port not only the hot Gym example but also complete the C++ code set by porting the Python code SPRegion and TMRegion modules to C++ so that the SP and TM can be executed as C++ plugins

Still, I can already write an example in Python that uses the C++ implementation, so it would exercise our C++ code, right?

ctrl-z-9000-times commented 5 years ago

> build (with boost until we can use C++17)

Thanks David! I updated to gcc-8 and C++17 and now it builds much more cleanly.

breznak commented 5 years ago

@dkeeney @ctrl-z-9000-times Should we keep both TP and TM in this example/benchmark, or go with TM only, as most code does?

dkeeney commented 5 years ago

Let's keep both, because this tests the C++ to Python and Python to C++ interfaces for both algorithms.