htm-community / htm.core

Actively developed Hierarchical Temporal Memory (HTM) community fork (continuation) of NuPIC. Implementation for C++ and Python
http://numenta.org
GNU Affero General Public License v3.0

Model constantly grows in size, implement removing synapses, segments; death #461

Open breznak opened 5 years ago

breznak commented 5 years ago

I ran into an interesting read from Subutai:

In addition, you will find that the model size grows slowly over time. This is because the HTM always adds new synapses. To counteract this, I’ve speculated for a while that you might be able to randomly downsample the number of synapses once you have trained on a few thousand records. Here’s what I would try first: on each iteration keep track of the number of synapses you add (max is newSynapseCount * number of active columns on any given iteration). After the compute step, randomly choose that many synapses throughout the network, and delete them. If a segment becomes empty, delete the whole segment. You might need to add some methods in the class to support this operation.

With all of the above, test with the full NAB benchmark to ensure the changes are not too disruptive to performance. This is slow, but the best quantitative way we know of to ensure decent anomaly detection accuracy. You’ll want to make sure you understand our anomaly detection paper, NAB codebase, and the datasets used in NAB. I think NAB is a valuable tool in debugging. I have not tried the above, so these are my best guesses. I would be very curious to see what you find out!!

Of course, doing the above will speed up the serialization process as well. There are also many code optimization strategies that have not been implemented and can work (e.g. using smaller bitsize floating point instead of 64 bit, or going down to 8 bit integer math), but that would be more work. (A quick version of this might be to just change the Proto bit sizes. This will lead to a situation where you won’t get identical results after deserialization, but it might be “good enough”. I don’t know enough about Proto to know whether this will really work.)

~Subutai https://discourse.numenta.org/t/htm-anomaly-detection-model-too-large-for-streaming-analysis/4245/7?u=breznak

I think the issue has been discussed here.

To experiment with this:
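A minimal sketch of that random down-sampling, written against a hypothetical Connections-like wrapper (the real htm.core Connections API uses different method names), just to illustrate the bookkeeping Subutai describes:

```python
import random

def downsample_synapses(connections, num_to_remove, rng=random):
    """Randomly delete `num_to_remove` synapses network-wide, and delete
    any segment that ends up empty (the operation Subutai describes).

    `connections` is a *hypothetical* wrapper exposing:
        all_synapses()            -> list of synapse handles
        segment_for_synapse(syn)  -> owning segment
        destroy_synapse(syn), num_synapses(seg), destroy_segment(seg)
    htm.core's real Connections class uses different names.
    """
    synapses = connections.all_synapses()
    for syn in rng.sample(synapses, min(num_to_remove, len(synapses))):
        seg = connections.segment_for_synapse(syn)
        connections.destroy_synapse(syn)
        if connections.num_synapses(seg) == 0:
            connections.destroy_segment(seg)

# Per step the TM adds at most maxNewSynapseCount * len(activeColumns)
# synapses, so after each tm.compute(...) one could remove roughly that
# many at random, e.g.:
#   downsample_synapses(conns, maxNewSynapseCount * len(activeColumns))
```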

breznak commented 5 years ago

CC @ctrl-z-9000-times

ctrl-z-9000-times commented 5 years ago

For me this is a low priority because RAM is historically inexpensive. I think that the CPU is more often the bottleneck.

breznak commented 5 years ago

A quick update:

For me this is a low priority because RAM is historically inexpensive. I think that the CPU is more often the bottleneck.

in that thread, HDD/model size was a practical problem. Also, we've never had a model running continuously for, say, a year.

I'm also not interested in the practical limitations, but

ctrl-z-9000-times commented 5 years ago

Spatial pooler only calls createSegment and never releases it. See if that's correct.

Yes, this is correct. At initialization the Spatial Pooler creates all its synapses and segments. At run time it neither creates nor destroys any segments/synapses, so its memory usage is fixed.
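As a rough sanity check, one can watch process memory while driving both algorithms: the SP footprint should stay flat after construction, while the TM keeps growing until its per-cell limits are reached. A minimal sketch, assuming htm.core's Python bindings (module paths and signatures recalled from memory, so verify against your installed version; `resource` is Unix-only):

```python
import resource
from htm.bindings.sdr import SDR
from htm.bindings.algorithms import SpatialPooler, TemporalMemory

inp  = SDR([1000])
cols = SDR([2048])
sp = SpatialPooler(inputDimensions=inp.dimensions,
                   columnDimensions=cols.dimensions)
tm = TemporalMemory(columnDimensions=cols.dimensions, cellsPerColumn=16)

def rss_kb():
    # Peak resident set size of this process, in kB (Linux).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

for step in range(100_000):
    inp.randomize(0.02)           # random input: worst case for TM growth
    sp.compute(inp, True, cols)   # SP: fixed memory after initialization
    tm.compute(cols, learn=True)  # TM: grows until per-cell limits are hit
    if step % 10_000 == 0:
        print(step, rss_kb(), "kB RSS")
```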

breznak commented 4 years ago

Part of this could be addressed by #466 (SP using synapse pruning & synapse competition), PR #584.

Qubitza commented 3 years ago

Hi all,

I'm running a model over 13 trillion data samples to simulate a streaming analysis. After about 3% of the data, the model had used up all my memory and crashed at tm.compute(...) with a MemoryError: bad allocation.

I've looked in the HTM Forum and I came across this issue, but it seems like it's still an open task.

Have you implemented the synapse drop out in some way? Do you have any experience with HTM streaming analysis?

Thanks in advance.

dkeeney commented 3 years ago

@N3rv0us thanks for reporting this.
I think this is one that @breznak or @ctrl-z-9000-times would have to address.

breznak commented 3 years ago

Hi @N3rv0us ,

running a model over 13 trillion data samples

Very interesting data size! Yes, this is a known issue. We've been running "online learning" HTMs, but never at anything approaching this size.

Have you implemented the synapse drop out in some way?

there's a parameter for synaptic decay or similar, but it never frees the actual memory. There are recently merged/in-progress PRs on synapse pruning that should let you prune the memory. I'd say the default is OFF.

used up my whole memory and crashes

even with pruning off, the model should use up all of its available synapses/memory and then start reusing them; the crash is unexpected.

and crashes at tm.compute(...)

ideally, a stack trace from the crash in a Debug build would help, but I guess that's practically hard to get.

Can you try emulating the problem by setting low limits on the number of synapses, segments, etc.? This will cause your model to use up its resources much sooner. If it crashes, the problem is in our code and is easier to replicate early.

If it does not crash, your model may simply be too large for your hardware (RAM) to satisfy, and you'd need to lower the settings for your model.
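For instance, a hedged sketch of a TM with deliberately tiny limits (parameter names as in htm.core's TemporalMemory; the values are purely illustrative), so the resource ceiling is hit within a few hundred steps:

```python
from htm.bindings.algorithms import TemporalMemory

# Deliberately tiny per-cell limits so the model exhausts its
# segment/synapse budget quickly. If the MemoryError still occurs,
# the bug is in the reuse path rather than in sheer model size.
tm = TemporalMemory(
    columnDimensions=[2048],
    cellsPerColumn=8,
    maxSegmentsPerCell=5,        # defaults are much larger (~255)
    maxSynapsesPerSegment=16,    # defaults are much larger (~255)
    maxNewSynapseCount=8,
)
```

Run your normal stream through such a model; it should reach its fixed budget almost immediately, which makes any crash in the reuse logic reproducible in minutes rather than weeks.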

I'll be happy to look at this later next week.

breznak commented 3 years ago

Segment and synapse pruning is implemented in #601 for Connections, and is therefore available to any of our algorithms (SP, TM). It is not enabled by default because synapse pruning in the SP causes our strict determinism tests to fail.

If you can live with that, please try enabling both synapse and segment pruning. As a follow-up, we should fix the determinism issue with synapse pruning and enable it by default.
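Conceptually, the pruning boils down to removing synapses whose permanence has decayed to (or below) a threshold and then removing any segment left with no synapses. A hedged pseudo-sketch of that idea, with hypothetical accessor names rather than the actual #601 Connections API:

```python
def prune(connections, permanence_threshold=0.0):
    """Conceptual sketch of synapse & segment pruning: drop synapses whose
    permanence has decayed to the threshold, then drop segments left empty.
    Accessor names are hypothetical; see PR #601 for the real Connections
    implementation and its parameter names."""
    for seg in list(connections.all_segments()):
        for syn in list(connections.synapses_for_segment(seg)):
            if connections.permanence(syn) <= permanence_threshold:
                connections.destroy_synapse(syn)
        if connections.num_synapses(seg) == 0:
            connections.destroy_segment(seg)
```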