Pure Python version of Spatial Pooler

numenta / nupic.research

Experimental algorithms. Unsupported.

https://nupicresearch.readthedocs.io

GNU Affero General Public License v3.0

107 stars 60 forks source link

Pure Python version of Spatial Pooler #632

Closed abhi-iyer closed 2 years ago

abhi-iyer commented 2 years ago

Pure Python 3 version of Spatial Pooler, no C++ dependencies. Cleans up a lot of the old spatial pooler code (removes unused methods, consolidates helper methods, fixes unclear logic, adds comments where ambiguous logic is used, etc.) and makes it more readable.

Passes all tests.

Edited some tests to match new seed values and new expected outputs.
Skipped test cases where wrapAround=True is used. For example, skipped testUpdateBoostFactors because wrapAround=True is the default setting for SP.
Skipped testIsUpdateRound and testUpdateDutyCycleHelper because the contents of the test are already covered in other tests. Additionally, the helper methods that these tests cover are no longer standalone methods.

subutai commented 2 years ago

Thanks! Quick question for @lscheinkman: should we create a whole new package just for the spatial pooler? Or should it be combined with the temporal memory code?

abhi-iyer commented 2 years ago

We were thinking of keeping it as a new package, since the temporal memory code depends on building the nupic.research.core library. (Since this is Python only, it's not dependent on those C++ implementations.)

lscheinkman commented 2 years ago

Ideally that would be the case, but as @abhi-iyer mentioned keeping as a separate package gives us more flexibility because it does not depend on anything other than nupic.research. By putting in a separate package it also enable us to install it with our default conda env.

subutai commented 2 years ago

I see two issues with this:

It's really strange to create a package for a single file that has no additional requirements! (Note that we do want to eventually create a pytorch version of this, but that should also have no additional requirements over our regular pytorch install.)
At some point we will want to port over the version of the spatial pooler that uses the CPP bindings. What happens then? Does that go into yet another package? Or into the columns package? Or would we incorporate it into this package so that all SP implementations are in the same package?

If we want to keep this file and package as no CPP, the naming, comments, and README should reflect that.

lscheinkman commented 2 years ago

Thats a good point. I guess it would make sense to move SP to the columns package for now, so everything is consolidated in the same package and once we port the cpp code to pytorch/python we could just move it out of the packages into the frameworks directly and deprecate the package.

subutai commented 2 years ago

To be clear, I think we'll end up with two implementations: one that uses pytorch and one that uses CPP bindings.

abhi-iyer commented 2 years ago

This might be faulty logic on my part so please correct me if I'm wrong: if we keep it as a separate package with no dependencies other than nupic.research, we can update with PyTorch when necessary and still keep it in this standalone package. (i.e., if we do move to PyTorch, we wouldn't need the CPP bindings. PyTorch/NumPy implementations can both be in this spatial_pooler package.)

I forgot to make a README for this package -- I will do that.

subutai commented 2 years ago

I think the only reason for packages is to deal with dependencies. If there are no additional dependencies, you can just create a spatial_pooler directory under Frameworks and put the file there. PyTorch and Numpy versions can both be in there.

Tests would then go in nupic.research/tests/unit/frameworks

No separate install process will be needed - everyone gets access to the SP.

abhi-iyer commented 2 years ago

I think the only reason for packages is to deal with dependencies. If there are no additional dependencies, you can just create a spatial_pooler directory under Frameworks and put the file there. PyTorch and Numpy versions can both be in there.

Tests would then go in nupic.research/tests/unit/frameworks

No separate install process will be needed - everyone gets access to the SP.

I still prefer keeping it as a separate package for 2 reasons:

If we use additional dependencies in the future, we can easily add it to the setup.cfg in the spatial_pooler package. An example of this is if we move this code to PyTorch and require new SparseMatrix bindings for the CPU/GPU.
It's cleaner to have everything separated in a package (even though I understand that the original intent was to isolate dependencies). If the user chooses to use the spatial_pooler package, they can easily install it.

What do you think?

abhi-iyer commented 2 years ago

@subutai Will merge once I get your ok too