numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0

nupic should build assuming nupic.core is a standard external library #1483

Closed rhyolight closed 9 years ago

rhyolight commented 9 years ago

The nupic.core build should result in a C library that can be put into the system path for the nupic build to use. This should allow us to remove cmake from nupic altogether and use a pure setuptools build and installation.

The nupic build will build the Python bindings, and treat nupic.core as an external library. It will not build nupic.core. It will only rely on the nupic.core release (include, bin, lib) and a git checkout is not needed. The nupic build will expect the release to be in standard places. A standard install into system paths will be handled automatically. For non-standard installs the user can set environment variables (e.g. include path) just like any other C project. You will also be able to specify a particular directory for the nupic.core install as an argument to the nupic build.
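As a rough illustration of the lookup order described above (explicit build argument, then environment variable, then standard system locations), here is a minimal Python sketch. The flag name `--nupic-core-dir`, the variable `NUPIC_CORE_DIR`, the default prefixes, and the `include/nupic` layout are all assumptions made for the example; the issue does not name any of them.

```python
import os

# Hypothetical names: the flag, the environment variable, and the default
# prefixes below are illustrative assumptions, not part of the nupic build.
CORE_DIR_FLAG = "--nupic-core-dir="
CORE_DIR_ENV = "NUPIC_CORE_DIR"
DEFAULT_PREFIXES = ("/usr/local", "/usr")


def resolve_core_prefix(argv, environ, exists=os.path.isdir):
    """Return the prefix expected to hold nupic.core's include/ and lib/.

    Returns None when nothing is found, so the caller can decide what to do.
    """
    # 1. An explicit argument to the build wins.
    for arg in argv:
        if arg.startswith(CORE_DIR_FLAG):
            return arg[len(CORE_DIR_FLAG):]
    # 2. Next, an environment variable, as with other C projects.
    if environ.get(CORE_DIR_ENV):
        return environ[CORE_DIR_ENV]
    # 3. Finally, fall back to standard system locations.
    for prefix in DEFAULT_PREFIXES:
        if exists(os.path.join(prefix, "include", "nupic")):
            return prefix
    return None
```

A build script could call `resolve_core_prefix(sys.argv[1:], os.environ)` and fail early with a clear message when it returns `None`.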

Here is some pseudocode that describes the imagined build logic:

determine nupic.core include/lib path and nupic install path (default to standard locations, overridable by user)

get SHA or version of expected nupic.core (could be NULL for "take anything" or "HEAD")

if nupic.core NOT in include/lib path:
  if archive available for SHA or version and this platform:
    download archive
  else:
    git clone nupic.core at specified SHA or version tag
    build nupic.core from source
  install nupic.core library in specified install path

validate expected nupic.core SHA/version through API
if not valid throw build error
build nupic in build/ directory
install into nupic install path
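The decision flow in the pseudocode can be sketched in Python as a pure function that returns the steps to run, plus a version check. This is only a sketch under assumptions: the step names are labels invented for the example, nothing here downloads or builds anything, and treating `None` as "accept any version" is one reading of the NULL case above.

```python
def plan_build_steps(core_found, archive_available):
    """Return the ordered steps the pseudocode would take.

    core_found: nupic.core is already present in the include/lib path.
    archive_available: a prebuilt archive exists for this version/platform.
    """
    steps = []
    if not core_found:
        if archive_available:
            steps.append("download archive")
        else:
            steps.append("git clone nupic.core")
            steps.append("build nupic.core from source")
        steps.append("install nupic.core")
    steps.append("validate nupic.core version")
    steps.append("build nupic")
    steps.append("install nupic")
    return steps


def check_core_version(expected, actual):
    """Raise a build error when the installed nupic.core is not the expected one.

    Assumption: expected=None means no particular version is required.
    """
    if expected is not None and expected != actual:
        raise RuntimeError(
            "nupic.core mismatch: expected %s, found %s" % (expected, actual)
        )
```

Keeping the plan separate from the side effects makes the three paths (already installed, archive, source build) easy to check without touching the network or a compiler.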

subutai commented 9 years ago

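This is actually not an easy task. There are two competing issues:

  - We need to compile the Python bindings somewhere. We don't have a choice about this. They have to be built either in nupic, nupic.core, or somewhere else.
  - We want to keep nupic.core independent of any particular language. An original goal of nupic.core was to keep it a small and tight C++ library.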

I think our 3 choices are:

  1. Compile the bindings as part of nupic but do it in a way that is cleaner than what is done today. For example, the nupic build should just use precompiled nupic.core binaries instead of rebuilding everything using the nupic build system. However, some C++ compilation will be necessary. A number of Python libraries require C++ compilation, so this is not the end of the world.
  2. Compile the bindings as part of nupic.core. This means nupic.core would be dependent on Python and we would need to run a bunch of Python tests within it. This will make nupic.core pretty bloated - not sure I like this approach. It gets worse as we support more and more languages: we don't want to have to compile the bindings for every language in nupic.core itself.
  3. Move the bindings compilations into a third repository. This would make installing nupic even more complex - I don't think we should do this.

None of these three choices are ideal, but of these I think 1) is still the best. I suggest the following approach: stick with 1 but have precompiled binaries for nupic and nupic.core available for a number of platforms. This will help the vast majority of nupic users. For nupic hackers who don't care about modifying nupic.core, they can use the nupic.core binaries. They will need to rebuild the bindings whenever nupic.core changes, but this is much better than today's scenario. Only nupic.core hackers will need to worry about building both.
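To make option 1 a little more concrete: linking the bindings against a precompiled nupic.core mostly comes down to pointing the extension build at the release's include and lib directories. The sketch below only assembles keyword arguments of the kind `setuptools.Extension` accepts (`include_dirs`, `library_dirs`, `libraries`, `language`). The library name `nupic_core` and the `include`/`lib` layout are assumptions for illustration, not confirmed names.

```python
import os


def core_extension_kwargs(core_prefix, lib_name="nupic_core"):
    """Build keyword arguments for an extension that links against a
    prebuilt nupic.core instead of recompiling it.

    lib_name is a hypothetical library name; the real one may differ.
    """
    return {
        # Headers shipped with the nupic.core release.
        "include_dirs": [os.path.join(core_prefix, "include")],
        # Directory holding the precompiled library.
        "library_dirs": [os.path.join(core_prefix, "lib")],
        # Link against the prebuilt library rather than its sources.
        "libraries": [lib_name],
        "language": "c++",
    }
```

In a `setup.py`, these could be passed along with the bindings' own sources, for example `Extension(name, sources=binding_sources, **core_extension_kwargs(prefix))`, so that only the bindings are compiled.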

cogmission commented 9 years ago

Subutai,

I really disagree :) I think there should be a nupic.bindings repository with all the bindings and then the install process would bundle the appropriate binding with nupic.core - still keeping it simple for the end user and also keeping the nupic and nupic.core libraries really clean. The complexity would just be packaging the core with the bindings.

So you would have:

  - nupic.core --> python binding
  - nupic.core --> java binding
  - nupic.core --> whatever binding

These would be packages available for download/install - really simple for the end user and cleaner for the hacker developers.

Regards, David

On Wed, Nov 5, 2014 at 11:17 AM, Subutai Ahmad notifications@github.com wrote:

This is actually not an easy task. There are two competing issues:

  - We need to compile the Python bindings somewhere. We don't have a choice about this. They have to be built either in nupic, nupic.core, or somewhere else.
  - We want to keep nupic.core independent of any particular language. An original goal of nupic.core was to keep it a small and tight C++ library.

cogmission commented 9 years ago

Additional point: this "separates concerns" very cleanly and puts the onus of language-specific integration only on the binding build/dev.

rhyolight commented 9 years ago

I don't like (2) either. I have always wanted to get to (3) at some point down the road, although I'm not sure now is the right time. It would make more sense to establish a bindings protocol once nupic.core is completely independent (I think Step 2 of this wiki plan is still valid here: https://github.com/numenta/nupic/wiki/nupic.core-Extraction-Plan#step-2-prepare-for-nupiccore-release).

I'm fine with starting at (1) for initial cleanup while keeping the original goal of eventually splitting out a bindings library in sight. Now might not be the best time to do that, though. I'm much more anxious about getting a release process going than creating a bindings repo. That will complicate things, and I'd like to start out the release process as simple as possible.

cogmission commented 9 years ago

I forgot you guys already had this plan. That would make my suggestion premature then, you guys are right.

subutai commented 9 years ago

So you are advocating for 3) but with a single repository that contains all language bindings. My concern is that this will lead to repo and SHA hell. First, it immediately introduces another dependency. A nupic hacker will need to worry about all three repositories. Now, if we have another language, the bindings repository will be linked to the SHAs of two other repositories. This gets worse as we add more languages.

A second problem is that the single bindings repository will also need to contain tests and compilation scripts for all supported languages. This is going to be incredibly complex to maintain. Why should a Java developer have to worry about compiling Ruby bindings and tests?

If you really want to separate concerns, then you should have a separate bindings repo for each language. Maybe this is the best end goal, but I really don't think now is the time to introduce all this complexity. Let's walk before we run.

rhyolight commented 9 years ago

After all of us commenting at the same time on this issue :smirk:, I think we all agree on (1) for a start. No matter what direction we go from there, that is the place to start.

cogmission commented 9 years ago

Subutai,

I agree with a separate repo per language. Internally I was thinking precisely that when I was saying "bindings" repo, but I was lumping them all together in my example. Distinct repositories make more sense, and are the cleaner goal to move toward - eventually...

David

subutai commented 9 years ago

@cogmission Makes sense, thanks. We are having some more discussion about this today, so I will be updating it (probably tomorrow), and it would be great to get more feedback then.

oxtopus commented 9 years ago

I completely agree with (1) -- keep the bindings in nupic, but clean it up. One thing I'm not sure of, however, is what to do with CMake re:

This should allow us to remove cmake from nupic altogether and use a pure setuptools build and installation.

There are two different use-cases that need to be served:

  1. End users of nupic. Users who will never touch nupic.core or the internals of nupic. This would be, for example, production environments or anyone who will be using nupic but not developing for it. In this case, we ought to optimize for user experience and ease of getting nupic up and running. Ideally, nupic is installable by name (pip install nupic or easy_install nupic), with binary editions available for Linux and OS X (maybe Windows, some day), and also trivially installed from a checkout (python setup.py install). Neither git nor cmake ought to be required, although we can assume a modern compiler is available and rely on setuptools to handle the build/install, re-using nupic.core binary releases.
  2. Developers of nupic. Users who will be modifying nupic.core and/or nupic. In this case, we ought to optimize for developer productivity. I don't think we can rely on setuptools alone to provide a workflow that supports incremental builds, for example. Nor would you be able to easily work in an IDE like Xcode or Eclipse.

I'd like to be able to address both classes of users, but if I had to choose one right now, it'd be developers. In which case, I think cmake should permanently stay (at least for the foreseeable future), with the idea being that once the current build is cleaned up and developers are more productive, we can then focus on the end user, making a cmake-based installation optional.

rhyolight commented 9 years ago

I agree with @oxtopus. Let's put the developers first, as long as we don't make it any harder for users with these enhancements. Improvements to the user installation flow will happen with pip after we get proper releases in place.