numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0

fast head change during core extraction #596

Closed sjmackenzie closed 10 years ago

sjmackenzie commented 10 years ago

numenta/nupic.core git subtree is completed. Please reference nupic.core

I have removed all history that doesn't pertain to the nta folder. The extraction is an as-is removal of the nta folder.

In an effort to reduce freeze time on nupic/nta my suggestion is to:

My suggestion is to start on the nupic.core build scripts only after nupic swaps to cmake build scripts. This will allow nupic.core to have a build script that can be built independently of nupic while remaining compatible with nupic's build.

Let's discuss the way forward.

rhyolight commented 10 years ago

I like the idea of a fast integration so nupic is building off nupic.core quickly. We can come in later and clean up nupic.core to build itself. I know that @oxtopus will have some comments about the submodule. @scottpurdy please review and comment if necessary.

rhyolight commented 10 years ago

Relates to #590.

oxtopus commented 10 years ago

I would strongly advise against using git submodule here to link numenta/nupic.core to numenta/nupic. In fact, I don't even think it's safe to assume that nupic.core will always nestle into nupic the way nta/ currently does.

Remember: For some, nupic.core will come in the form of a binary-only release that gets installed and they never see or touch it, even if they are actively developing on nupic (or future language-specific implementations). For nupic.core to reach its goals of being portable and accessible, it needs to stand on its own.

sjmackenzie commented 10 years ago

It will be a mistake to distribute nupic.core as a CLI binary. The idea is to build executables linking to a system wide nupic.core shared library. The last thing we need is another OpenSSH debacle.

I'm fully aware that nupic.core will not be a permanent resident of the nupic directory structure.

This is one of many possible measures to make sure nupic is without a head for as short a time as possible. I was under the impression that nupic needed an almost immediate head transplant. This is my stop-gap solution. What do you suggest in place of git submodules?

Secondly, what is the problem with git submodules in this context?

subutai commented 10 years ago

I agree with both of you. I don't think @oxtopus said CLI binary (static libraries are also binaries). nupic.core should build static libraries, plus test and sample executables. A binary pre-built release would include all of that for specific platforms. We may also want to output shared libraries - ?? There should also be a clear set of clean include files that comprise the API.

scottpurdy commented 10 years ago

Not really sure where this feedback goes, but I think the convention for C++ projects is to have a top-level src directory rather than building from root. That allows you to build everything by specifying a single directory rather than specifying algorithms, types, etc., and keeps the code separate from other stuff.

sjmackenzie commented 10 years ago

Library distribution is tangential to this discussion. I am fully aware of @oxtopus's "Remember" point. Please remember that I have been advocating this from practically day 1.

May we keep on topic and discuss the stop-gap measure? If not git submodules, then what?

sjmackenzie commented 10 years ago

@scottpurdy the requirement was to extract it as-is then iterate from there via pull requests.

scottpurdy commented 10 years ago

It was in an nta directory before so I wouldn't call this as-is. But also not picky about when it happens, just want to put the idea out there.

rhyolight commented 10 years ago

This ticket is about getting nupic building off the current nupic.core quickly. We can hash out the structural details (both build and dir structure) later. To close this ticket, we'll need a PR on nupic that adjusts its build to pull in nupic.core and build it as it exists right now. nupic.core does not currently have a build script to build itself, and we can work on that (as defined in #584) in parallel with this ticket.

The discussion on this issue should be only about the technical details of that initial dependency. @oxtopus, if you have concerns about using a git submodule, you should explain what the dangers are and propose alternatives. We know that nupic.core will have its own build process once #584 is complete. I think eventually it would be best for nupic's build to simply run the nupic.core build process as any client would do it. But for now, the issue is getting the dependency in place quickly so this C++ code freeze is as short as possible. @sjmackenzie is trying to help us out in this regard by creating this ticket.

Please keep comments on-topic and relevant to this issue.

rhyolight commented 10 years ago

@sjmackenzie said:

> Please remember that I have been advocating this from practically day 1.

Actually, you've been advocating this since before the code was even open sourced! :wink:

scottpurdy commented 10 years ago

This is all fine with me once the question of how to pull core into nupic is resolved. I really like that we were able to keep just the nta-related history in the new repo - didn't know that was possible!

oxtopus commented 10 years ago

re: linking the repositories, do we need to do anything at all? Initially, I'd be ok with requiring separate checkouts and deferring a decision once we have a better idea on how the nupic.core build will change.

rhyolight commented 10 years ago

Does it matter how the nupic.core build will change, as long as we know there will eventually be a build script there? I don't think nupic should ever be in a state where people have to check out two repos manually to build and run, because this will immediately break anyone with a nupic checkout who pulls the latest code and expects it to build (a logical expectation).

We have the tools to link them without concerning ourselves with the core build process; we just have to agree on how to do it. Stewart proposes using a git submodule, and although I've personally had troubles with them for linking more than two repositories, it's probably fine for a simple 1v1 dependency.

With the submodule route, there will always be a SHA association between nupic and nupic.core, but it's baked into the submodule mechanism. If nupic.core gets updated and a new SHA is head of master, the nupic repo's submodule association will need to be updated properly (manually) to point to the new nupic.core SHA (head of master), which will require a commit and push within nupic for that new association.
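As a rough illustration of that manual update, here is a minimal Python sketch that lists the git commands involved in moving the submodule pointer. The function name, the commit message, and the example SHA are hypothetical; the `nta` path is taken from this thread, and the remote name `origin` is an assumption.

```python
import shlex
from typing import List


def submodule_bump_commands(submodule_path: str, new_sha: str) -> List[List[str]]:
    """Return the git commands that re-point a submodule at new_sha.

    Hypothetical helper: it only builds the command lists, it does not run them.
    """
    return [
        # Bring the new commits into the submodule checkout (assumes a remote named "origin").
        ["git", "-C", submodule_path, "fetch", "origin"],
        # Move the submodule working tree to the desired commit.
        ["git", "-C", submodule_path, "checkout", new_sha],
        # Stage the changed submodule pointer in the parent repository.
        ["git", "add", submodule_path],
        # Record the new association, then publish it.
        ["git", "commit", "-m", f"Point {submodule_path} at {new_sha[:7]}"],
        ["git", "push"],
    ]


if __name__ == "__main__":
    example_sha = "0123456789abcdef0123456789abcdef01234567"  # placeholder, not a real commit
    for cmd in submodule_bump_commands("nta", example_sha):
        print(shlex.join(cmd))
```

The last three commands are the "commit and push within nupic" part: the parent repo stores nothing but the submodule's path and the SHA it points at.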

If we decide not to use a submodule, we could have the build manually fetch nupic.core, but we'd still need to have a SHA checked into source code somewhere that we update whenever we want to point nupic to a different SHA in nupic.core. (I don't think we ever want to have nupic implicitly dependent on the master branch of nupic.core.) Our build script will need to read the SHA out of a file somewhere and pull the right version before building.
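For comparison, the non-submodule route could look something like the sketch below. Everything here is an assumption for illustration: the pin file, the helper names, and the clone URL are hypothetical, and none of them exist in nupic's build.

```python
import re
from pathlib import Path
from typing import List

# A full git commit SHA-1 is 40 lowercase hex characters.
SHA_PATTERN = re.compile(r"[0-9a-f]{40}")


def read_pinned_sha(pin_file: Path) -> str:
    """Read the pinned nupic.core commit from a checked-in text file (hypothetical)."""
    text = pin_file.read_text().strip()
    if not SHA_PATTERN.fullmatch(text):
        # Reject branch names such as "master" so the dependency is never implicit.
        raise ValueError(f"{pin_file} does not contain a full 40-character commit SHA")
    return text


def fetch_commands(repo_url: str, dest: str, sha: str) -> List[List[str]]:
    """Return the git commands that fetch the repo and check out the pinned commit."""
    return [
        ["git", "clone", repo_url, dest],
        ["git", "-C", dest, "checkout", sha],
    ]


if __name__ == "__main__":
    # Assumed file name and URL, for illustration only.
    sha = read_pinned_sha(Path("nupic_core_sha.txt"))
    for cmd in fetch_commands("https://github.com/numenta/nupic.core.git", "nta", sha):
        print(" ".join(cmd))
```

Either way, bumping the dependency is an explicit commit in nupic. The difference is whether git or the build script does the bookkeeping.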

My vote is to go ahead with the submodule, because all my woes with them involved a complex repo hierarchy with several submodules involved and duplicate submodule dependencies between them. I think we'll be okay with one simple submodule.

@oxtopus If you have arguments against this, please tell us.

oxtopus commented 10 years ago

@rhyolight re: your first point, true, with submodules, you don't have to manually check out two different repositories, but you do have to manually run esoteric submodule-specific commands for it all to work, and you have to know in advance that you need to do it. As a user, I have to clone one repo (easy), and follow through with the submodule steps (not so easy).

I don't envision the builds being integrated, and we need not fret over linking the repositories. I don't think we should make the assumption that everyone who pulls down nupic should also have to build nupic.core, either. Some may be perfectly happy installing it from a pre-built binary, or building from a non-repo source release. Meanwhile, if they do want to build nupic.core from the source repository, then it's done separately.

oxtopus commented 10 years ago

FWIW, as long as we're confining this discussion to the context of keeping nupic stable while we work out the details, I'll approve a Pull Request linked to this issue that includes a submodule into nupic.core. Of course, that assumes:

  1. Travis-CI builds nupic and all tests pass
  2. User need not do anything extra -- git submodule commands are built into the nupic build.sh file.

rhyolight commented 10 years ago

That works for me, too.

rhyolight commented 10 years ago

Can someone check out https://github.com/numenta/nupic/pull/597 and try it out for me? All you need to do is run ./build.sh (hopefully).

rhyolight commented 10 years ago

Ok, #597 removes all C++ from nupic and replaces it with a nupic.core submodule at /nta. The only changes I needed to make to the build were to add the proper git submodule commands and to update the Travis config to ignore the submodule (Travis automatically expands submodules by default). The merge of this PR should get us the simplest working nupic that builds nupic.core as a submodule.
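The thread doesn't show the actual commands added in #597, so the following is only a sketch of what a "submodule step" at the start of a build might look like, written in Python rather than as part of build.sh. The function name and the injectable `run` parameter are hypothetical; `git submodule update --init --recursive` is a standard git command that clones any missing submodules and checks out the SHA recorded in the parent repo.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def ensure_submodules(
    repo_root: Path,
    run: Callable[..., object] = subprocess.check_call,
) -> List[str]:
    """Make sure all submodules are cloned and at their recorded commits.

    Hypothetical build step; `run` is injectable so the step can be tested
    without invoking git.
    """
    cmd = ["git", "submodule", "update", "--init", "--recursive"]
    # Safe to run on every build: it is a no-op when the submodule is already
    # at the pinned commit, and it re-syncs after a pull that moved the pointer.
    run(cmd, cwd=str(repo_root))
    return cmd


if __name__ == "__main__":
    ensure_submodules(Path("."))
```

Running this unconditionally before the rest of the build is what lets a user clone and build without knowing any submodule commands.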