numenta / nupic-legacy

Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex.
http://numenta.org/
GNU Affero General Public License v3.0

Remove MySQL dependency from swarming (use SQLite) #1577

Open rhyolight opened 9 years ago

rhyolight commented 9 years ago

May be obsolete because of #1236.

cogmission commented 9 years ago

Question: Is there a configuration that is platform independent, i.e. accessible from most languages? Is SQLite that?

rhyolight commented 9 years ago

The configuration should be platform independent. @scottpurdy tried to move from MySQL to SQLite a while back but ran into problems, so there may be some differences in SQL syntax; I'm not sure. I'm also not certain this is even necessary given #1236, but it's been on my todo list forever, so I thought I'd create a ticket.
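One class of problems such a port typically hits is dialect differences in DDL. As a hedged illustration (this is a generic example, not the actual failure @scottpurdy ran into): MySQL spells auto-incrementing keys `AUTO_INCREMENT`, which SQLite rejects as a syntax error, while SQLite uses `AUTOINCREMENT` and only on an `INTEGER PRIMARY KEY` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# MySQL-style DDL: AUTO_INCREMENT is not valid SQLite syntax.
mysql_ddl = "CREATE TABLE jobs (id INTEGER PRIMARY KEY AUTO_INCREMENT, status TEXT)"
try:
    conn.execute(mysql_ddl)
except sqlite3.OperationalError as e:
    print("MySQL DDL rejected by SQLite:", e)

# SQLite spells it AUTOINCREMENT (and only on INTEGER PRIMARY KEY).
sqlite_ddl = "CREATE TABLE jobs (id INTEGER PRIMARY KEY AUTOINCREMENT, status TEXT)"
conn.execute(sqlite_ddl)
conn.execute("INSERT INTO jobs (status) VALUES ('running')")
print(conn.execute("SELECT id, status FROM jobs").fetchall())  # [(1, 'running')]
```

A real port would have to audit every statement for differences like this (plus type affinity, `NOW()` vs `datetime('now')`, locking semantics, etc.).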

cogmission commented 9 years ago

I took a look at #1236 and to me it doesn't really bear upon platform independence. #1236 is about decoupling and module independence, which could be accomplished without platform independence. Platform independence takes this one step further but doesn't impact #1236. So I guess platform independence should be a separate issue, one that maybe depends on the prior completion of #1236?

rhyolight commented 9 years ago

#1236 is going to be complex, potentially a complete rebuild of swarming, depending on how involved @scottpurdy gets. If I remember correctly, part of his plan was to swap out MySQL for SQLite as part of the work.

scottpurdy commented 9 years ago

I think @cogmission is right that pulling hypersearch into a standalone library and removing the dependency on MySQL are separate issues. It might be unnecessarily messy to do the latter before the former, though.

rhyolight commented 9 years ago

Agreed, I just wanted to get this issue logged to make sure it was not forgotten. I had it in a personal TODO list.

tristanls commented 9 years ago

The MySQL dependency comes as a surprise when following the One Hot Gym tutorial (https://github.com/numenta/nupic/tree/master/examples/opf/clients/hotgym/prediction/one_gym):

[root@38796793db34 one_gym]# ./swarm.py 
This script runs a swarm on the input data (rec-center-hourly.csv) and
creates a model parameters file in the `model_params` directory containing
the best model found by the swarm. Dumps a bunch of crud to stdout because
that is just what swarming does at this point. You really don't need to
pay any attention to it.

=================================================
= Swarming on rec-center-hourly data...
= Medium swarm. Sit back and relax, this could take awhile.
=================================================
Generating experiment files in directory: /github/numenta/nupic/examples/opf/clients/hotgym/prediction/one_gym/swarm...
Writing  313 lines...
Writing  113 lines...
done.
None
WARNING:com.numenta.nupic.database.ClientJobsDAO.ClientJobsDAO:[] First failure in <function connect at 0x20e5938>; initial retry in 0.1 sec.; timeoutSec=300. Caller stack:
  File "./swarm.py", line 109, in <module>
    swarm(INPUT_FILE)
  File "./swarm.py", line 101, in swarm
    modelParams = swarmForBestModelParams(SWARM_DESCRIPTION, name)
  File "./swarm.py", line 78, in swarmForBestModelParams
    verbosity=0
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/swarming/permutations_runner.py", line 276, in runWithConfig
    return _runAction(runOptions)
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/swarming/permutations_runner.py", line 217, in _runAction
    returnValue = _runHyperSearch(runOptions)
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/swarming/permutations_runner.py", line 146, in _runHyperSearch
    search = _HyperSearchRunner(runOptions)
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/swarming/permutations_runner.py", line 414, in __init__
    self.__cjDAO = _clientJobsDB()
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/swarming/permutations_runner.py", line 378, in _clientJobsDB
    return cjdao.ClientJobsDAO.get()
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/support/decorators.py", line 59, in exceptionLoggingWrap
    return func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/database/ClientJobsDAO.py", line 566, in get
    cjDAO.connect()
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/support/decorators.py", line 59, in exceptionLoggingWrap
    return func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/support/decorators.py", line 241, in retryWrap
    timeoutSec, ''.join(traceback.format_stack()), exc_info=True)
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/support/decorators.py", line 214, in retryWrap
    result = func(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/database/ClientJobsDAO.py", line 655, in connect
    with ConnectionFactory.get() as conn:
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/database/Connection.py", line 167, in get
    return cls._connectionPolicy.acquireConnection()
  File "/usr/lib64/python2.7/site-packages/nupic-0.0.39.dev0-py2.7-linux-x86_64.egg/nupic/database/Connection.py", line 553, in acquireConnection
    dbConn = self._pool.connection(shareable=False)
  File "/usr/lib/python2.7/site-packages/DBUtils/PooledDB.py", line 331, in connection
    con = self.steady_connection()
  File "/usr/lib/python2.7/site-packages/DBUtils/PooledDB.py", line 279, in steady_connection
    *self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/DBUtils/SteadyDB.py", line 134, in connect
    failures, ping, closeable, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/DBUtils/SteadyDB.py", line 186, in __init__
    self._store(self._create())
  File "/usr/lib/python2.7/site-packages/DBUtils/SteadyDB.py", line 190, in _create
    con = self._creator(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/pymysql/__init__.py", line 88, in Connect
    return Connection(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 626, in __init__
    self._connect()
  File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 818, in _connect
    2003, "Can't connect to MySQL server on %r (%s)" % (self.host, e))
OperationalError: (2003, "Can't connect to MySQL server on 'localhost' ([Errno 111] Connection refused)")

It would help to document this dependency in the meantime.

ghost commented 9 years ago

@tristanls Following this Stack Overflow answer (http://stackoverflow.com/questions/23614624/cant-connect-as-root-without-a-password-sqlalchemy), I ran the following in the MySQL console (after `mysql -u root -p`):

use mysql;
update user set password=null where User='root' and Host='localhost';
flush privileges;

Note that this removes the root password entirely, which is only advisable on a throwaway development machine.

After that, the One Hot Gym example works.

scottpurdy commented 9 years ago

FYI @oxtopus: hopefully we can move towards a Hypersearch that supports entirely in-memory storage, SQLite, or MySQL (for swarming on a cluster), or something along those lines. Just something to keep in mind while working on #1236.
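One point in SQLite's favor here is that it covers both the in-memory and on-disk cases through the connection string alone, so a single backend could serve throwaway single-process swarms and persistent runs. A minimal sketch, assuming Python's standard-library sqlite3 module (the table layout and function name are hypothetical, not from the NuPIC codebase):

```python
import sqlite3

def open_job_store(path=":memory:"):
    """Open a job-results store. Pass ':memory:' for a throwaway
    in-memory database, or a file path for results that should
    survive the process."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs "
        "(id INTEGER PRIMARY KEY, params TEXT, score REAL)"
    )
    return conn

# In-memory: nothing touches disk, no server to configure.
mem = open_job_store()
mem.execute("INSERT INTO jobs (params, score) VALUES (?, ?)", ("{}", 0.42))
print(mem.execute("SELECT COUNT(*) FROM jobs").fetchone()[0])  # 1
```

Cluster swarming would still need a networked engine like MySQL, since an SQLite file is awkward to share safely across machines.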

pehlert commented 9 years ago

:+1: I'd like to see MySQL go, too, but I'm not sure about swarming on a cluster and your work on #1236. Do you think it would be appropriate to implement an adapter-style pattern for the swarming database, and is that something worth working on with #1236 in mind?

oxtopus commented 9 years ago

@pehlert Yes, ideally. The storage engine should be configurable (in-memory, MySQL, SQLite, etc.) and not tightly integrated into the algorithm and PSO implementation itself. I'm currently reviewing options and am curious to know what you had in mind.

pehlert commented 9 years ago

My basic idea is to implement the adapter pattern with different adapter classes (MySQLAdapter, MemoryAdapter, ...) that act as a proxy to the underlying storage engine, similar to what modern ORM frameworks do. I'd be willing to take care of the implementation, too; it should be fairly straightforward, IMO. @scottpurdy, what do you think?
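A minimal sketch of what such an adapter layer could look like. All class and method names here are hypothetical, chosen for illustration, and not taken from the NuPIC codebase:

```python
from abc import ABC, abstractmethod

class JobStoreAdapter(ABC):
    """Hypothetical interface the swarming code would talk to,
    regardless of which engine backs it."""

    @abstractmethod
    def save_result(self, job_id, score):
        """Record the score for one evaluated model."""

    @abstractmethod
    def best_result(self):
        """Return the (job_id, score) pair with the best score."""

class MemoryAdapter(JobStoreAdapter):
    """Backs the store with a plain dict; no external dependency."""

    def __init__(self):
        self._results = {}

    def save_result(self, job_id, score):
        self._results[job_id] = score

    def best_result(self):
        # Lower score = better model, mirroring an error metric.
        return min(self._results.items(), key=lambda kv: kv[1])

# A MySQLAdapter or SQLiteAdapter would implement the same interface
# with SQL statements; callers never know which one they hold.
store = MemoryAdapter()
store.save_result("job-1", 0.9)
store.save_result("job-2", 0.3)
print(store.best_result())  # ('job-2', 0.3)
```

The swarming code would then receive a `JobStoreAdapter` from configuration instead of constructing a MySQL connection directly, which is exactly the decoupling #1236 is after.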

oxtopus commented 9 years ago

@rhyolight I think this specific issue is obsolete as you suggest, or at least there will be a subtask created as a result of #1236 that will replace this one.

pehlert commented 9 years ago

Not sure; wouldn't it be possible to work on the two issues in parallel? Especially with the open discussion on #1236, which will probably delay that one.

oxtopus commented 9 years ago

@pehlert Yes, it's possible. However, I would consider anything related to swarming to have a very limited lifetime.

pehlert commented 9 years ago

We will still have swarming and need to coordinate processes in the future, though, so I doubt that this would be worthless.

scottpurdy commented 9 years ago

Might as well leave this open to track the specific functionality request (run swarming without MySQL), but I agree it probably isn't a good idea to work on this in the existing Hypersearch code, since a lot of that will change soon.