dssg / givinggraph

An API tool to help understand the relationships between non-profits, for-profits, and the causes they support.
https://github.com/dssg/givinggraph/wiki/API
MIT License

Get SQLAlchemy working correctly across Celery processes #13

Closed (JohnHBrock closed this issue 10 years ago)

JohnHBrock commented 10 years ago

Instances of a SQLAlchemy class called Nonprofit are passed to various Celery tasks. When our code tries to access attributes of a Nonprofit instance inside a task, we get this error:

[2013-08-19 16:09:51,628: ERROR/MainProcess] Task tasks.add_news_articles_to_db_for_nonprofit[59041d16-7680-4214-a711-2756bdda9807] raised exception: DetachedInstanceError('Instance <Nonprofit at 0x1670a430> is not bound to a Session; attribute refresh operation cannot proceed',)
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\celery\task\trace.py", line 233, in trace_task
    R = retval = fun(*args, **kwargs)
  File "C:\Python27\lib\site-packages\celery\task\trace.py", line 420, in protected_call
    return self.run(*args, **kwargs)
  File "D:\givinggraph\givinggraph\tasks.py", line 199, in add_news_articles_to_db_for_nonprofit
    logger.info('Inside add_news_articles_to_db_for_nonprofit(nonprofit) for nonprofits_id {0}'.format(nonprofit.nonprofits_id))
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\attributes.py", line 316, in __get__
    return self.impl.get(instance_state(instance), dict_)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\attributes.py", line 611, in get
    value = callable_(passive)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\state.py", line 375, in __call__
    self.manager.deferred_scalar_loader(self, toload)
  File "C:\Python27\lib\site-packages\sqlalchemy\orm\loading.py", line 555, in load_scalar_attributes
    (state_str(state)))
None: Instance <Nonprofit at 0x1670a430> is not bound to a Session; attribute refresh operation cannot proceed
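For reference, the same exception can be triggered without Celery at all. This is only a minimal sketch: the in-memory SQLite database, the simplified Nonprofit model, and its `name` column are assumptions made up for illustration, not the actual givinggraph schema. It shows that an instance whose attributes have been expired (by a commit) and which is no longer attached to a session fails on attribute access, which is presumably what happens to the instance once it arrives in the worker.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.orm.exc import DetachedInstanceError

Base = declarative_base()


class Nonprofit(Base):
    # Hypothetical, stripped-down stand-in for the real model.
    __tablename__ = "nonprofits"
    nonprofits_id = Column(Integer, primary_key=True)
    name = Column(String)


# Assumption: an in-memory SQLite database instead of the project's real DB.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
nonprofit = Nonprofit(nonprofits_id=1, name="Example Org")
session.add(nonprofit)
session.commit()  # with the default expire_on_commit=True, attributes are now expired
session.close()   # the instance is now detached from any session

try:
    # Reading an expired attribute needs a refresh from the DB,
    # which is impossible without a session.
    nonprofit.nonprofits_id
    raised = False
except DetachedInstanceError:
    raised = True
```

Here `raised` ends up true, with the same "not bound to a Session; attribute refresh operation cannot proceed" message as in the log above.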

JohnHBrock commented 10 years ago

Relevant: http://prschmid.blogspot.com/2013/04/using-sqlalchemy-with-celery-tasks.html

aronwc commented 10 years ago

Hmm... if that gets too hairy, can we just pass nonprofits_id instead (and then retrieve the object from the DB in each task)?

JohnHBrock commented 10 years ago

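Yeah, I think that's what we'll have to do. It looks like Celery serializes the SQLAlchemy instances when it passes them to the tasks, but the copies that arrive in the worker are detached: the DB session they were bound to lives in the original process, so any attribute that needs a refresh can't be loaded.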
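A rough sketch of what passing the ID could look like. Everything beyond the names already in this issue is an assumption: the `session_scope` helper, the `name` column, and the in-memory SQLite engine are hypothetical, and the task is written as a plain function so the snippet runs without a broker (in tasks.py it would keep its existing Celery task decorator).

```python
from contextlib import contextmanager

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Nonprofit(Base):
    # Hypothetical, stripped-down stand-in for the real model.
    __tablename__ = "nonprofits"
    nonprofits_id = Column(Integer, primary_key=True)
    name = Column(String)


# Assumption: an in-memory SQLite database instead of the project's real DB.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)


@contextmanager
def session_scope():
    """Hypothetical helper: one short-lived session per unit of work."""
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def add_news_articles_to_db_for_nonprofit(nonprofits_id):
    # The task receives only the primary key, which serializes trivially,
    # and loads a fresh instance bound to a session owned by this process.
    with session_scope() as session:
        nonprofit = (
            session.query(Nonprofit)
            .filter_by(nonprofits_id=nonprofits_id)
            .one()
        )
        # Read what is needed while the session is still open.
        return nonprofit.name


# Caller side: store the row, then hand the task the ID, not the object.
with session_scope() as session:
    session.add(Nonprofit(nonprofits_id=1, name="Example Org"))

result = add_news_articles_to_db_for_nonprofit(1)
```

The cost is one extra query per task, but each worker then only ever touches instances attached to its own session.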