IATI / IATI-Datastore

An open-source datastore for IATI data with RESTful web API providing XML, JSON, CSV plus ETL tools
http://datastore.iatistandard.org/

Having more than one worker causes database conflicts #230

Closed Bjwebb closed 3 years ago

Bjwebb commented 9 years ago

If more than one iati queue background worker runs at the same time, we get IntegrityErrors: the workers try to insert the same Organisation row, which violates the Organisation uniqueness constraint.

andylolz commented 7 years ago

In case it’s useful, here’s an example stack trace of this:

11:42:44 IntegrityError: (IntegrityError) duplicate key value violates unique constraint "organisation_ref_name_type_key"
DETAIL:  Key (ref, name, type)=(, Directorate-general Development Cooperation and Humanitarian Aid, 10) already exists.
 'INSERT INTO organisation (ref, name, type) VALUES (%(ref)s, %(name)s, %(type)s) RETURNING organisation.id' {'ref': u'', 'type': u'10', 'name': u'Directorate-general Development Cooperation and Humanitarian Aid'}
Traceback (most recent call last):
  File "/IATI-Datastore/env/lib/python2.7/site-packages/rq/worker.py", line 411, in perform_job
    rv = job.perform()
  File "/IATI-Datastore/env/lib/python2.7/site-packages/rq/job.py", line 343, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/IATI-Datastore/iati_datastore/iatilib/crawler.py", line 297, in update_activities
    parse_resource(resource)
  File "/IATI-Datastore/iati_datastore/iatilib/crawler.py", line 250, in parse_resource
    parse_activity(new_identifiers, old_xml, resource)
  File "/IATI-Datastore/iati_datastore/iatilib/crawler.py", line 232, in parse_activity
    db.session.flush()
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 149, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1814, in flush
    self._flush(objects)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1896, in _flush
    flush_context.execute()
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 372, in execute
    rec.execute(self)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 525, in execute
    uow
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 63, in save_obj
    table, insert)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 565, in _emit_insert_statements
    execute(statement, params)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 664, in execute
    params)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 764, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 878, in _execute_context
    context)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 871, in _execute_context
    context)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 320, in do_execute
    cursor.execute(statement, parameters)
IntegrityError: (IntegrityError) duplicate key value violates unique constraint "organisation_ref_name_type_key"
DETAIL:  Key (ref, name, type)=(, Directorate-general Development Cooperation and Humanitarian Aid, 10) already exists.
 'INSERT INTO organisation (ref, name, type) VALUES (%(ref)s, %(name)s, %(type)s) RETURNING organisation.id' {'ref': u'', 'type': u'10', 'name': u'Directorate-general Development Cooperation and Humanitarian Aid'}

Being limited to a single worker sucks; it would be great to fix this one.
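
One common way to make this kind of insert safe with several workers is an "insert first, fall back to select" get-or-create, which lets the unique constraint arbitrate instead of checking for the row beforehand. The following is only a minimal sketch of that pattern, not the datastore's actual code: it uses the standard-library `sqlite3` module as a stand-in for the PostgreSQL/SQLAlchemy setup in the trace, and `get_or_create_organisation`, `SCHEMA`, `worker_a` and `worker_b` are hypothetical names made up for illustration.

```python
import os
import sqlite3
import tempfile

# Assumed schema: mirrors the (ref, name, type) unique constraint from the trace.
SCHEMA = """
CREATE TABLE IF NOT EXISTS organisation (
    id INTEGER PRIMARY KEY,
    ref TEXT NOT NULL,
    name TEXT NOT NULL,
    type TEXT NOT NULL,
    UNIQUE (ref, name, type)
)
"""


def get_or_create_organisation(conn, ref, name, type_):
    """Return the id of the matching organisation row, inserting it if needed."""
    key = (ref, name, type_)
    try:
        # Insert first and let the unique constraint decide which worker wins.
        cur = conn.execute(
            "INSERT INTO organisation (ref, name, type) VALUES (?, ?, ?)", key
        )
        conn.commit()
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Another worker already inserted this row: discard our attempt
        # and reuse the existing row instead of failing the whole job.
        conn.rollback()
        row = conn.execute(
            "SELECT id FROM organisation WHERE ref = ? AND name = ? AND type = ?",
            key,
        ).fetchone()
        return row[0]


# Two connections to the same database file stand in for two workers.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
worker_a = sqlite3.connect(path)
worker_b = sqlite3.connect(path)
worker_a.execute(SCHEMA)
worker_a.commit()

name = "Directorate-general Development Cooperation and Humanitarian Aid"
id_a = get_or_create_organisation(worker_a, "", name, "10")
id_b = get_or_create_organisation(worker_b, "", name, "10")
print(id_a == id_b)  # both workers end up with the same row
```

In the real crawler the rollback would have to be limited to the failed insert rather than the whole session, for example by wrapping the insert in a savepoint (SQLAlchemy's `Session.begin_nested()`), or the insert could use PostgreSQL's `INSERT ... ON CONFLICT DO NOTHING` (available from 9.5) followed by a select. Which of these fits `parse_activity` best is an open question here.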