18F / rdbms-subsetter

Generates a subset of a relational database that respects foreign key constraints
Creative Commons Zero v1.0 Universal
313 stars 30 forks source link

Execution fails with missing foreign key error #27

Open brki opened 8 years ago

brki commented 8 years ago

Experienced on the master branch, on commit 6ece4d939d27b9b8504704b33b61a79a460a467b .

I was able to work around this by:

             if target.completeness_score() > 0.97:
                 break
             (source_row, prioritized) = target.source.next_row()
+            target.fks = self.tables[(target.schema, target.name)].fks
+            target.child_fks = self.tables[(target.schema, target.name)].child_fks
             self.create_row_in(source_row, target_db, target,
                                prioritized=prioritized)

The missing foreign key error I saw was:

Traceback (most recent call last):
  File "/path/to/code/rdbms-subsetter/subsetter.py", line 451, in <module>
    generate()
  File "/path/to/code/rdbms-subsetter/subsetter.py", line 447, in generate
    source.create_subset_in(target)
  File "/path/to/code/rdbms-subsetter/subsetter.py", line 358, in create_subset_in
    target_db.flush()
  File "/path/to/code/rdbms-subsetter/subsetter.py", line 311, in flush
    self.conn.execute(table.insert(), list(table.pending.values()))
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute
    return meth(self, multiparams, params)
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1166, in _execute_context
    self._root._commit_impl(autocommit=True)
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 690, in _commit_impl
    self._handle_dbapi_exception(e, None, None, None, None)
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
    exc_info
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 688, in _commit_impl
    self.engine.dialect.do_commit(self.connection)
  File "/path/to/.virtualenvs/rdbms-subsetter/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 423, in do_commit
    dbapi_connection.commit()
sqlalchemy.exc.IntegrityError: (psycopg2.IntegrityError) insert or update on table "foo" violates foreign key constraint "foo_constaint_name"
DETAIL:  Key (bar_id)=(544e83f4-2a8a-4103-9d27-7663baadb60f) is not present in table "bar".
brki commented 8 years ago

I've noticed that I do sometimes still have an error, even with my described workaround. e.g when trying to re-enable the fk constraints, I see:

ERROR:  insert or update on table "foo" violates foreign key constraint "foo_constraint_name"
DETAIL:  Key (bar_id)=(77a2d8a0-23ab-4661-bd85-dc194e56d4e5) is not present in table "bar".