18F / rdbms-subsetter

Generates a subset of a relational database that respects foreign key constraints
Creative Commons Zero v1.0 Universal
313 stars 30 forks source link

Support non-schema referential constraints #8

Closed cite-reader closed 9 years ago

cite-reader commented 9 years ago

There are applications, mainly those built using MySQL before InnoDB became the default storage engine, which maintain referential integrity almost entirely in application code. While this is obviously an unfortunate circumstance, the reality of legacy support is that it is often more difficult to (obtain authorization to) add the relevant foreign key constraints to the database than it is to discover what they should be.

For those of us in such a position of legacy support, it would be useful to be able to specify such constraints to rdbms-subsetter so we can use it without critially confusing our applications under test.

cite-reader commented 9 years ago

As far as implementation goes, my initial browsing of the code makes me think that merging a list of correctly-structured objects to the tbl.fks object obtained at subsetter.py:180 would do the trick. This comes out of a call to sqlalchemy.engine.reflection.Inspector.get_foreign_keys; according to the docs in question we are looking at a list of dicts morally equivalent to the following type (using ML record type syntax, since it has a local minimum of boilerplate):

{
    constrained_columns: string list;
    referred_schema: string;
    referred_table: string;
    referred_columns: string list;
    name: string option
}

It seems difficult to devise a pleasing syntax for specifying such objects as command line parameters, but the sheer number of them required for the motivating use case I have in mind means I'd have wanted to read them from a file anyway, so I don't consider that a tremendous loss. (If I'm wrong there, I'd love to be corrected, of course.)

For specifying fake constraints in a file… JSON is an inoffensive structured format, and Python has a parser in the standard library so it wouldn't incur an additional dependency.

apelade commented 9 years ago

It looks like Jailer has that: http://dbeauty.sourceforge.net/

catherinedevlin commented 9 years ago

Thanks, @alexander-hill ! @jmcarp implemented this and submitted it (commit 13b87); I've merged it to master, and it will be in release 0.2.3.