rwnx / pynonymizer

A universal tool for translating sensitive production database dumps into anonymized copies.
https://pypi.org/project/pynonymizer/
MIT License
102 stars 38 forks source link

SQL command returned non-zero exit status 1 with uppercase column names #71

Closed gmp042 closed 3 years ago

gmp042 commented 3 years ago

Hi Jerome,

I'm having an issue while using Pynonymizer with uppercase table names. It seems the SQL command decodes the table name as lowercase, returning an error.

subprocess.CalledProcessError: Command '['psql', '--host', 'localhost', '--port', '5432', '--username', 'postgres', '--dbname', 'strategy3_e30f5bf9e4b54379896fc2d037930bec', '--command', 'UPDATE APP AS "updatetarget" SET "APP_NAME" = ( SELECT company FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_ID" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_SECRET" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1);']' returned non-zero exit status 1.

strategy3_e30f5bf9e4b54379896fc2d037930bec=# UPDATE APP AS "updatetarget" SET "APP_NAME" = ( SELECT company FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_ID" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_SECRET" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1);
ERROR:  relation "app" does not exist
LINE 1: UPDATE APP AS "updatetarget" SET "APP_NAME" = ( SELECT compa...

Would you take a look?

gmp042 commented 3 years ago

I'm using version 1.20.0

rwnx commented 3 years ago

Hi, I think we need a little more information to diagnose the issue - what is the table actually called? and can you give me an example what you think that query should be ?

I'm a little confused by the output since in the query, it looks like the table name is already in uppercase.

gmp042 commented 3 years ago

Hi Jerome,

I did some research and found this answer in StackOverflow:

Identifiers (including column names) that are not double-quoted are folded to lower case in PostgreSQL. Column names that were created with double-quotes and thereby retained upper-case letters (and/or other syntax violations) have to be double-quoted for the rest of their life

So I tried the query as it is and the relation was folded to lowercase.

strategy3_75721d1a2a054d33bebd27eb5e2ee9ae=# UPDATE APP AS "updatetarget" SET "APP_NAME" = ( SELECT company FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_ID" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_SECRET" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1);']
ERROR:  relation "app" does not exist
LINE 1: UPDATE APP AS "updatetarget" SET "APP_NAME" = ( SELEC...
strategy3_75721d1a2a054d33bebd27eb5e2ee9ae'#

After double-quoting table's name, the query ran successfully.

strategy3_75721d1a2a054d33bebd27eb5e2ee9ae'# UPDATE "APP" AS "updatetarget" SET "APP_NAME" = ( SELECT company FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_ID" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1),"CLIENT_SECRET" = ( SELECT md5 FROM _pynonymizer_seed_fake_data ORDER BY RANDOM(), MD5("updatetarget"::text) LIMIT 1);']
strategy3_75721d1a2a054d33bebd27eb5e2ee9ae'#

Is it possible to always double-quote table's name as has been done for columns?

rwnx commented 3 years ago

Hi, thanks for looking into that! Yeah, this seems like a good fix. I can pull together a PR later this week, or i'm happy to look at PRs if you're down 😇

gmp042 commented 3 years ago

Hi, Jerome. Thanks for your time!