rheinwerk-verlag / pganonymize

A commandline tool for anonymizing PostgreSQL databases
http://pganonymize.readthedocs.io/

Fix table and column name quotes in cursor.copy_from call #22

Closed nurikk closed 3 years ago

nurikk commented 3 years ago

Hi, it seems this commit introduced a regression: table and column names are now quoted, like 'tbl_name', which causes an error when calling cursor.copy_from. This PR fixes it.

hkage commented 3 years ago

Hi :-)

Do you have an example of when this traceback occurs? Is it when you add column or table quotes directly to your YAML scheme file?

Unfortunately, with this change we get the error mentioned in https://github.com/rheinwerk-verlag/postgresql-anonymizer/issues/16 again, and uppercase column names are not recognized during the cursor.copy_from call. E.g.:

tables:
  - customers:
      fields:
        - first_name:
            provider:
              name: fake.first_name
        - last_name:
            provider:
              name: fake.last_name
        - TITLE:
            provider:
              name: fake.prefix

Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/henning/Projekte/postgresql-anonymizer/pganonymizer/__main__.py", line 18, in <module>
    main()
  File "/home/henning/Projekte/postgresql-anonymizer/pganonymizer/__main__.py", line 10, in main
    main()
  File "/home/henning/Projekte/postgresql-anonymizer/pganonymizer/cli.py", line 71, in main
    anonymize_tables(connection, schema.get('tables', []), verbose=args.verbose)
  File "/home/henning/Projekte/postgresql-anonymizer/pganonymizer/utils.py", line 43, in anonymize_tables
    import_data(connection, column_dict, table_name, table_columns, primary_key, data)
  File "/home/henning/Projekte/postgresql-anonymizer/pganonymizer/utils.py", line 151, in import_data
    copy_from(connection, data, temp_table, table_columns)
  File "/home/henning/Projekte/postgresql-anonymizer/pganonymizer/utils.py", line 129, in copy_from
    cursor.copy_from(new_data, table, sep=COPY_DB_DELIMITER, null='\\N', columns=columns)
psycopg2.errors.UndefinedColumn: column "title" of relation "tmp_customers" does not exist

Maybe there is another way to preserve the case without using quotes?
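
For context: PostgreSQL folds unquoted identifiers to lowercase, which is why the uppercase `TITLE` field is looked up as `title` in the traceback above. The following is only a minimal sketch of one way to keep the case, not the change made in this PR. The helpers `quote_ident` and `build_copy_statement` are hypothetical names introduced for illustration. They build a COPY statement with double-quoted identifiers, which could then be passed to psycopg2's `cursor.copy_expert` instead of `cursor.copy_from`.

```python
def quote_ident(name):
    """Double-quote a PostgreSQL identifier so its case is preserved.

    Embedded double quotes are doubled, as PostgreSQL expects.
    (Hypothetical helper, not part of pganonymize.)
    """
    return '"{}"'.format(name.replace('"', '""'))


def build_copy_statement(table, columns):
    """Build a COPY ... FROM STDIN statement with quoted identifiers.

    The default text format (tab delimiter, \\N for NULL) is assumed here;
    a real caller would add its own delimiter and null options.
    (Hypothetical helper, not part of pganonymize.)
    """
    quoted_columns = ', '.join(quote_ident(column) for column in columns)
    return 'COPY {} ({}) FROM STDIN'.format(quote_ident(table), quoted_columns)


if __name__ == '__main__':
    statement = build_copy_statement(
        'tmp_customers', ['first_name', 'last_name', 'TITLE']
    )
    print(statement)
    # With an open psycopg2 cursor and a file-like object `new_data`:
    # cursor.copy_expert(statement, new_data)
```

Quoting every identifier keeps the schema file free of manual quotes while still matching mixed-case or uppercase column names exactly.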

nurikk commented 3 years ago

Fixed, can you help to verify? @hkage

hkage commented 3 years ago

Looks good to me with a test setup of uppercase columns! Does it also still fix your original issue with the copy_from call?

nurikk commented 3 years ago

Yes, it fixes all my issues.

Mind if I ask you to publish a new release after merging this PR?

By the way, thanks for your work, it really saved me a lot of time.

hkage commented 3 years ago

Great! I am glad I could help with this project. Although I know I still have a lot of work left to complete the tests ;-)

Sure, I will publish a new release with both of your pull requests.

hkage commented 3 years ago

Version 0.5.0 has been uploaded to PyPI: https://pypi.org/project/pganonymize/0.5.0/