rheinwerk-verlag / pganonymize

A commandline tool for anonymizing PostgreSQL databases
http://pganonymize.readthedocs.io/

TypeError is raised during exclude when a column value is "None" #14

Closed abhinavvaidya90 closed 3 years ago

abhinavvaidya90 commented 3 years ago

While rows are checked against the excludes, if a column holds a None value then row[column] is None and pattern.match(row[column]) raises a TypeError:

https://github.com/rheinwerk-verlag/postgresql-anonymizer/blob/5f6d7b3e1a9f4ae22e843eb1c6d57314a1939936/pganonymizer/utils.py#L101

Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/pganonymize", line 11, in <module>
    sys.exit(main())
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pganonymizer/__main__.py", line 10, in main
    main()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pganonymizer/cli.py", line 71, in main
    anonymize_tables(connection, schema.get('tables', []), verbose=args.verbose)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pganonymizer/utils.py", line 38, in anonymize_tables
    data, table_columns = build_data(connection, table_name, columns, excludes, total_count, verbose)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pganonymizer/utils.py", line 68, in build_data
    if not row_matches_excludes(row, excludes):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pganonymizer/utils.py", line 101, in row_matches_excludes
    if pattern.match(row[column]):
TypeError: expected string or bytes-like object

Ref: myschema.yml

tables:
 - res_partner:
    fields:
     - name:
        provider:
          name: fake.name
     - email:
        provider:
          name: fake.email
    excludes: 
     - email:
        - "info.*@example.com"
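
The failing call in the traceback can be reproduced without a database. The following is a minimal sketch, assuming rows behave like dictionaries keyed by column name and that the excludes section above parses into a list of single-key mappings. The sample rows are invented for illustration and are not taken from the reporter's table.

```python
import re

# Assumed shape of the parsed `excludes` section from myschema.yml.
excludes = [{"email": ["info.*@example.com"]}]

# Hypothetical sample rows; the second one has a NULL (None) email.
rows = [
    {"name": "Info Desk", "email": "info.desk@example.com"},
    {"name": "No Email", "email": None},
]

pattern = re.compile(excludes[0]["email"][0])

results = []
for row in rows:
    try:
        # Same kind of call as `pattern.match(row[column])` in the traceback.
        results.append(bool(pattern.match(row["email"])))
    except TypeError as exc:
        # match() only accepts str or bytes, so a None value fails here.
        results.append(type(exc).__name__)

print(results)  # [True, 'TypeError']
```

The first row matches the exclude pattern as intended; the second never reaches the comparison because the None value is rejected before any matching happens.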

Expected Behaviour :

hkage commented 3 years ago

I cannot reproduce the traceback using a virtualenv with Python 3.8 on Ubuntu and a sample database I used for developing the pganonymizer.

Is email a column that is not present in the res_partner table, or does the regex pattern simply not match any data within the table?

abhinavvaidya90 commented 3 years ago

@hkage For demonstration purposes, I created a database with one table containing three rows.

Table: [image]

This is my .yml: [image]

Result of running the script: [image]

hkage commented 3 years ago

The fix will be included in the new bugfix release 0.3.2.
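
For readers on an older version, one way to guard against None values is sketched below. This is not necessarily the change shipped in 0.3.2: the function name `row_matches_excludes_safe` is hypothetical, and the sketch assumes rows are dictionaries and excludes are a list of mappings from column name to a list of regex patterns, as in the schema from the report.

```python
import re


def row_matches_excludes_safe(row, excludes=None):
    """Return True if any exclude pattern matches the row.

    Hypothetical helper: columns whose value is None (or that are
    missing from the row) are skipped instead of being passed to re.
    """
    for definition in excludes or []:
        for column, patterns in definition.items():
            value = row.get(column)
            if value is None:
                # Nothing to match against; a NULL never triggers an exclude.
                continue
            for pattern in patterns:
                if re.match(pattern, value):
                    return True
    return False


excludes = [{"email": ["info.*@example.com"]}]
print(row_matches_excludes_safe({"email": None}, excludes))
print(row_matches_excludes_safe({"email": "info@example.com"}, excludes))
```

Treating None as "no match" means such rows are not excluded and are still anonymized, which seems the least surprising behaviour for an exclude list, though a project could reasonably choose otherwise.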