rheinwerk-verlag / pganonymize

A commandline tool for anonymizing PostgreSQL databases
http://pganonymize.readthedocs.io/
Other
42 stars 26 forks source link

Enhance Integer and Float Handling in Provider Set #61

Open bizzappdev opened 5 months ago

bizzappdev commented 5 months ago

Improvements for integers and floats and how they are managed within the provider set.

When the field's data type is Integer or Numeric, the set provider raises an error because the values are not properly typecasted.

This PR provides proper typecast for integer and float/numeric datatype fields.

hkage commented 5 months ago

Thanks for your contribution and this PR. I will take a look into your changes asap.

hkage commented 5 months ago

Can you give me an example when the error occurs (e.g. the table structure and YAML schema?) Does the error only occurs when you pass integer or float values with quotes within the YAML schema? Just to be able to reproduce the behavior.

In my testings I was able to set integer values, as long as I didn't quote them but couldn't set any numeric or float fields, no matter what branch I used (development or your pull request).

     Column      |          Type          | Collation | Nullable | Default 
-----------------+------------------------+-----------+----------+---------
 id              | bigint                 |           |          | 
 first_name      | character varying(250) |           | not null | 
 last_name       | character varying(250) |           | not null | 
 data            | jsonb                  |           |          | 
 metadata        | json                   |           |          | 
 position        | integer                |           |          | 
 price           | numeric                |           |          | 
 price_exclusive | double precision       |           |          | 
tables:
 - auth_user:
    primary_key: id
    chunk_size: 5000
    fields:
     - first_name:
        provider:
          name: fake.first_name
     - last_name:
        provider:
          name: set
          value: "Bar"
     - position:
         provider:
           name: set
           value: 1000
     - price:
         provider:
           name: set
           value: 90.99
     - price_exclusive:
         provider:
           name: set
           value: 200.99

Before the anonymization:

anonymization_test=# select first_name, last_name, position, price, price_exclusive from auth_user;
 first_name | last_name | position | price | price_exclusive 
------------+-----------+----------+-------+-----------------
 Daniel     | Bar       |      500 |       |                

After calling the anonymization:

 first_name | last_name | position | price | price_exclusive 
------------+-----------+----------+-------+-----------------
 Jennifer   | Bar       |     1000 |       |                
hkage commented 5 months ago

Can you give me just a small example how to force the error (so I can test and review this pull request)? Especially for the numeric types. As mentioned above I could only reproduce an error for integer types if I use double quotes.

bizzappdev commented 5 months ago

I will prepare in 1 or 2 days and will share the details with you.

fblackburn1 commented 3 months ago

Hi @bizzappdev from this comment, @hkage is waiting for this PR before preparing a new release. Do you think you will have time to address feedback soon? No pressure, I just want to know if it worth to request a release without this PR :)

hkage commented 3 months ago

Hi @bizzappdev from this comment, @hkage is waiting for this PR before preparing a new release. Do you think you will have time to address feedback soon? No pressure, I just want to know if it worth to request a release without this PR :)

I could prepare a new release, even if this PR is still open. But as long as I can't reproduce the bug, I cannot merge it.

My suggestion would be to create a new minor version and release this bugfix (as soon as I can test it) as a bugfix release afterwards.

fblackburn1 commented 3 months ago

My suggestion would be to create a new minor version and release this bugfix (as soon as I can test it) as a bugfix release afterwards.

Yes it will be nice, thank you :smiley: