dataanon / data-anon

Data Anonymization implementation in Kotiln
https://dataanon.github.io/data-anon/
36 stars 9 forks source link

Unable to anonymise on Postgres with capitalised column names #6

Open konradmars opened 5 years ago

konradmars commented 5 years ago

Unable to anonymise on Postgres with capitalised column names

Reproduction:

  1. Having Postgres database with capitalised column e.g.: image
  2. Try to anonymise the table Observed:
    
    Sep 05, 2019 7:16:34 PM com.github.dataanon.jdbc.TableReader <init>
    INFO: READ SQL: SELECT "DESCRIPTION","FAILURE_REASON","ID" FROM "AO_88263F_HEALTH_CHECK_STATUS"
    Sep 05, 2019 7:16:34 PM org.jline.utils.Log logr
    WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
    Sep 05, 2019 7:16:34 PM com.github.dataanon.jdbc.TableWriter <init>
    INFO: WRITE SQL: UPDATE "AO_88263F_HEALTH_CHECK_STATUS" SET  "DESCRIPTION" = ? ,  "FAILURE_REASON" = ?  WHERE  "ID" = ? 
    Sep 05, 2019 7:16:35 PM com.github.dataanon.dsl.Strategy executeOnTable
    SEVERE: Error processing table '"AO_88263F_HEALTH_CHECK_STATUS"': org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet.
    reactor.core.Exceptions$BubblingException: org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet.
    at reactor.core.Exceptions.bubble(Exceptions.java:154)
    at reactor.core.publisher.Operators.onErrorDropped(Operators.java:263)
    at reactor.core.publisher.BaseSubscriber.onError(BaseSubscriber.java:182)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onError(FluxMapFuseable.java:128)
    at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:235)
    at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:202)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:156)
    at reactor.core.publisher.BaseSubscriber.request(BaseSubscriber.java:212)
    at com.github.dataanon.jdbc.TableWriter.hookOnSubscribe(TableWriter.kt:36)
    at reactor.core.publisher.BaseSubscriber.onSubscribe(BaseSubscriber.java:146)
    at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:90)
    at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:140)
    at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:64)
    at reactor.core.publisher.FluxMapFuseable.subscribe(FluxMapFuseable.java:63)
    at com.github.dataanon.dsl.Strategy.executeOnTable(Strategy.kt:40)
    at com.github.dataanon.dsl.Strategy.access$executeOnTable(Strategy.kt:16)
    at com.github.dataanon.dsl.Strategy$execute$1.accept(Strategy.kt:29)
    at com.github.dataanon.dsl.Strategy$execute$1.accept(Strategy.kt:16)
    at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:130)
    at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.runAsync(FluxPublishOn.java:396)
    at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.run(FluxPublishOn.java:480)
    at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84)
    at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet.
    at org.postgresql.jdbc.PgResultSet.findColumn(PgResultSet.java:2548)
    at org.postgresql.jdbc.PgResultSet.getObject(PgResultSet.java:2539)
    at com.github.dataanon.jdbc.TableReader.columnValue(TableReader.kt:37)
    at com.github.dataanon.jdbc.TableReader.toField(TableReader.kt:34)
    at com.github.dataanon.jdbc.TableReader.next(TableReader.kt:31)
    at com.github.dataanon.jdbc.TableReader.next(TableReader.kt:9)
    at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:231)
    ... 24 more
Expected: All works well

The anonymise function:

fun BlacklistTable.anonymizeHealthCheckStatus() { anonymize("\"DESCRIPTION\"").using(RandomFormattedString()) anonymize("\"FAILURE_REASON\"").using(RandomFormattedString()) }



Note the column names need to be in quotes. Without it Postgres auto-decapitalise column name (lowercase column will not be found).
sunitparekh commented 5 years ago

Give me few days, I will merge your pull request and add tests and publish new jar soon.

On Thu, Sep 5, 2019 at 2:51 PM Konrad Marszałek notifications@github.com wrote:

Unable to anonymise on Postgres with capitalised column names

Reproduction:

  1. Having Postgres database with capitalised column e.g.: [image: image] https://user-images.githubusercontent.com/8700391/64328709-4c061800-d011-11e9-90fd-e52c592eb8d6.png

    1. Try to anonymise the table Observed:

Sep 05, 2019 7:16:34 PM com.github.dataanon.jdbc.TableReader INFO: READ SQL: SELECT "DESCRIPTION","FAILURE_REASON","ID" FROM "AO_88263F_HEALTH_CHECK_STATUS" Sep 05, 2019 7:16:34 PM org.jline.utils.Log logr WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information) Sep 05, 2019 7:16:34 PM com.github.dataanon.jdbc.TableWriter INFO: WRITE SQL: UPDATE "AO_88263F_HEALTH_CHECK_STATUS" SET "DESCRIPTION" = ? , "FAILURE_REASON" = ? WHERE "ID" = ? Sep 05, 2019 7:16:35 PM com.github.dataanon.dsl.Strategy executeOnTable SEVERE: Error processing table '"AO_88263F_HEALTH_CHECK_STATUS"': org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet. reactor.core.Exceptions$BubblingException: org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet. at reactor.core.Exceptions.bubble(Exceptions.java:154) at reactor.core.publisher.Operators.onErrorDropped(Operators.java:263) at reactor.core.publisher.BaseSubscriber.onError(BaseSubscriber.java:182) at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onError(FluxMapFuseable.java:128) at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:235) at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:202) at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:156) at reactor.core.publisher.BaseSubscriber.request(BaseSubscriber.java:212) at com.github.dataanon.jdbc.TableWriter.hookOnSubscribe(TableWriter.kt:36) at reactor.core.publisher.BaseSubscriber.onSubscribe(BaseSubscriber.java:146) at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:90) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:140) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:64) at reactor.core.publisher.FluxMapFuseable.subscribe(FluxMapFuseable.java:63) at com.github.dataanon.dsl.Strategy.executeOnTable(Strategy.kt:40) at com.github.dataanon.dsl.Strategy.access$executeOnTable(Strategy.kt:16) at com.github.dataanon.dsl.Strategy$execute$1.accept(Strategy.kt:29) at com.github.dataanon.dsl.Strategy$execute$1.accept(Strategy.kt:16) at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:130) at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.runAsync(FluxPublishOn.java:396) at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.run(FluxPublishOn.java:480) at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84) at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet. at org.postgresql.jdbc.PgResultSet.findColumn(PgResultSet.java:2548) at org.postgresql.jdbc.PgResultSet.getObject(PgResultSet.java:2539) at com.github.dataanon.jdbc.TableReader.columnValue(TableReader.kt:37) at com.github.dataanon.jdbc.TableReader.toField(TableReader.kt:34) at com.github.dataanon.jdbc.TableReader.next(TableReader.kt:31) at com.github.dataanon.jdbc.TableReader.next(TableReader.kt:9) at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:231) ... 24 more

Expected: All works well

The anonymise function:

fun BlacklistTable.anonymizeHealthCheckStatus() { anonymize("\"DESCRIPTION\"").using(RandomFormattedString()) anonymize("\"FAILURE_REASON\"").using(RandomFormattedString()) }

Note the column names need to be in quotes. Without it Postgres auto-decapitalise column name (lowercase column will not be found).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/dataanon/data-anon/issues/6?email_source=notifications&email_token=AABLNOTH4A4HUB553LVSZ4LQIDFQXA5CNFSM4IT3PIW2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HJPGQPA, or mute the thread https://github.com/notifications/unsubscribe-auth/AABLNOR775LIPFYXVT4KYYTQIDFQXANCNFSM4IT3PIWQ .

-- thanks & regards, Sunit Parekh +91 99237 00662 parekh.sunit@gmail.com

konradmars commented 5 years ago

Thanks Sunit, do you think it is possible to publish jar this week?

With kind regards, Konrad Marszałek

On Fri, 6 Sep 2019 at 07:11, Sunit Parekh notifications@github.com wrote:

Give me few days, I will merge your pull request and add tests and publish new jar soon.

On Thu, Sep 5, 2019 at 2:51 PM Konrad Marszałek notifications@github.com wrote:

Unable to anonymise on Postgres with capitalised column names

Reproduction:

  1. Having Postgres database with capitalised column e.g.: [image: image] < https://user-images.githubusercontent.com/8700391/64328709-4c061800-d011-11e9-90fd-e52c592eb8d6.png

  2. Try to anonymise the table Observed:

Sep 05, 2019 7:16:34 PM com.github.dataanon.jdbc.TableReader INFO: READ SQL: SELECT "DESCRIPTION","FAILURE_REASON","ID" FROM "AO_88263F_HEALTH_CHECK_STATUS" Sep 05, 2019 7:16:34 PM org.jline.utils.Log logr WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information) Sep 05, 2019 7:16:34 PM com.github.dataanon.jdbc.TableWriter INFO: WRITE SQL: UPDATE "AO_88263F_HEALTH_CHECK_STATUS" SET "DESCRIPTION" = ? , "FAILURE_REASON" = ? WHERE "ID" = ? Sep 05, 2019 7:16:35 PM com.github.dataanon.dsl.Strategy executeOnTable SEVERE: Error processing table '"AO_88263F_HEALTH_CHECK_STATUS"': org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet. reactor.core.Exceptions$BubblingException: org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet. at reactor.core.Exceptions.bubble(Exceptions.java:154) at reactor.core.publisher.Operators.onErrorDropped(Operators.java:263) at reactor.core.publisher.BaseSubscriber.onError(BaseSubscriber.java:182) at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onError(FluxMapFuseable.java:128) at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:235) at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:202) at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:156) at reactor.core.publisher.BaseSubscriber.request(BaseSubscriber.java:212) at com.github.dataanon.jdbc.TableWriter.hookOnSubscribe(TableWriter.kt:36) at reactor.core.publisher.BaseSubscriber.onSubscribe(BaseSubscriber.java:146) at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:90) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:140) at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:64) at reactor.core.publisher.FluxMapFuseable.subscribe(FluxMapFuseable.java:63) at com.github.dataanon.dsl.Strategy.executeOnTable(Strategy.kt:40) at com.github.dataanon.dsl.Strategy.access$executeOnTable(Strategy.kt:16) at com.github.dataanon.dsl.Strategy$execute$1.accept(Strategy.kt:29) at com.github.dataanon.dsl.Strategy$execute$1.accept(Strategy.kt:16) at reactor.core.publisher.LambdaSubscriber.onNext(LambdaSubscriber.java:130) at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.runAsync(FluxPublishOn.java:396) at reactor.core.publisher.FluxPublishOn$PublishOnSubscriber.run(FluxPublishOn.java:480) at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84) at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.postgresql.util.PSQLException: The column name "DESCRIPTION" was not found in this ResultSet. at org.postgresql.jdbc.PgResultSet.findColumn(PgResultSet.java:2548) at org.postgresql.jdbc.PgResultSet.getObject(PgResultSet.java:2539) at com.github.dataanon.jdbc.TableReader.columnValue(TableReader.kt:37) at com.github.dataanon.jdbc.TableReader.toField(TableReader.kt:34) at com.github.dataanon.jdbc.TableReader.next(TableReader.kt:31) at com.github.dataanon.jdbc.TableReader.next(TableReader.kt:9) at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:231) ... 24 more

Expected: All works well

The anonymise function:

fun BlacklistTable.anonymizeHealthCheckStatus() { anonymize("\"DESCRIPTION\"").using(RandomFormattedString()) anonymize("\"FAILURE_REASON\"").using(RandomFormattedString()) }

Note the column names need to be in quotes. Without it Postgres auto-decapitalise column name (lowercase column will not be found).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/dataanon/data-anon/issues/6?email_source=notifications&email_token=AABLNOTH4A4HUB553LVSZ4LQIDFQXA5CNFSM4IT3PIW2YY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HJPGQPA , or mute the thread < https://github.com/notifications/unsubscribe-auth/AABLNOR775LIPFYXVT4KYYTQIDFQXANCNFSM4IT3PIWQ

.

-- thanks & regards, Sunit Parekh +91 99237 00662 parekh.sunit@gmail.com

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/dataanon/data-anon/issues/6?email_source=notifications&email_token=ACCMDZYQURS5Y4HP5S4R5FLQIHQ6NA5CNFSM4IT3PIW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6BXYMA#issuecomment-528710704, or mute the thread https://github.com/notifications/unsubscribe-auth/ACCMDZZ2MR2PDTFZFEE4HYTQIHQ6NANCNFSM4IT3PIWQ .