button-inc / emissions-elt-demo

Research-spike: Google DLP vs Postgres Anonymizer extension #136

Open joshgamache opened 1 year ago

joshgamache commented 1 year ago

Reason for spike: Postgres Anonymizer extension cannot be installed directly on a Google Cloud SQL instance.

Questions to answer:

Postgres Anonymizer

Google DLP

Notes

joshgamache commented 1 year ago

Research conclusions

Google DLP (Data Loss Prevention) is the way to go at this stage. It can be integrated with DAGs to analyze and anonymize data within the Data Clean Room workflows. Post-anonymization analysis can also be done with k-anonymity reports.
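
For anyone unfamiliar with what a k-anonymity report measures, here is a minimal sketch in plain Python. It does not call the Google DLP API; the `k_anonymity` helper, the sample records, and the column names are all hypothetical and only illustrate the idea. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    quasi-identifier values (0 for an empty dataset)."""
    if not rows:
        return 0
    # Count how many rows fall into each combination of quasi-identifier values.
    classes = Counter(
        tuple(row[col] for col in quasi_identifiers) for row in rows
    )
    return min(classes.values())


# Hypothetical, already-generalized records (illustrative only).
records = [
    {"region": "BC", "sector": "cement", "emissions_band": "high"},
    {"region": "BC", "sector": "cement", "emissions_band": "low"},
    {"region": "AB", "sector": "oil", "emissions_band": "high"},
    {"region": "AB", "sector": "oil", "emissions_band": "high"},
    {"region": "AB", "sector": "oil", "emissions_band": "low"},
]

k = k_anonymity(records, ["region", "sector"])
print(k)  # 2: the smallest group (BC, cement) has two rows
```

A low k means some rows are close to uniquely identifiable by their quasi-identifiers, which is the kind of signal a post-anonymization report would be checked for.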

Local testing was done to confirm that the API handles anonymization and analysis as we expected. Reports can be output to a queryable location, which could then be read through a Foreign Data Wrapper in Postgres to surface the data we want in the front end.

Answers

Pg Anonymizer

Google DLP

(Added to Notion dev notes)