Open bbrewington opened 1 year ago
Conversation in Slack:
- Define list of columns containing PII, and types (Phone Number, Name, Address)
- Write code to do the masking (Python or SQL?)
- Perform masking and refresh data
Might be a good idea to do this as SQL views, landing the masked data in a BigQuery dataset we would give people access to. Then could maybe set up Google Sheets that query those views (so whenever changes are made, it's propagated instantly)
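A rough sketch of the SQL-view idea, written as a small Python helper that generates the view DDL. Everything specific here is an assumption for illustration: the column names in `PII_COLUMNS`, the table and view names, and the choice to replace whole PII values with a fixed tag rather than partially mask them. The real column list would come from the first bullet above.

```python
from typing import Dict, List

# Hypothetical PII inventory: column name -> PII type tag.
# These names are placeholders, not the project's actual schema.
PII_COLUMNS: Dict[str, str] = {
    "Name": "NAME",
    "Phone": "PHONE",
    "MailingStreet": "ADDRESS",
}


def build_masked_view_sql(
    source_table: str,
    view_name: str,
    columns: List[str],
    pii_columns: Dict[str, str],
) -> str:
    """Return a CREATE OR REPLACE VIEW statement that hides PII columns."""
    select_items = []
    for col in columns:
        if col in pii_columns:
            # Non-null PII values become a fixed tag such as '[PHONE]';
            # NULLs stay NULL so missing data is still visible as missing.
            tag = f"[{pii_columns[col]}]"
            select_items.append(f"  IF({col} IS NULL, NULL, '{tag}') AS {col}")
        else:
            # Non-PII columns pass through unchanged.
            select_items.append(f"  {col}")
    select_list = ",\n".join(select_items)
    return (
        f"CREATE OR REPLACE VIEW `{view_name}` AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM `{source_table}`"
    )


if __name__ == "__main__":
    # Hypothetical project/dataset/table names.
    print(
        build_masked_view_sql(
            source_table="my-project.raw.contact",
            view_name="my-project.masked.contact",
            columns=["Id", "Name", "Phone", "MailingStreet", "CreatedDate"],
            pii_columns=PII_COLUMNS,
        )
    )
```

Because the masking lives in a view, anything that queries it (including a connected Google Sheet) would only ever see the masked values, and access can be granted on the masked dataset alone.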
Uploaded the masking script mask-data.py. It could be used to carry out the masking requested in this issue, but we need to know the exact formats of the Name, Phone Number, and Address columns in order to construct regexes that match them. Because I only have the PII-stripped version of the data, I can't get this information from the distributed data file.
@Itguru14 here are some phone number patterns I'm seeing in Salesforce Opportunity.Description (and there can be multiple occurrences in a single cell):
555.867.5309
555-867-5309
(555) 867-5309
5 5, 58675309 <-- this one may have been speech to text or something
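A minimal sketch of a regex that would cover the first three patterns above. It is not taken from mask-data.py; the function name `mask_phones` and the `[PHONE]` replacement tag are made up for illustration. The fourth, speech-to-text style pattern is deliberately not handled, since a rule loose enough to catch it would likely also match unrelated digit runs.

```python
import re

# Matches 555.867.5309, 555-867-5309, and (555) 867-5309.
# The lookarounds keep the pattern from firing inside longer digit runs.
PHONE_RE = re.compile(
    r"(?<!\d)"                          # not preceded by a digit
    r"(?:\(\d{3}\)\s*|\d{3}[.\-\s])"    # area code: (555) or 555 + separator
    r"\d{3}[.\-\s]"                     # exchange + separator
    r"\d{4}"                            # line number
    r"(?!\d)"                           # not followed by a digit
)


def mask_phones(text: str, mask: str = "[PHONE]") -> str:
    """Replace every matching phone number in text with the mask tag."""
    return PHONE_RE.sub(mask, text)


if __name__ == "__main__":
    sample = "Call 555.867.5309 or (555) 867-5309, alt 555-867-5309"
    print(mask_phones(sample))
```

`re.sub` replaces every match, so cells with multiple phone numbers are handled in one pass. If the same rule is moved into BigQuery's `REGEXP_REPLACE`, the lookarounds would need to be dropped, since RE2 does not support them.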
Per conversation w/ Joey, I'm adding myself as owner and he's going to work on this collaborating with Adeseye (I might be handing off to Adeseye fully)
I can reach out to Joey to add him as a collaborator too, or you can send me his email address. I didn't get a chance to work on the regex because I've been busy setting up Tableau and Tableau Prep. I will work on it today.