cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

Remove sensitive Littlepay data from Pipeline #3333

Open evansiroky opened 2 months ago

evansiroky commented 2 months ago

User story / feature request

In order to comply with Caltrans security parameters, we should remove all sensitive Littlepay data from our data pipeline.

  1. We need to remove the data we have already collected, both from raw data storage and from any BigQuery tables that ingested it.
  2. We need to modify the DAG task that ingests Littlepay data so that it does not store any sensitive data in our cloud system.

Acceptance Criteria

The sensitive data that needs to be removed includes:

  1. customer_id? This one is a bit of a gray area, since it is a hash and not directly personally identifiable.
  2. masked_pan
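
As a rough illustration of the ingest-side change, here is a minimal sketch of dropping the two fields listed above before anything is written to cloud storage. It assumes the Littlepay files reach the task as delimited text with a header row. The names `SENSITIVE_FIELDS`, `strip_sensitive_fields`, and `scrub_csv` are hypothetical, not existing pipeline code, and the other column names in the usage example are made up.

```python
import csv
import io

# Assumption: the fields named in this issue. customer_id is included here
# even though the issue flags it as a gray area.
SENSITIVE_FIELDS = frozenset({"customer_id", "masked_pan"})


def strip_sensitive_fields(record, sensitive=SENSITIVE_FIELDS):
    """Return a copy of a single record (dict) without the sensitive keys."""
    return {key: value for key, value in record.items() if key not in sensitive}


def scrub_csv(text, sensitive=SENSITIVE_FIELDS):
    """Drop sensitive columns from header-bearing CSV text and return the result."""
    reader = csv.DictReader(io.StringIO(text))
    # Keep the original column order, minus the sensitive columns.
    kept = [name for name in (reader.fieldnames or []) if name not in sensitive]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" silently skips the sensitive keys in each row.
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample; participant_id and amount are illustrative columns.
    sample = "participant_id,customer_id,masked_pan,amount\nabc,hash1,4111,2.50\n"
    print(scrub_csv(sample))
```

Scrubbing before the first write would keep the sensitive values out of raw storage entirely, so nothing downstream in BigQuery would need a second pass. Whether customer_id stays in the set depends on how the gray-area question above is resolved.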

Notes

This issue can be separated into 2 phases:

  1. Scope the needed changes and share them with Caltrans IT Security for review.
  2. Implement the changes.

ohrite commented 1 week ago

From weekly meeting 7/2 https://docs.google.com/document/d/1HEY67UoDRIyJgiYYa4TFZCohfrxF-kAAO373RznfOw4/edit?usp=sharing