cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

Remove sensitive Elavon data from Pipeline #3334

Closed evansiroky closed 1 month ago

evansiroky commented 7 months ago

User story / feature request

In order to comply with Caltrans security parameters, we should remove all sensitive Elavon data from our data pipeline.

  1. We need to remove the data we have already collected, both from the raw data and from any BigQuery tables that ingested it.
  2. We need to modify the DAG task that ingests Elavon data so that it does not store any sensitive data in our cloud system.

Acceptance Criteria

The sensitive data that needs to be removed includes:

  1. account_number
  2. routing_number
  3. customer_name - appears to be the name of the Tap-to-ride program and not individual customer names.
  4. card_no
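
As a rough illustration of the ingest-side change, here is a minimal sketch that drops the four fields listed above before a record is written anywhere. It assumes each Elavon record is available as a plain Python dict at some point in the task; the actual DAG task may be structured quite differently. The function names and the sample record are hypothetical, and only the four field names come from this issue.

```python
# Field names come from the acceptance criteria in this issue.
# Everything else (function names, sample record) is hypothetical.
SENSITIVE_FIELDS = frozenset(
    {"account_number", "routing_number", "customer_name", "card_no"}
)


def strip_sensitive_fields(record, sensitive=SENSITIVE_FIELDS):
    """Return a copy of `record` without any of the sensitive keys."""
    return {key: value for key, value in record.items() if key not in sensitive}


def strip_records(records, sensitive=SENSITIVE_FIELDS):
    """Lazily strip sensitive keys from every record in an iterable."""
    for record in records:
        yield strip_sensitive_fields(record, sensitive)


# Hypothetical sample record, for illustration only.
sample = {
    "card_no": "masked-value",
    "routing_number": "masked-value",
    "amount": "2.50",
}
cleaned = strip_sensitive_fields(sample)
```

Passing the field set as a parameter keeps the list easy to adjust if the review with Caltrans IT Security changes which fields count as sensitive.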

Notes

This issue can be separated into 2 phases:

  1. Scope the changes needed and share them with Caltrans IT Security for review.
  2. Implement the changes.

akosmatzon commented 4 months ago

Hi @evansiroky & @charlie-costanzo, I've just seen this issue and wanted to let you know that we currently use card_no to trace chargeback transactions back to the original payment. There is no other way of figuring out whether a chargeback occurred, and it is quite important for the transit agencies (at least we know it is for CCJPA) not to pay a refund for a transaction that has already been paid through Visa/Mastercard (a.k.a. a chargeback).

We definitely use customer_name in almost all Elavon-related reports; it is the merchant's name. We don't use account_number, which is the last four digits of the masked PAN (the masked card number). We don't use routing_number.

Is it possible to keep card_no in the dataset? It is masked, so it is not sensitive data, and we see this kind of data in other transit agencies' data warehouses and reports too.
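
To make the dependency on card_no concrete, here is a rough sketch of the kind of matching described above, not the actual report logic. It assumes transactions arrive in chronological order as Python dicts. The `id`, `kind`, and `amount_cents` fields are hypothetical names invented for this example; only card_no comes from the Elavon data discussed here.

```python
from collections import defaultdict


def match_chargebacks(transactions):
    """Pair each chargeback with earlier payments on the same card_no.

    Assumes `transactions` is in chronological order. The `kind`,
    `amount_cents`, and `id` keys are hypothetical field names.
    """
    payments_by_card = defaultdict(list)
    matches = []
    for txn in transactions:
        if txn["kind"] == "payment":
            payments_by_card[txn["card_no"]].append(txn)
        elif txn["kind"] == "chargeback":
            # Candidate originals: same masked card number, same amount.
            candidates = [
                payment
                for payment in payments_by_card.get(txn["card_no"], [])
                if payment["amount_cents"] == txn["amount_cents"]
            ]
            matches.append((txn, candidates))
    return matches


# Hypothetical sample data, for illustration only.
sample = [
    {"id": 1, "kind": "payment", "card_no": "card-a", "amount_cents": 250},
    {"id": 2, "kind": "payment", "card_no": "card-b", "amount_cents": 250},
    {"id": 3, "kind": "chargeback", "card_no": "card-a", "amount_cents": 250},
]
matched = match_chargebacks(sample)
```

Without card_no, the chargeback in this sample could not be told apart from either payment, since both have the same amount.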

evansiroky commented 4 months ago

@akosmatzon, thanks for bringing this to our attention. I will relay this information to Caltrans IT Security and see what we can do.

evansiroky commented 2 months ago

As of 9/10, we are awaiting confirmation from Caltrans Security that it is OK to keep this data, since there is a legitimate business need for it.

evansiroky commented 1 month ago

I'm assuming that the lack of response on this matter since July 28, 2024 means that this can be closed. I'll re-open if needed.