J-PAL / PII-Scan

R code to scan for obvious PII.
MIT License

Only matching lowercase file extension #15

Closed: umeditor closed this issue 7 years ago

umeditor commented 7 years ago

We currently only scan files that end with .dta, .sas7bdat or .csv. Should we ignore case and also search for .DTA, .SAS7BDAT or .CSV? If so, what about mixed case (.dtA, .sas7BDAT, .cSv, etc.)?

jamesturitto commented 7 years ago

I don't know. Can files be stored as mixed case?

umeditor commented 7 years ago

Yes - they're just file names. I'd say let's do all uppercase and all lowercase. (I can imagine .CSV and .csv, but not really .cSv.)

joshjacobson commented 7 years ago

I'd think there's really no reason to be so specific... I'd transform the extension to lowercase, and then do one check.
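
A minimal sketch of that suggestion in R, assuming file paths arrive as a character vector. The names `scannable_extensions` and `has_scannable_extension` are hypothetical and are not taken from the PII-Scan code; only `tools::file_ext()` and `tolower()` are standard R functions.

```r
# Hypothetical helper, not part of the PII-Scan code base.
# Extensions the scanner is said to handle, stored in lowercase only.
scannable_extensions <- c("dta", "sas7bdat", "csv")

has_scannable_extension <- function(paths) {
  # tools::file_ext() returns the text after the last dot ("" if there is none).
  # Lowercasing it means .CSV, .csv and .cSv are all covered by one check.
  ext <- tolower(tools::file_ext(paths))
  ext %in% scannable_extensions
}

has_scannable_extension(c("survey.dta", "SURVEY.DTA", "data.cSv", "notes.txt"))
```

If the file list comes from `list.files()`, its `ignore.case = TRUE` argument combined with a pattern such as `"\\.(dta|sas7bdat|csv)$"` would likely achieve the same effect without a separate helper.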

umeditor commented 7 years ago

Let's not deal with this until it's a real issue. We've got enough work on our plates for now.