callumrollo / erddaplogs

Quick utilities for parsing nginx and apache logs for ERDDAP requests
MIT License

tests for anonymized data #50

Closed callumrollo closed 1 month ago

callumrollo commented 1 month ago

@ChrisJohnNOAA added some tests for the anonymized data. Anything else come to mind to check?

ChrisJohnNOAA commented 1 month ago

> @ChrisJohnNOAA added some tests for the anonymized data. Anything else come to mind to check?

We should probably check that the columns for referer and some of the finer location columns (lat, lon, org, zip, city) are not there.

Should we have a test for the location table (parser.location)?
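
The column check suggested here could be sketched roughly as below. This is only an illustration: the helper name `leaked_columns`, the stand-in column list, and the exact spelling of the column names are assumptions taken from this thread, not from the erddaplogs source. In a real test the list would come from something like `list(df.columns)` on the anonymized dataframe.

```python
# Column names are copied from the comment above; the real dataframe
# may spell them differently (assumption).
FORBIDDEN_COLUMNS = {"referer", "lat", "lon", "org", "zip", "city"}


def leaked_columns(columns, forbidden=FORBIDDEN_COLUMNS):
    """Return the sorted forbidden names that appear in `columns`."""
    return sorted(set(columns) & set(forbidden))


def test_anonymized_columns_absent():
    # Hypothetical stand-in for `list(anonymized_df.columns)`.
    columns = ["datetime", "url", "country"]
    assert leaked_columns(columns) == []
```

Returning the offending names rather than a bare boolean makes a failing assertion show which columns slipped through.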

callumrollo commented 1 month ago

The location dataframe is pretty sparse atm. Not sure what we should check for? [image]

ChrisJohnNOAA commented 1 month ago

> The location dataframe is pretty sparse atm. Not sure what we should check for? [image]

Any check would be to make sure we don't have PII in there. Potentially the check is that the only existing columns are country code, regionName, city, and len (which is the count).
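
An allowlist check along these lines might look like the sketch below. The helper name `unexpected_columns` is hypothetical, and `"countryCode"` is a guess at how the "country code" column is actually spelled. The column list would come from something like `list(parser.location.columns)`.

```python
# Allowlist taken from the comment above; "countryCode" is an assumed
# spelling of the "country code" column.
ALLOWED_LOCATION_COLUMNS = {"countryCode", "regionName", "city", "len"}


def unexpected_columns(columns, allowed=ALLOWED_LOCATION_COLUMNS):
    """Return the sorted column names that are not on the allowlist."""
    return sorted(set(columns) - set(allowed))


def test_location_has_only_allowed_columns():
    # Hypothetical stand-in for `list(parser.location.columns)`.
    columns = ["countryCode", "regionName", "city", "len"]
    assert unexpected_columns(columns) == []
```

An allowlist fails the test whenever a new column is added to the location table, which forces a deliberate decision about whether that column could carry PII.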