USEPA / harmonize-wq

Standardize, clean, and wrangle Water Quality Portal data into more analytic-ready formats
https://usepa.github.io/harmonize-wq/
MIT License

Generalize wet_dry functions in clean module #52

Open jbousquin opened 7 months ago

jbousquin commented 7 months ago

clean.wet_dry_checks() uses a combination of column values to identify questionable field values. It was built to catch a suspicious dry sediment measure that has water as its 'ActivityMediaName'. Generalized to take a dict of columns and criteria, it could be used to build a mask and then update the matching rows to a given value.
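A rough sketch of what the generalized version might look like, assuming pandas. The name `mask_and_update`, its signature, and the `ResultWeightBasisText` column used in the example are placeholders for illustration, not the current harmonize-wq API:

```python
import pandas as pd


def mask_and_update(df, criteria, column, value):
    """Set `column` to `value` on rows that match every entry in `criteria`.

    Hypothetical helper (not part of harmonize_wq).
    criteria : dict mapping column name -> accepted value or list of values.
    """
    # Start with every row selected, then narrow by each criterion
    mask = pd.Series(True, index=df.index)
    for col, accepted in criteria.items():
        if not isinstance(accepted, (list, tuple, set)):
            accepted = [accepted]
        mask &= df[col].isin(list(accepted))
    df_out = df.copy()  # leave the caller's DataFrame untouched
    df_out.loc[mask, column] = value
    return df_out


# Example: dry-weight results recorded with 'Water' as the media
df = pd.DataFrame({
    "ActivityMediaName": ["Water", "Water", "Sediment"],
    "ResultWeightBasisText": ["Dry", None, "Dry"],  # assumed column name
})
fixed = mask_and_update(
    df,
    {"ActivityMediaName": "Water", "ResultWeightBasisText": "Dry"},
    column="ActivityMediaName",
    value="Sediment",
)
```

Passing the criteria as a dict keeps the wet/dry case as just one configuration of a more general check.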

wet_dry_drop() drops rows based on a filter. It could also be generalized, or updated to add a qa_flag for later removal instead of dropping rows, which is the approach everything else has transitioned to.
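For the flagging option, something along these lines might work, again assuming pandas. `add_flag`, the `qa_flag` column name, the flag text, and the `ResultWeightBasisText` column are all assumptions for the sake of the example:

```python
import pandas as pd


def add_flag(df, mask, flag, flag_col="qa_flag"):
    """Append `flag` to `flag_col` on masked rows instead of dropping them.

    Hypothetical helper (not part of harmonize_wq).
    """
    df_out = df.copy()
    if flag_col not in df_out.columns:
        df_out[flag_col] = None
    existing = df_out.loc[mask, flag_col]
    # Keep any earlier flag and append the new one after a separator
    df_out.loc[mask, flag_col] = [
        flag if pd.isna(old) else f"{old}; {flag}" for old in existing
    ]
    return df_out


df = pd.DataFrame({
    "ActivityMediaName": ["Water", "Water", "Sediment"],
    "ResultWeightBasisText": ["Dry", None, "Dry"],  # assumed column name
})
mask = (df["ActivityMediaName"] == "Water") & (df["ResultWeightBasisText"] == "Dry")
flagged = add_flag(df, mask, "wet_dry: dry weight basis with Water media")

# Later removal becomes an explicit, separate step
kept = flagged[flagged["qa_flag"].isna()]
```

No rows are lost at the check itself, so the decision to remove them can be reviewed or reversed downstream.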