bigscience-workshop / data_tooling

Tools for managing datasets for governance and training.
Apache License 2.0
74 stars 48 forks

Updated Anonymization #406

Closed ianyu93 closed 2 years ago

ianyu93 commented 2 years ago

The apply_regex_anonymization function now accepts an iterable for its tag_type argument. This allows regex rules to be applied selectively, for example only those under ['ID', 'AGE']. By default, if no argument is passed, all keys under regex_rulebase are used.
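
To illustrate the behavior described above, here is a minimal sketch. It is not the code from this PR: the rulebase contents, the placeholder format, and the function signature beyond tag_type are assumptions made for the example.

```python
import re
from typing import Dict, Iterable, List, Optional

# Hypothetical rulebase for illustration only: tag type -> regex patterns.
# The real regex_rulebase in the repository may be structured differently.
regex_rulebase: Dict[str, List[str]] = {
    "ID": [r"\b\d{3}-\d{2}-\d{4}\b"],
    "AGE": [r"\b\d{1,3} years old\b"],
    "EMAIL": [r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"],
}


def apply_regex_anonymization(
    text: str,
    tag_type: Optional[Iterable[str]] = None,
    rulebase: Dict[str, List[str]] = regex_rulebase,
) -> str:
    """Replace matches of the selected rules with a <TAG> placeholder."""
    # No tag_type given: fall back to every key in the rulebase.
    if tag_type is None:
        tag_type = rulebase.keys()
    for tag in tag_type:
        for pattern in rulebase[tag]:
            text = re.sub(pattern, f"<{tag}>", text)
    return text


sample = "ID 123-45-6789, 42 years old, mail a@b.com"
only_id_and_age = apply_regex_anonymization(sample, tag_type=["ID", "AGE"])
all_rules = apply_regex_anonymization(sample)
```

In this sketch, tag_type must be an iterable of tag names such as a list, tuple, or set. A bare string like 'ID' would be iterated character by character and raise a KeyError.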