ihmeuw / pseudopeople

pseudopeople is a Python package that generates realistic simulated data about a fictional United States population, designed for use in testing entity resolution (record linkage) methods or other data science algorithms at scale.
https://pseudopeople.readthedocs.io
BSD 3-Clause "New" or "Revised" License

row noising without test updates #459

Closed hussain-jafari closed 1 week ago

hussain-jafari commented 1 week ago

Description

Use NoiseConfiguration instead of LayeredConfigTree for row noising, and read configuration values through get_value instead of direct key references. Add a method for checking whether a config includes a particular row noise type. Update the get_noise_level function to accept the noise type string as its first argument.
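
For reviewers, here is a minimal sketch of the pattern described above. It is not the code in this PR. SketchNoiseConfiguration, has_row_noise_type, the argument order after noise_type, and the "census", "row_noise", "omit_row", and "row_probability" keys are all illustrative assumptions; the real NoiseConfiguration and get_noise_level signatures may differ.

```python
from typing import Any, Dict


class SketchNoiseConfiguration:
    """Hypothetical stand-in for a NoiseConfiguration-style wrapper.

    It hides the nested config structure behind accessor methods so that
    callers no longer index into the tree with raw keys.
    """

    def __init__(self, config: Dict[str, Any]) -> None:
        self._config = config

    def has_row_noise_type(self, dataset: str, noise_type: str) -> bool:
        # True only if the dataset defines this row noise type.
        return noise_type in self._config.get(dataset, {}).get("row_noise", {})

    def get_value(self, dataset: str, noise_type: str, parameter: str) -> Any:
        # Single access point that replaces chained key references.
        if not self.has_row_noise_type(dataset, noise_type):
            raise ValueError(f"{dataset!r} has no row noise type {noise_type!r}")
        return self._config[dataset]["row_noise"][noise_type][parameter]


def get_noise_level(
    noise_type: str, config: SketchNoiseConfiguration, dataset: str
) -> float:
    # The noise type string comes first, mirroring the signature change
    # described above; the remaining arguments are assumptions.
    return float(config.get_value(dataset, noise_type, "row_probability"))


# Assumed example configuration, for illustration only.
example_config = SketchNoiseConfiguration(
    {"census": {"row_noise": {"omit_row": {"row_probability": 0.01}}}}
)
omit_row_level = get_noise_level("omit_row", example_config, "census")
```

Routing every lookup through get_value means a change to the underlying config layout only has to be handled in one place, which is presumably the motivation for dropping direct key references.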

Note that builds are currently failing because the test files have not yet been updated for these changes; this will be resolved in upcoming PRs.

Testing

Tests will pass once the upcoming PRs update the test files.