Closed davidshiu3 closed 2 years ago
Hi, the seed data set has 150 rows by default.
You can change the amount of data by using the --seed-rows option.
--seed-rows SEED_ROWS
    Specify a number of rows to populate the fake data table used during anonymization. [$PYNONYMIZER_SEED_ROWS]
I have some outstanding work to improve the documentation on this one - I think it's easily missed.
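For anyone scripting this, here is a minimal Python sketch of the two ways to pass the value: the --seed-rows flag and the PYNONYMIZER_SEED_ROWS environment variable from the help text above. The helper names (seed_rows_args, seed_rows_env, run_pynonymizer) are made up for illustration, and usual_args stands in for whatever arguments you already pass to pynonymizer; nothing here is part of pynonymizer itself.

```python
import os
import subprocess


def seed_rows_args(seed_rows: int) -> list[str]:
    """Return the CLI arguments that set the seed table size (hypothetical helper)."""
    if seed_rows < 1:
        raise ValueError("seed_rows must be a positive integer")
    return ["--seed-rows", str(seed_rows)]


def seed_rows_env(seed_rows: int, base_env=None) -> dict[str, str]:
    """Return a copy of the environment with PYNONYMIZER_SEED_ROWS set (hypothetical helper)."""
    if seed_rows < 1:
        raise ValueError("seed_rows must be a positive integer")
    env = dict(os.environ if base_env is None else base_env)
    env["PYNONYMIZER_SEED_ROWS"] = str(seed_rows)
    return env


def run_pynonymizer(usual_args: list[str], seed_rows: int) -> None:
    """Run pynonymizer with your existing arguments plus --seed-rows.

    Assumes the pynonymizer executable is on PATH; usual_args is a
    placeholder for the arguments you already use.
    """
    subprocess.run(
        ["pynonymizer", *usual_args, *seed_rows_args(seed_rows)],
        check=True,
    )
```

Either route should have the same effect; the environment variable is handy when the command line is fixed, for example in a CI job.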
Closing this issue because of inactivity. Please reply or open another issue if there's more to say 😇
Is your feature request related to a problem? Please describe.
Currently we have a table with 100k rows, and the seed data set only has 1000 entries (or however many it is; something much less than 100k). This causes the same fake data to be repeated many times.
Describe the solution you'd like
A setting to choose how much fake data we get of each type, as well as more seed data.
Describe alternatives you've considered
Additional context
We use Pynonymizer to anonymize our production data. The small seed data set causes many repeated names throughout the scrubbed data.
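To put rough numbers on the repetition described above, here is a small Python sketch. It assumes each anonymized row picks its fake value uniformly at random, with replacement, from the seed table; that is an assumption made for illustration, not a statement about how pynonymizer actually selects values. The function names are hypothetical.

```python
def average_reuse(seed_rows: int, table_rows: int) -> float:
    """Average number of table rows that share each seed value."""
    if seed_rows < 1:
        raise ValueError("seed_rows must be a positive integer")
    return table_rows / seed_rows


def expected_distinct(seed_rows: int, table_rows: int) -> float:
    """Expected number of distinct seed values used across the table.

    Assumes one uniform random draw (with replacement) per table row.
    A given seed value is missed by every draw with probability
    (1 - 1/seed_rows) ** table_rows.
    """
    if seed_rows < 1:
        raise ValueError("seed_rows must be a positive integer")
    p_unused = (1.0 - 1.0 / seed_rows) ** table_rows
    return seed_rows * (1.0 - p_unused)


if __name__ == "__main__":
    for seed in (150, 1_000, 100_000):
        print(
            seed,
            round(average_reuse(seed, 100_000), 1),
            round(expected_distinct(seed, 100_000)),
        )
```

Under that assumption, the default of 150 seed rows means each fake name is reused about 667 times in a 100k-row table. Even with 100k seed rows, only around 63% of the rows would get a distinct seed value, so a larger seed set reduces repetition but does not guarantee uniqueness.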