Closed budlight closed 4 years ago
@budlight the utilities to produce fake data will not guarantee uniqueness, you can achieve that using a custom _cycle_over_sample_range
variation, which is a private/internal API thus not documented:
from anon.utils import _cycle_over_sample_range
# sample of 1_000_000 unique phone numbers (without formatting)
random_phone_number = _cycle_over_sample_range(1000000000, 9999999999, 1000000)
# example
print(next(random_phone_numbers)) # 1761296935
print(next(random_phone_numbers)) # 6273149983
print(next(random_phone_numbers)) # 6225408363
...
that is actually better than what I went with initially
mobile = partial(faker.numerify, text='+1##########')
are you really so concerned about performance that you think that should not be the default for phone number?
Even if we change it, we won’t be able to address uniqueness
yeah that is just a problem with the bulk update in general, the only way I could have solved this 100% correctly is with the ability to set all the values to None before using _cycle_over_sample_range(1000000000, 9999999999, 1000000)
probably the best solution is to clear all the unique values beforehand. That's all I could even see as a real improvement here is documentation suggesting to clear unique values if they accept None.
https://github.com/Tesorio/django-anon/blob/master/anon/utils.py#L217
I had issues with this using anon.fake_phone_number() producing duplicate phone numbers (unique=True)