Tesorio / django-anon

:shipit: Anonymize production data so it can be safely used in not-so-safe environments
https://django-anon.readthedocs.io/en/latest/
MIT License

Fix -- fake_email default max_size is too fragile #59

Closed caioariede closed 3 years ago

caioariede commented 3 years ago

Description

Calling anon.fake_email() often results in an error:

>>> anon.fake_email()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/caio/Projects/Tesorio/django-anon/anon/utils.py", line 353, in fake_email
    return fake_username(max_size, separator=".") + suffix
  File "/Users/caio/Projects/Tesorio/django-anon/anon/utils.py", line 328, in fake_username
    return fake_text(max_size, separator=separator) + random_number
  File "/Users/caio/Projects/Tesorio/django-anon/anon/utils.py", line 274, in fake_text
    text = text[: text.rindex(separator)]
ValueError: substring not found

This happens because of two problems:

  1. The fake_text function does not work well with short strings (max_size < 14 chars). 14 chars is the length of the longest word in the wordlist (see below), and a smaller limit may cause fake_text to raise the exception above.

https://github.com/Tesorio/django-anon/blob/7a02db68f22a8770d08f41692aba1c06b42560fb/anon/utils.py#L6-L10

  2. The current defaults for fake_email are fake_email(max_size=25, suffix="@example.com"). The suffix takes len("@example.com") = 12 chars, which leaves only 13 chars to generate the first part of the address, as shown below:

| First part | Suffix | Total |
| --- | --- | --- |
| aaaabbbbccccd | @example.com | |
| 13 chars | 12 chars | 25 chars |

The first part (aaaabbbbccccd) is generated by fake_text, which, as explained above, does not handle short strings well.
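
The failure mode can be reproduced in isolation. The sketch below is not the library's code: `truncate_at_separator` is a hypothetical stand-in. It assumes that fake_text joins words with a separator and then cuts the result back to the last separator that fits, which is what the `text[: text.rindex(separator)]` line in the traceback suggests.

```python
def truncate_at_separator(text, max_size, separator="."):
    """Hypothetical stand-in for the truncation step seen in the traceback."""
    if len(text) <= max_size:
        return text
    # Cut to the size limit, then drop the trailing partial word.
    text = text[:max_size]
    # str.rindex raises ValueError when the separator is absent.
    return text[: text.rindex(separator)]


suffix = "@example.com"
local_budget = 25 - len(suffix)  # chars left for the first part with the defaults

# Short words: a separator survives the cut, so truncation works.
short_words = truncate_at_separator("lorem.ipsum.dolor", local_budget)

# A 14-char first word: the cut lands inside that word, no separator
# is left in the slice, and rindex raises ValueError.
try:
    truncate_at_separator("a" * 14 + ".ipsum", local_budget)
    failed = False
except ValueError:
    failed = True
```

Under these assumptions, whether the call fails depends only on which words are drawn first, which would explain why the error shows up on some calls and not others.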

Solution

This PR addresses both problems:

  1. Makes fake_text handle short strings
  2. Increases default max_size in fake_email to 40, which is len("@example.com") + (_max_word_size * 2)
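
As an illustration of both changes, here is a minimal sketch. It is not the code from this PR: `safe_truncate` and `default_email_max_size` are hypothetical names, and falling back to a hard cut when no separator fits is only one possible way to handle short strings.

```python
MAX_WORD_SIZE = 14  # length of the longest word in the wordlist, per the description


def safe_truncate(text, max_size, separator="."):
    """Hypothetical truncation that tolerates short limits."""
    if len(text) <= max_size:
        return text
    head = text[:max_size]
    cut = head.rfind(separator)  # rfind returns -1 instead of raising
    # Assumed fallback: hard cut when no usable separator is found.
    return head[:cut] if cut > 0 else head


def default_email_max_size(suffix="@example.com"):
    # Room for the suffix plus two words of maximum length.
    return len(suffix) + MAX_WORD_SIZE * 2
```

With the default suffix, `default_email_max_size()` evaluates to 12 + 28 = 40, matching the new default described above.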


github-actions[bot] commented 3 years ago

| File | Coverage |
| --- | --- |
| All files | 95% :white_check_mark: |
| anon/__init__.py | 63% :white_check_mark: |
| anon/base.py | 97% :white_check_mark: |
| anon/utils.py | 83% :white_check_mark: |
| tests/compat.py | 50% :white_check_mark: |
| tests/test_base.py | 99% :white_check_mark: |

Minimum allowed coverage is 50%

Generated by :monkey: cobertura-action against ecde51aef842b0a1f0b5ed2dcc9ea82cb281f76a