lk-geimfari / mimesis

Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
https://mimesis.name
MIT License
4.39k stars 330 forks

Issue with seed for username #340

Closed lk-geimfari closed 6 years ago

lk-geimfari commented 6 years ago

Even with a seed set, username() returns different data on each call:

>>> from mimesis import Personal
>>> personal = Personal('en', seed=0xf)

>>> personal.username()
'Bloodshedder_1805'
>>> personal.username()
'bloodshedder.1805'
>>> personal.username()
'bloodshedder1805'

Also, it is slow, and we should fix that.
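For reference, the behaviour one would expect from a seeded provider can be sketched with the standard library alone. This is a hypothetical stand-in, not mimesis code: SeededProvider and the NAMES tuple are made up for illustration. It assumes each provider draws from its own random.Random instance, so two providers created with the same seed yield the same sequence of usernames.

```python
import random

# Hypothetical stand-in data; mimesis ships its own list of usernames.
NAMES = ('bloodshedder', 'nightcrawler', 'stargazer')


class SeededProvider:
    """Illustrative provider that owns a private, seeded generator."""

    def __init__(self, seed=None):
        # A private Random instance keeps results independent of the
        # state of the global random module.
        self.random = random.Random(seed)

    def username(self):
        # Every random decision goes through self.random, including
        # the choice of separator.
        name = self.random.choice(NAMES)
        date = self.random.randint(1800, 2070)
        separator = self.random.choice(('', '.', '_', '-'))
        return '{}{}{}'.format(name, separator, date)


first = SeededProvider(seed=0xf)
second = SeededProvider(seed=0xf)

# Same seed and same call order: both sequences are identical.
assert ([first.username() for _ in range(3)]
        == [second.username() for _ in range(3)])
```

If any of the random choices went through the module-level random functions instead of self.random, the final assertion could fail, since those functions ignore the per-instance seed.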

lk-geimfari commented 6 years ago

@duckyou Your participation would be really helpful.

lk-geimfari commented 6 years ago

An early draft of an alternative implementation, which is faster:

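from random import choice, randint

# USERNAMES is assumed to be an in-scope sequence of lowercase names.


def username(template=None) -> str:
    name = choice(USERNAMES)
    date = randint(1800, 2070)

    # 'default' and 'l.d' both fall through to the final return.
    templates = ('U_d', 'U.d', 'U-d', 'ld', 'l-d', 'Ud',
                 'l.d', 'l_d', 'default')

    if template is None:
        template = choice(templates)

    if template not in templates:
        raise KeyError(template)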

    if template == 'Ud':
        return '{}{}'.format(name.capitalize(), date)
    elif template == 'U.d':
        return '{}.{}'.format(name.capitalize(), date)
    elif template == 'ld':
        return '{}{}'.format(name, date)
    elif template == 'U-d':
        return '{}-{}'.format(name.title(), date)
    elif template == 'U_d':
        return '{}_{}'.format(name.title(), date)
    elif template == 'l-d':
        return '{}-{}'.format(name, date)
    elif template == 'l_d':
        return '{}_{}'.format(name, date)

    return '{}.{}'.format(name, date)

Yeah, I know that it looks like a piece of shit, but it is more than twice as fast:

[0.06383157s] alternative_implementation(10000) -> None
[0.15332460s] current_implementation(10000) -> None
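The tooling behind the timings above is not shown in the thread. A comparable measurement could be taken with the standard timeit module, as in this minimal sketch; bench and sample_username are hypothetical names, and sample_username is only a placeholder workload to be swapped for the implementation under test.

```python
import timeit


def bench(func, calls=10000):
    """Return the seconds taken by `calls` invocations of func()."""
    return timeit.timeit(func, number=calls)


def sample_username():
    # Placeholder workload; replace with the function being measured.
    return '{}_{}'.format('bloodshedder'.title(), 1805)


elapsed = bench(sample_username)
print('[{:.8f}s] sample_username(10000)'.format(elapsed))
```

Absolute numbers depend on the machine, so only the ratio between two implementations measured in the same run is meaningful.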

duckyou commented 6 years ago

Good job, @lk-geimfari! We could also use lambdas 😉:

    def new_username(self, template: Optional[str] = None) -> str:
        name = self.random.choice(USERNAMES)
        date = str(self.random.randint(1800, 2070))

        templates = ('U_d', 'U.d', 'U-d', 'ld', 'l-d', 'Ud',
                     'l.d', 'l_d')

        if template is None:
            template = self.random.choice(templates)
        elif template not in templates:
            raise KeyError()

        templatez = {
            # UppercaseDate
            'Ud': lambda: name.capitalize() + date,
            # Uppercase.Date
            'U.d': lambda: name.capitalize() + '.' + date,
            # lowercaseDate
            'ld': lambda: name + date,
            # Uppercase-date
            'U-d': lambda: name.title() + '-' + date,
            # Uppercase_date
            'U_d': lambda: name.title() + '_' + date,
            # lowercase-date
            'l-d': lambda: name + '-' + date,
            # lowercase_date
            'l_d': lambda: name + '_' + date,
            # lowercase.date
            'l.d': lambda: name + '.' + date,
        }

        return templatez[template]()

[Image: screenshot_2017-12-29_18-45-41]

As a further micro-optimization, we could also move templatez out of the method and into self scope, so that the dict is not rebuilt on every call.
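A minimal sketch of that idea, under some assumptions: PersonalSketch and the USERNAMES tuple are stand-ins rather than the real mimesis class and data, and the lambdas take name and date as arguments because they can no longer close over the method's local variables.

```python
import random
from typing import Optional

# Stand-in data; the real list lives in mimesis.
USERNAMES = ('bloodshedder', 'nightcrawler', 'stargazer')


class PersonalSketch:
    # Built once when the class is created, not on every call.
    _USERNAME_TEMPLATES = {
        'Ud': lambda name, date: name.capitalize() + date,
        'U.d': lambda name, date: name.capitalize() + '.' + date,
        'ld': lambda name, date: name + date,
        'U-d': lambda name, date: name.title() + '-' + date,
        'U_d': lambda name, date: name.title() + '_' + date,
        'l-d': lambda name, date: name + '-' + date,
        'l_d': lambda name, date: name + '_' + date,
        'l.d': lambda name, date: name + '.' + date,
    }

    def __init__(self, seed=None):
        self.random = random.Random(seed)

    def username(self, template: Optional[str] = None) -> str:
        name = self.random.choice(USERNAMES)
        date = str(self.random.randint(1800, 2070))

        if template is None:
            template = self.random.choice(tuple(self._USERNAME_TEMPLATES))

        # An unknown template raises KeyError from the dict lookup.
        return self._USERNAME_TEMPLATES[template](name, date)
```

Calling PersonalSketch(seed=0xf).username('U_d') returns a capitalized name, an underscore, and a year between 1800 and 2070.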

duckyou commented 6 years ago

More benchmarks: [Image: screenshot_2017-12-29_19-02-40]

The username solution with self-scoped lambdas is not noticeably faster than the solution with ifs.

lk-geimfari commented 6 years ago

@duckyou So, let's add the faster solution.

lk-geimfari commented 6 years ago

@duckyou Can I hope for a PR from you?

lk-geimfari commented 6 years ago

You know, we will use the solution that I published above. At the moment, it is the faster one. Anyway, we can update it in the future if we implement a much faster solution.