lk-geimfari / mimesis

Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
https://mimesis.name
MIT License
4.39k stars 330 forks source link

Seed does not work for some providers #325

Closed lk-geimfari closed 6 years ago

lk-geimfari commented 6 years ago

Description: Seems like @duckyou have found a bug in our code and seed is really does not work sometime and we should fix it.

Reason: It's happening because we use custom_code which use another random object.

Decision: We should move utils which use random to helpers.Random

duckyou commented 6 years ago

Thanks for the quick response, @lk-geimfari :smile:

I can try moving custom_code to BaseProvider like you said in #324 discussion

lk-geimfari commented 6 years ago

@duckyou You know, it's better to move it to helpers.Random. There's just everything related to random selection.

lk-geimfari commented 6 years ago

@duckyou after that, we can just call self.random.custom_code. Looks good.

lk-geimfari commented 6 years ago

@duckyou If you don't want to do it, I can do it by myself, but if you want work on it, then, please, create new PR, because old is really hard to read and review :smile: . Thanks!

duckyou commented 6 years ago

@lk-geimfari Hmm... You goddamn right! That's may be simplest solution :relaxed:

I do this ;)

lk-geimfari commented 6 years ago

@duckyou Thank you!

duckyou commented 6 years ago

@lk-geimfari https://github.com/lk-geimfari/mimesis/blob/af9c877d9bde654dff2f8be524b81a35460f2307/mimesis/enums.py#L6-L12

Method get_random_item also affected :worried:

lk-geimfari commented 6 years ago

@duckyou We will not change it, because it's not really about data, but about arguments for function.

duckyou commented 6 years ago

@lk-geimfari Issue explain:

>>> from mimesis import Address
>>> from mimesis.enums import CountryCode
>>> a = Address(seed=42)
>>> a.country_iso_code(fmt=CountryCode.ISO3)
'NIC'
>>> a.country_iso_code(fmt=CountryCode.ISO3)
'BRN'
>>> a.country_iso_code(fmt=CountryCode.ISO3)
'ARM'
>>> a = Address(seed=42)
>>> a.country_iso_code(fmt=CountryCode.ISO3)
'NIC'
>>> a.country_iso_code(fmt=CountryCode.ISO3)
'BRN'
>>> a.country_iso_code(fmt=CountryCode.ISO3)
'ARM'
...

This worked because we set format directly ... but if we omit this parameter

...
>>> a = Address(seed=42)
>>> a.country_iso_code()
'558'
>>> a.country_iso_code()
'096'
>>> a.country_iso_code()
'ARM'
>>> a = Address(seed=42)
>>> a.country_iso_code()
'NIC'
>>> a.country_iso_code()
'096'
>>> a.country_iso_code()
'051'

its happen beause: https://github.com/lk-geimfari/mimesis/blob/af9c877d9bde654dff2f8be524b81a35460f2307/mimesis/providers/address.py#L147-L148

lk-geimfari commented 6 years ago

@duckyou Skip this for now. I'll think about it later. Our priority for a right now is custom_code.

lk-geimfari commented 6 years ago

Fixed. Now we should think about how should we test providers with seed.

duckyou commented 6 years ago

@lk-geimfari good job! 😸 Sorry for off-top, but maybe this project community have a place for discussions.. for example irc or slack?

lk-geimfari commented 6 years ago

@duckyou It's okay. Unfortunately, not yet. We have a small community, so usually, we discuss everything here, although it would be great, of course, to have a place where we could discuss everything.

We can create repository discussion in @mimesis-lab. I can add you to the team if you're interested in this project.

lk-geimfari commented 6 years ago

@duckyou I have invited you to @mimesis-lab. Accept the invitation if you're interested. Thanks!

duckyou commented 6 years ago

@lk-geimfari im in! ^^

duckyou commented 6 years ago

@lk-geimfari don't forget about https://github.com/lk-geimfari/mimesis/issues/325#issuecomment-352364359 ?

Heres two problems: AllowRandom.get_random_item and ValidateEnumMixin._validate_enum

Ok.. firstly how to get random enum with seed? Original realisation is: https://github.com/lk-geimfari/mimesis/blob/217e5fa7775389b92a166f643276b2533a5fb45e/mimesis/enums.py#L11-L12 aka we can just do this self.random.choice(list(CountryCode)) without mentioned method

and if we create this simple hack in AllowRandom metaclass:

    def __getitem__(self, index):
        if isinstance(index, int):
            return list(self._member_map_)[index]
        return super().__getitem__(index)

we can do now self.random.choice(CountryCode) :wink:

lk-geimfari commented 6 years ago

@duckyou Also we can add the method get_random_item() to helpers.Random.

def get_item(self, enum: Optional[Any] = None):
    return self.choice(list(self)) 

What do you think? In my opinion, it will be easier.

lk-geimfari commented 6 years ago

In this case, we should move _validate_enum() to BaseProvider

duckyou commented 6 years ago

@lk-geimfari hmmm self.random.get_enum_item(CountryCode) or self.random.choice(CountryCode) ... also enum attribute always should be passed to get_enum_item method but if we combine this with _validate_enum...

key = self.random.get_enum_item(fmt, CountryCode)

Something like this:

def get_enum_item(self, enum: Any, item: Any = None) -> Any:
    if item:
        raise NonEnumerableError(enum) if item not in enum
        return item
    return self.choice(list(enum))
lk-geimfari commented 6 years ago

@duckyou Stop stop. Seems like the solution which I suggested above will not work, it seems.

Do you suggest something like this?

class Enum(enum.Enum):
    def __getitem__(self, index):
        if isinstance(index, int):
            return list(self._member_map_)[index]
        return super().__getitem__(index)

class PortRange(Enum):
    ALL = (1, 65535)
    WELL_KNOWN = (1, 1023)
    EPHEMERAL = (49152, 65535)
    REGISTERED = (1024, 49151)
lk-geimfari commented 6 years ago

@duckyou The problem that we need to handle different cases. For example, sometimes we need the value of the enum, but sometimes we need item of enum object, i.e Enum.ITEM, not Enum.ITEM.value.

duckyou commented 6 years ago

Oh... here is:

class MimEnumMeta(EnumMeta):
    def __getitem__(self, index):
        if isinstance(index, int):
            return list(self)[index]
        return super().__getitem__(index)

class PortRange(Enum, metaclass=MimEnumMeta):
    ALL = (1, 65535)
    WELL_KNOWN = (1, 1023)
    EPHEMERAL = (49152, 65535)
    REGISTERED = (1024, 49151)

...

Now:

>>> from mimesis import enums
>>> import random
>>> pr = enums.PortRange
>>> pr
<enum 'PortRange'>
>>> pr['ALL']
<PortRange.ALL: (1, 65535)>
>>> pr.ALL
<PortRange.ALL: (1, 65535)>
>>> pr[0]
<PortRange.ALL: (1, 65535)>
>>> random.choice(pr)
<PortRange.REGISTERED: (1024, 49151)>
>>> random.choice(pr) in enums.PortRange
True
>>> random.choice(pr).value
(1, 1023)
>>> random.choice(pr).name
'EPHEMERAL'
lk-geimfari commented 6 years ago

@duckyou I'll wait for your PR. Keep in mind that we have builtins data providers and that sometimes we need an item of the enum, but not a value (personla.py -> surname, payment.py -> credit_card_number).

duckyou commented 6 years ago

@lk-geimfari Can i also move _validate_enum() to BaseProvider, like you say here https://github.com/lk-geimfari/mimesis/issues/325#issuecomment-353350713 ?

lk-geimfari commented 6 years ago

@duckyou If we do it, then tests will fail because ValidateEnumMixin used in mimesis.builtins. It's really hard to understand what is better. At this moment we can only experiment with solutions.

duckyou commented 6 years ago

@lk-geimfari I see here several solutions:

  1. Just pass random object to method
  2. Like you say here https://github.com/lk-geimfari/mimesis/issues/325#issuecomment-353350713
  3. Init random object in ValidateEnumMixin
  4. Just in provider method validate enum and raise exception
lk-geimfari commented 6 years ago

@duckyou Let's will weigh the pros and cons of all solutions.

lk-geimfari commented 6 years ago

@duckyou We need to take a decision that will allow us to make the minimum number of changes.

lk-geimfari commented 6 years ago

@duckyou I have found a new problem: Seeded full_name() returns different data because there are two different methods with similar seed, i.e name and surname.

And this is only the tip of the iceberg I think. Now I'm tired and can not figure out why the data is changing for fixed seed, sorry.

I have created PR where you can see changes: https://github.com/lk-geimfari/mimesis/pull/332

lk-geimfari commented 6 years ago

Seems like builtins providers also does not support of seed. We should fix it!