joke2k / faker

Faker is a Python package that generates fake data for you.
https://faker.readthedocs.io
MIT License
17.65k stars 1.92k forks source link

uuid4() - become unusuable for safe typing without casting or ignoring #2042

Open sshishov opened 5 months ago

sshishov commented 5 months ago

When using fake.uuid4(), the returning type is bytes | str | UUID which is impossible to safely use later on in the code. I guess one way to solve it to use literals instead of cast_to callable, as there are limited ways to cast it to. Only 3 types, therefore better to use Literal as cast_to and provide @typing.overload to properly infer return type from the input param. Also by default it should return str

Steps to reproduce

Nypy will produce error as for type checker the value can be either bytes or str or UUID which cannot be joined together. But in runtime here will no be any errors

value1 = fake.uuid4()
value2 = fake.uuid4()

joined = ','.join([value1, value2])  # <-- errors will be here

Expected behavior

No typing errors should be observed

Actual behavior

Using fake.uuid4() without typing.cast or type: ignore appears impossible after adding the types.

stefan6419846 commented 5 months ago

Feel free to provide a corresponding PR which improves this.

sshishov commented 5 months ago

@stefan6419846 it will require breaking changes, is it okay? I definitely can provide PR for this one

fcurella commented 4 months ago

Can't this be fixed by using some kind of TypeVar in the type hint of uuid4?

steve-mavens commented 4 months ago

Based on the type in the docs:

def uuid4(cast_to: ~typing.Callable[[~uuid.UUID], str] | ~typing.Callable[[~uuid.UUID], bytes] | None = <class 'str'>) → bytes | str | UUID

Like fcurella says, I think it should be possible to overload it like:

T = TypeVar('T', str, bytes)

@overload
def uuid4() -> str: ...
@overload
def uuid4(cast_to: Callable[[UUID], T]) -> T: ...
@overload
def uuid4(cast_to: None) -> UUID: ...

Failing that, get rid of T by explicitly listing both possibilities:

@overload
def uuid4() -> str: ...
@overload
def uuid4(cast_to: Callable[[UUID], str]) -> str: ...
@overload
def uuid4(cast_to: Callable[[UUID], bytes]) -> bytes: ...
@overload
def uuid4(cast_to: None) -> UUID: ...

I don't think that should be a breaking change.

Or potentially you could make the function more general, allowing cast_to to convert to any type at all instead of just str or bytes:

T = TypeVar('T')

@overload
def uuid4() -> str: ...
@overload
def uuid4(cast_to: Callable[[UUID], T]) -> T: ...
@overload
def uuid4(cast_to: None) -> UUID: ...
github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.

sshishov commented 1 month ago

Let me guys try to create MR for this and we can see how far forward we can go with this...