joke2k / faker

Faker is a Python package that generates fake data for you.
https://faker.readthedocs.io
MIT License
17.65k stars 1.92k forks source link

Type annotations on `md5`, `sha1`, and `sha256` should be overloaded #2046

Closed samueljsb closed 2 weeks ago

samueljsb commented 4 months ago

The docstrings for these functions state:

If raw_output is False (default), a hexadecimal string representation of the MD5 hash will be returned. If True, a bytes object representation will be returned instead.

But the type annotations do not reflect this:

$ cat t.py
import faker

fake = faker.Faker()

x = fake.md5(raw_output=False)
reveal_type(x)

y = fake.md5(raw_output=True)
reveal_type(y)

$ mypy t.py
$ mypy t.py
t.py:6: note: Revealed type is "Union[builtins.bytes, builtins.str]"
t.py:9: note: Revealed type is "Union[builtins.bytes, builtins.str]"
Success: no issues found in 1 source file

The revealed types ought to be builtins.str and builtins.bytes respectively.

I believe this can be achieved by overloading the type annotations on those methods (using typing.overload.

I would be happy to open a PR to make those changes, if that would be welcome?

fcurella commented 4 months ago

Thank you so much, that would be wonderful!

samueljsb commented 4 months ago

I've tried to do this but I'm having trouble with the stub files. They don't seem to match the type annotations I put in the actual Python modules (even after running the stub generation script) and manually editing them gets undone when I run the linters.

EDIT: the stub generation doesn't appear to be running in CI, so that isn't a blocker, but it does mean we run the risk of this regressing.

steve-mavens commented 4 months ago

If I'm still in time to catch this train - the same applies to uuid4(), which has a return type of str | bytes | UUID, and the actual type returned is similarly controlled by a function parameter. https://github.com/joke2k/faker/issues/2042

fcurella commented 4 months ago

the generate_stubs.py script will need to be updated to detect overloads, but typing.get_overloads() is only available in Python 3.11, and we need to support at least 3.8.

steve-mavens commented 4 months ago

It's in typing_extensions if that's any help for pre-3.11. The trouble is it that pre-3.11 it only detects overloads defined using typing_extensions.overload, not typing.overload.

fcurella commented 4 months ago

I dont see any issues with falling back to typing_extension on <3.11. Do you want to try to update generate_stubs.py to use get_overloads()?

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 14 days since being marked as stale.