JohannesBuchner / imagehash

A Python Perceptual Image Hashing Module
BSD 2-Clause "Simplified" License
3.19k stars 329 forks source link

Please add type hints or type stubs #151

Closed Avasam closed 2 years ago

Avasam commented 3 years ago

Here's an autogenerated type stub. I've already filled in phash since that's the one thing I use: imagehash/__init__.pyi

JohannesBuchner commented 3 years ago

Can this be automatically generated, e.g., by setup.py?

JohannesBuchner commented 3 years ago

Also, please explain why this is needed.

Avasam commented 3 years ago

Type hints in Python are great for type safety. Python being a dynamically typed language, it's really easy to loose track of your variables' types, which leads to many runtime errors. With type hints and good linters/IDE tools (such as Pylance/Pyright, MyPy, PyLint, pycodestyle, etc.) these can not only be caught while coding, but also be caught and blocked at a CI (Continuous Integration) level.

There are two main ways to add type hints:

  1. Directly in code, but this breaks compatibility with earlier versions of python
  2. Using Type stubs, .pyi files. Which are basically type definitions (for anyone familiar with TypeScript, a statically typed superset of Javascript, this is akin to the .d.ts type definition files)

https://docs.python.org/3/library/typing.html https://www.pythonstacks.com/blog/post/type-hints-python/ https://stackoverflow.com/a/32558710

The file I provided was generated by Pylance, but most everything is typed as Unknown and needs to be manually filled in. there's a limit to what any tool can guess you meant for any variable to be. Just a quick look around, PyType (by Google) uses type inference and can generate .pyi files, could be a good start, but I have never used that specific tool myself:

JohannesBuchner commented 3 years ago

Do the docstrings need to be in the stub? If something is left out in .pyi, what are the consequences?

From what I see from the file, it could be possible to generate it with a small look up table (variable name -> type and function name -> return type).

Avasam commented 3 years ago

I don't think it does need to be there, no image

SpangleLabs commented 3 years ago

I would also like to +1 the niceness of type hints! But I avoided making a PR to add them here in the past, because of this library maintaining python 2.7 support.

I guess doing it in stub files means that'll all be fine though!

JohannesBuchner commented 3 years ago

I am wondering how to do this with the least amount of future maintenance effort. I am worried that the stub file may become outdated if not care is taken.

That's why I am wondering if it could be automatically generated, by parsing (perhaps even just text replacement) and injecting the types as needed. For example:

in `def __init__...:` the `:` should be replaced by `-> None:`
in `def __str__...:` the `:` should be replaced by `-> str:`
in `def __sub__...:` the `:` should be replaced by `-> int:`
in `def __eq__...:` the `:` should be replaced by `-> bool:`
in `def ...hash...(...):` the `:` should be replaced by `-> ImageHash:`
  except for __hash__, and crop_resistant_hash which returns ImageMultiHash.

within the brackets, the arguments can be split, and replaced by name matching:

image -> image: Image.Image hash_size -> hash_size: int = ...

An alternative is to keep the type annotations in comments in the source files just above each function signatures, and then extract these with a script into a type stub.

Another alternative is to use a program that can infer types. Perhaps https://google.github.io/pytype/ can do it?

nh2 commented 3 years ago

An alternative is to keep the type annotations in comments in the source files just above each function signatures, and then extract these with a script into a type stub.

You don't need to do that, you can add the type annotations directly into the Python source code. Python's type annotations are designed for that, and they are being ignored by the normal interpreter, and only used by typecheckers such as mypy.

I can very much recommend to use types, by the way. We found it made our Python code a lot more robust, and it makes libraries like this one easier to use for users (both when they are using a tool like mypy, and when they just read in the code what type a function argument has).

I've made the file from https://github.com/JohannesBuchner/imagehash/issues/151#issue-1058155790 more complete for my mypy use:

__init__.pyi.txt

JohannesBuchner commented 3 years ago

@nh2, imagehash is trying to support python2, which as far as I understand does not ignore these type annotations, but I may be wrong.

JohannesBuchner commented 3 years ago

aha, there is a way: https://mypy.readthedocs.io/en/stable/python2.html