igraph / python-igraph

Python interface for igraph
GNU General Public License v2.0

feat: provide type stubs #450

Open adriangb opened 3 years ago

adriangb commented 3 years ago

This is a C wrapper, so understandably a lot of the arguments and types are unknown to Python. This can be solved (at least for IDE users) by providing type stubs, which are basically like Python header files. This benefits users because:

  1. They can run static analysis on a codebase containing this library.
  2. They can use the IDE's autocompletion instead of constantly having to refer to the docs.

This can even be done incrementally, e.g. by shipping one class at a time; it's not an all-or-nothing thing. And any mistakes in the stubs don't change runtime behavior, so it's okay to just ship "this returns an iterable of Any" and then later come back and say "it's a list of xyz".
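To make the incremental idea concrete, here is a minimal sketch of what such a stub could look like. The `Graph` class, its methods, and the file name are hypothetical placeholders for illustration only, not the actual python-igraph API. Stub files use ordinary Python syntax with `...` as the body, so the snippet is also valid as a regular module.

```python
# Contents of a hypothetical stub file, e.g. "fastgraph/_core.pyi".
# All names here are illustrative assumptions, not python-igraph's real API.
from typing import Any, Iterable, List


class Graph:
    def __init__(self, n: int = 0, directed: bool = False) -> None: ...

    # First pass: loosely typed, but already enough for autocompletion.
    def neighbors(self, vertex: int) -> Iterable[Any]: ...

    # A later release can tighten a return type like this one
    # without touching the compiled implementation.
    def degrees(self) -> List[int]: ...
```

A type checker or IDE reads these signatures; the C extension itself never sees them.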

iosonofabio commented 3 years ago

This is a fair request.

However, I suspect we are all too busy to actually implement this - not to mention maintain it. In fact, creating and maintaining hundreds of stubs by hand sounds quite daunting, because it is yet another moving part. For instance, if we change a function's signature, we currently have to:

If we go ahead with this manually, we'll have another item on that long list:

In addition, we are considering revamping the whole Python interface using a code generator such as pybind11 or SWIG. I'm wondering how this feature would survive that transition.

@adriangb are you aware of any software that can help with creation and/or maintenance of the stubs? I suspect that would make or break this feature request...

adriangb commented 3 years ago

I admit I'm not very knowledgeable in the realm of stubs for extensions, but I think there are some tools out there, e.g. for pybind11: https://github.com/pybind/pybind11/issues/2350

iosonofabio commented 3 years ago

For future reference:

@adriangb if you want to take a shot at exploring these possibilities, you could fork this repo and see how far you get. I'm sure that would greatly speed up the adoption of this feature. Otherwise, one of us might pick it up at some point in the future, but it might take a while.

adriangb commented 3 years ago

May give it a shot, not for the next couple weeks though. Thanks for considering the feature!

vtraag commented 3 years ago

If we use our own code generator Stimulus, we might be able to provide type hints directly in the generated code. That way we wouldn't need additional tools to generate separate stubs. But we'll first have to explore the code generation further to see if we can get Stimulus to work, and to see if that is actually the best choice.
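As a rough illustration of the generator idea: if the bindings are produced from a machine-readable description of each function, the same description can drive the type annotations. The sketch below assumes a made-up spec format (a plain dict) and a hypothetical `render_stub` helper; it does not reflect Stimulus's actual input format or output.

```python
from typing import Dict, List, Tuple

# Hypothetical function descriptions: name -> ([(param, type), ...], return type).
# This is NOT the real Stimulus format, only a stand-in for the sketch.
Spec = Dict[str, Tuple[List[Tuple[str, str]], str]]

SPECS: Spec = {
    "vcount": ([], "int"),
    "neighbors": ([("vertex", "int")], "List[int]"),
}


def render_stub(class_name: str, specs: Spec) -> str:
    """Emit annotated method signatures for one class as stub source text."""
    lines = ["from typing import List", "", "", f"class {class_name}:"]
    for name, (params, ret) in specs.items():
        args = "".join(f", {p}: {t}" for p, t in params)
        lines.append(f"    def {name}(self{args}) -> {ret}: ...")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_stub("Graph", SPECS))
```

Because the signatures come from the same source as the bindings, a signature change would not add a separate manual step.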

ntamas commented 3 years ago

@adriangb Thanks for the feature request, this is something that I've personally had in mind for a long time now and I plan to implement it at some point in the next two years (this is the time period that is covered by igraph's recent CZI grant, the purpose of which is to update the higher-level interfaces to the standards of 2021). I don't know yet when I will be able to start working on it, but type stubs are definitely on my list. Any exploratory work that you (or anyone else) can do before I get my hands dirty is very welcome, even if a particular approach turns out to be a dead end as at least I will then know how not to do things :)

One thing that seems to be sure for the mid-term is that the Python interface will eventually switch from hand-crafted C bindings to generated ones. Generated code is not as flexible as hand-crafted code, though, so it seems like there will be a generated C-to-not-so-friendly-Python binding written in C, then another layer of not-so-friendly-Python-to-friendly-Python abstraction on top of that, written in Python, to cover the parts where the plain C wrapper is too unpythonic. Type annotations can then probably be added at the Python layer, and the generated C wrapper under the hood can remain untyped as the user would not interact with that directly.
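A minimal sketch of that two-layer layout, under stated assumptions: `_GeneratedGraph` is a pure-Python stand-in for the generated, untyped C wrapper, and the typed `Graph` class is the layer users would import. Both classes and all method names are hypothetical and chosen only for illustration.

```python
from typing import Any, Iterable, List, Tuple


class _GeneratedGraph:
    """Stand-in for the generated, untyped low-level binding (hypothetical)."""

    def __init__(self, n, directed):
        self._n = n
        self._directed = directed
        self._edges = []

    def add_edges_raw(self, flat):
        # Mirrors a C-style API: a flat [from0, to0, from1, to1, ...] vector.
        self._edges.extend(zip(flat[0::2], flat[1::2]))
        return 0  # C-style status code, 0 meaning success

    def degree_raw(self):
        deg = [0] * self._n
        for u, v in self._edges:
            deg[u] += 1
            deg[v] += 1
        return deg


class Graph:
    """Friendly, typed layer; the only part that carries annotations."""

    def __init__(self, n: int = 0, directed: bool = False) -> None:
        self._impl: Any = _GeneratedGraph(n, directed)

    def add_edges(self, edges: Iterable[Tuple[int, int]]) -> None:
        # Translate the Pythonic argument into the flat form the low level expects.
        flat = [v for edge in edges for v in edge]
        if self._impl.add_edges_raw(flat) != 0:
            raise RuntimeError("adding edges failed")

    def degree(self) -> List[int]:
        return list(self._impl.degree_raw())


g = Graph(3)
g.add_edges([(0, 1), (1, 2)])
print(g.degree())
```

The annotations live only in the Python layer, so the generated part can be regenerated without touching them.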

adriangb commented 3 years ago

> the Python interface will eventually switch from hand-crafted C bindings to generated ones. Generated code is not as flexible as hand-crafted code, though, so it seems like there will be a generated C-to-not-so-friendly-Python binding written in C, then another layer of not-so-friendly-Python-to-friendly-Python abstraction on top of that, written in Python, to cover the parts where the plain C wrapper is too unpythonic. Type annotations can then probably be added at the Python layer, and the generated C wrapper under the hood can remain untyped as the user would not interact with that directly.

That sounds reasonable. If you do switch to generated code, I agree that having a layer of abstraction would be a good call. Working directly w/ generated C code from Python is often a terrible user experience. Adding the type information in that abstraction layer would be ideal (probably better than stubs).

To be clear, the original request was for some sort of type annotation, be that stubs or in source code, it doesn't really matter.

That said, the nice thing about stubs is that they let you "overwrite" entire APIs without changing the implementation. This can be used to declare an API for an untyped package (perhaps because it's generated from C) without writing an abstraction layer, which can also be good for performance. I believe the original use case was to provide typing information for the C parts of CPython without writing Python wrappers.