holoviz / param

Param: Make your Python code clearer and more reliable by declaring Parameters
https://param.holoviz.org
BSD 3-Clause "New" or "Revised" License
422 stars 73 forks source link

Please support static typing #376

Open MarcSkovMadsen opened 4 years ago

MarcSkovMadsen commented 4 years ago

My Pain

I'm developing my applications in the VS Code editor. Furthermore I use pylint and mypy to help me create code of high quality. And google style docstring for documentation.

Param unfortunately does not work that well with that.

I've tried something like that.

class Scatter2dWithSelectionComponent(param.Parameterized):
    """The Scatter2dWithSelectionComponent enables a user to

    - provide a DataFrame of data

    Parameters
    ----------
    data : pd.DataFrame
        A dataframe of data to be shown and selected
    """
    data: pd.DataFrame = param.DataFrame(precedence=0.1)

But

Solution

Support static type checking and enable all the help you can get in modern editors.

jbednar commented 4 years ago

We have been focusing on simultaneously supporting Python 2 and Python 3 in the same codebases for the past 5 years, so we haven't been able to make use of any Python3-only syntax like type hinting in our own code. Once we are finally able to leave Python 2 behind, we can become experts in Python 3-only stuff, but for the moment I'm not actually sure what the above type annotation is meant to indicate. At the class level "data" is not a pd.DataFrame; it's a param.DataFrame object, so I'm not sure that's the correct type annotation to supply. But in the instance it will be a pd.DataFrame, which is what matters for instance.data.iloc.

I really don't know if Python 3's type hints are expressive enough to distinguish between type differences between the class and instance levels. So I can't say if this is something that is supported now but with a different syntax, if it can be supported but would require changes to the ParameterizedMetaclass, or if it is simply not possible given the type differences between Parameterized classes and instances (which are inherent to Param's design). Personally, I've found Parameter's type declarations to be vastly more powerful than static types (as they allow specifying ranges and allowed values and not simply type name), but it sounds like the fact that the type is static is something your editor can exploit, and of course the editor doesn't know about Param. Anyway, I wouldn't personally be ready to tackle this issue until we've completely left Python2 behind for good, but if someone else can see a way, be my guest!

MarcSkovMadsen commented 4 years ago

Thanks for the discussion.

I understand that that is a matter of resources and priority. I just wan't to raise it because it's important to the way I work and people working similarly.

I actually like param a lot. That is part of why I like Panel. I have no experiences from python 2 :-)

MarcSkovMadsen commented 3 years ago

I believe this link explains how to add type annotations to descriptors and thus to Param. https://stackoverflow.com/questions/57400478/using-typing-and-mypy-with-descriptors.

jbednar commented 3 years ago

I have no experiences from python 2 :-)

Lucky you! :-)

I'm happy for param 2 to move away from Python2.7 support. Param 2 can happen this summer; it would be the next big thing after we have proper docs (as it basically requires deleting whatever's not in the docs as being no longer supported!) I.e., it's not a big jump from where Param is now to 2.0; it's just an agreement to clean out a bunch of stuff and not look back. So yes, if we can make use of type hints at that point, great!

jbednar commented 2 years ago

https://atom.readthedocs.io/en/latest/basis/typing.html might be helpful for understanding how a similar library moved to start including py3 types. Param 2.0 is very close to release, and that's when we can drop py2 and consider how to move forward. I'd also be interested in seeing if there is some lightweight way to decorate a py3 dataclass to use its types in Param, as a limited but useful entry into Param's functionality using py3 type hints.

MarcSkovMadsen commented 2 years ago

For Param to be really easy to understand and use in a modern IDE lots of type annotations and just getting rid of all the *'params would be need.

Example - Number Parameter

The arguments of the Number.__init__ method would need 1) type annotations 2) docstrings describing the arguments

image

Example Integer Parameter

The __init__ method would need named arguments instead of the **params and 1) + 2) from above.

image

Would you be open to accept that kind of PRs @jbednar ?

philippjfr commented 2 years ago

For Param to be really easy to understand and use in a modern IDE lots of type annotations and just getting rid of all the *'params would be need.

Can you expand here, yes it might make declaring parameters easier but that's not really the important case right? Figuring out the types of the parameters themselves (like a dataclass) is what we should optimize for no?

As for this concrete question, I find the idea of repeating all arguments for a parameter whenever you subclass a parameter to be kind of crazy and ugly. Is there really no mechanism for Python typing modules to figure out types from the arguments accepted by the baseclass and the subclass?

philippjfr commented 2 years ago

Specifically why I find this so crazy and ugly is that it introduces a significant risk of argument definitions to become outdated/drift out-of-sync. Adding an argument on the baseclass which is then not also declared on all subclasses is a significant issue that is avoided by passing through **params.

maximlt commented 2 years ago

Would you be open to accept that kind of PRs @jbednar ?

Unfortunately this will have to wait for at least one more release though, i.e. until the moment it is declared that the next release will be Param 2.0.

As for this concrete question, I find the idea of repeating all arguments for a parameter whenever you subclass a parameter to be kind of crazy and ugly. Is there really no mechanism for Python typing modules to figure out types from the arguments accepted by the baseclass and the subclass?

I also find that a bit disappointing, I haven't yet found a solution to this. See this related discussion: github.com/python/typing/discussions/1079

philippjfr commented 2 years ago

Thanks for that reference @maximlt. I would have no particular objection to a @copy_type decorator as outlined in that issue.

maximlt commented 2 years ago

And two other (long) issues on typing for reference:

MarcSkovMadsen commented 2 years ago

declaring parameters easier but that's not really the important case right

I would say it is. You probably know the arguments of Parameter and any derived class by heart. But a newbie or one who does not use Param all the time does not. Its so much easier for them to understand Integer if the editor provides helpful tooltips telling that readonly is an available argument, what it does and what type is should be.

philippjfr commented 2 years ago

Totally fair and I'm not objecting at all to typing of the Parameter parameters. Just was attempting to clarifying the distinction between typing Param(eters) and typing of a Parameterized, which is what the issue was originally about.

maximlt commented 2 years ago

I agree with that too! Adding Numpy docstrings to each Parameter is something I would like to do, I was basically waiting for Param 2.0 to be the next milestone, to be able to refactor the whole code base and add docstrings all over the place.

MarcSkovMadsen commented 2 years ago

Is there really no mechanism for Python typing modules to figure out types from the arguments accepted by the baseclass and the subclass?

mypy provides a plugin mechanism. But unfortunately they only work when statically typing your code using mypy script.py.

MarcSkovMadsen commented 2 years ago

My guess is that when you define a custom __init__ function you have to provide all arguments and a good docstring.

How would mypy, pyright or other tools know that you did not intend to control the signature and docstring of the __init__?

At least that it the behaviour I see right now when running stubgen on a dataclass.

image

My conclusion is that the best future for param and the rest of the holoviz suite would be do be able to autogenerate stubs because then we can much better control and provide useful information. We have so much more useful information that can be autogenerated, than what I see for example PEP 681: Data Class Transforms providing providing.

MarcSkovMadsen commented 2 years ago

I have spent hours trying to understand the documentation, github discussions and other docs around typing. But as I see it you either need to be rather experienced in this area or maybe spend a month before you really understand what direction Param should take.

Do we know anybody who's an expert in this area?

philippjfr commented 2 years ago

One perspective that may help would around what @mattpap has done and intends to do, for typing of the Bokeh property system. That fairly closely mirrors what Param provides although with a focus on serialization.

MarcSkovMadsen commented 2 years ago

I would really, really like to find a solution on this. I can see how much it means for my productivity now that Panel ships with some typing support and especially much improved docstrings.

Before everything had to be in your head or looked up in the documentation. Now the documentation is there as a nice tooltip and if its not enough I can easily click the reference link. Its such a mental relief, its so easier to discuss the code with colleagues and it feels so much more modern.

image

jbednar commented 2 years ago

I would have no particular objection to a @copy_type decorator as outlined in that issue.

Same; using @copy_type to mirror types from a base class is a bit ugly, but it's acceptable because it is automatic and doesn't let the two definitions get out of sync. So I'd be ok with accepting PRs that add @copy_type to provide typing for constructor arguments in param, if there is a demonstrated benefit (which should be shown off in the PR's text).

philippjfr commented 2 years ago

Right now the only thing that can actually work would be the following:

  1. We add pytype properties to all parameters
  2. We add a utility that takes a Python module as input and rewrites it so that all parameters are annotated by type based on the pytype properties
  3. We provide a gh-action (and pre-commit hook) that checks that the parameter definitions match the type definitions

If we need to support an actual runtime checker things get a lot more involved because it will realize that param.Integer() is not actually of type int.

jbednar commented 2 years ago

If we need to support an actual runtime checker things get a lot more involved because it will realize that param.Integer() is not actually of type int.

Depends -- at the instance level it will actually be int.

philippjfr commented 2 years ago

Actually you're right, even at the class level that is true at runtime.

tomascsantos commented 7 months ago

Hi all, I've been scanning the issues and I'm trying to understand where we've landed with this? I really love the watcher / callback features of param, but it is a HUGE drawback that my code essentially requires users to read the source to understand how to use it.

Example of useless docstrings: image Or I have to use insanely redundant definitions to write self-documenting code: image

Since the above is obviously unmanageable, I resort to having terrible docstrings in all my code which make adoption incredibly difficult. I can see @MarcSkovMadsen has really been pushing to fix these docstrings, so I just want to add my vote and really stress how important this is. I cannot think of any other feature besides stability that would be as important as nailing these docstrings / type hinting.

To be clear the question here is where are we now? And I'll add that I'm hesitantly interested in chipping in some manpower if there is a concrete plan of action. Thank you!

LecrisUT commented 1 month ago

Hi all, I've been scanning the issues and I'm trying to understand where we've landed with this? I really love the watcher / callback features of param, but it is a HUGE drawback that my code essentially requires users to read the source to understand how to use it.

The attribute documentation is rather an issue with PEP224. PyCharm and sphinx support this. I have played around with trying to add native docstring to attrs and concluded it is impossible (the docstring is getting attached to the parent type instead if even possible). Another nice tool is dostring_parser which can extract at least the static docstrings, including PEP224 format.

Otherwise for type-annotation and docstring inheritance attrs is a good reference to study I would say. Hope these comments help.

LecrisUT commented 1 month ago

As I have been learning how to interact with param, I've been thinking of how to best tackle this issue. I think this needs to be separated in a few chunks. It would also help those who are subscribed to this issue track what's the progress on this.

Step1: Make the base types generic

First the basics of type-annotation should be implemented: adding py.typed, CI for mypy, adding everything to mypy.exclude. Including ruff to the CI would also help for contributors to maintain some consistency.

A first target to convert is param.Parameterized.Parameter to a Generic linked by default type.

Step2: Convert the reactive expression

By priority I would say this is lower than the next step, but this one is much easier to achieve, since it primarily revolves around tpye-hinting ._obj and .value.

Step3: Start exposing the Generic type's attributes

Stuff like __getattributes__, __setattr__ are valuable to expose for the IDE to process. I am not sure at this point how to expose these

Step4: Convert the Parametrized

This one might be the most complex one, e.g. annotating __init__ and such. It might also be good to consider dataclasses and attrs approach using decorators instead of class inheritance.