holoviz / param

Param: Make your Python code clearer and more reliable by declaring Parameters
https://param.holoviz.org
BSD 3-Clause "New" or "Revised" License
412 stars 69 forks source link

Expanded range info shown in HTML repr #821

Closed jbednar closed 9 months ago

jbednar commented 10 months ago

This PR greatly expands the HTML repr's display of the range accepted by each Parameter:

Here's the new behavior:

import param

class q(param.Parameterized):
            z = param.Selector(objects=[1, 2], check_on_set=False)
            x = param.Number(allow_None=False)
            w = param.Number(default=1, allow_None=True)
            y = param.Number(default=None, doc="d")
            v = param.Number(default=1)
            u = param.Number(default=1, step=0.5)
            a = param.Number(default=lambda: 1)

class P(param.Parameterized):
            a = param.Selector(objects=[1, 2], check_on_set=False, doc="""
                This is a fairly long docstring that conveys little""")
            b = param.Number(allow_None=False)
            c = param.Number(default=1, allow_None=True)
            d = param.Number(default=None, doc="d")
            e = param.Number(default=1, readonly=True)
            f = param.Number(default=1, step=0.5)
            g = param.Number(default=lambda: 1)
            h = param.Number(default=1, bounds=(0, 2))
            i = param.Integer(bounds=(-1, 1), constant=True)
            j = param.Integer(bounds=(-1, 1), inclusive_bounds=(False, True))
            k = param.Integer(bounds=(-1, 1), inclusive_bounds=(True, False))
            l = param.Integer(bounds=(-1, 1), inclusive_bounds=(False, False))
            m = param.Integer(bounds=(-1, None))
            n = param.Integer(bounds=(None, 1))
            p = param.ClassSelector(default=q(y=3), class_=q)
            s = param.String("s t r")
            t = param.String("a22", regex="[a-z0-9]*")
p=P(name="V1")
p.param
image
jbednar commented 10 months ago

For comparison, here's the old behavior:

image
jbednar commented 10 months ago

Looking at more examples, I've found that the range for a List parameter is a bit confusing, displaying bounds that for a List defines a range of number of elements the list can contain, ignoring List.item_type which is probably more used.

Good point. Maybe we should use Python 3 type hint notation and put the type into the "Type" column? I.e. List[float]? Same for Dict and similar container Parameters?

in the Range column it'd be nice to display all/most of the constraints that are applicable to the Parameter, except the type as it can be inferred from the Parameter Type. If so, Range could be renamed Constraints?

Yes, that's what I was going for, to express as much of the constraints on the allowed values as I could. I was thinking of "range" in terms of mathematical functions, but I guess it's more "domain" in that sense, because it's a set of constraints on an input parameter. Not sure "Domain" will convey much to people, though. "Constraints" sounds pretty generic. "Allowed values"? Would be nice to have a single word. "Allowed"? Or, sure, "Constraints".

there should be a mechanism for Parameters to declare what they want to display in that column. This will be useful for Parameters created outside of Param (Panel has Aspect/Margin which now have no information in the Range column.

Yes, I had thought of that, but didn't want to bite off too much this close to release. I think we should add an extension mechanism like that, but it doesn't have to be right away.

I considered not showing a range for constant and read-only, since at least after instantiation users cannot set those, but on balance I think showing a range can still help people reason about how to use that parameter (e.g. a read-only parameter whose range is 1 to 10 can be treated differently from one with unbounded ranges.) I would be in favor of not showing the range/constraints for readonly. Understanding the difference between readonly and constant is pretty difficult, I think this could help a little.

Fine by me. @philippjfr or others, any vote? The questions are:

  1. Should a read-only Parameter show any range information? Reasons I can think of for No are based on a user of this object: the parameter is read only, and thus no user could ever change it, so it's distracting to list it. Reasons for Yes are based on a programmer writing code to work with this object: if a read-only Parameter is constrained to a narrow range, the code working with it can trust that range and doesn't have to deal with any other values.

  2. If we go with No for read-only, should we also go with No for instantiated Constant Parameters? The argument for No is similar to for read-only, in that for this instantiated object, the value cannot be changed, so showing the range is confusing. The argument for Yes is that sure, this particular value can't be changed, but you can always instantiate a new one, and when you do, this is the range that would apply.

  3. If we go with No for instantiated constant parameters, should we also go with No for the HTML representation for the class? Here I think the only argument for No (not showing range info) is to match cases 1 and 2. The argument for Yes is that the value is changeable at the class and new instance level, and users want to know that info.

My votes for showing range info in the cases are 1 (weak Yes), 2 (weak Yes), 3 (strong Yes). I think Maxime is arguing 1 (No), 2 (Yes), 3 (Yes).

Also I find that having readonly/constant in the Range column reads like they affect the range which is misleading, renaming the column name to Constraints or moving readonly/constant to the Type column would help I think.

If we go with No for 1 above, then "readonly" would be the only item in the range, which seems fine. But that doesn't solve "constant", and I agree it's a bit confusing to put that info into the range, but I don't want a separate column just for that, because it's rare for all parameters but name, and don't want to waste space on it. Moving it to Type would be ok, or sneaking it to the beginning of the doc field might also work (as if it's a note in the docstring.)

I don't consider any of these changes to be a blocker for Param 2.0, I expect the HTML repr to change quite a lot when we are going to start using it to build the API/reference docs of Panel/HoloViews.

Agreed. That said, these were what I considered the "easy" or non-controversial improvements (apart from including the docstring, which I am happy to back out), and I wanted to get them locked in before working on https://github.com/holoviz/param/issues/823. #823 is more important, because it strongly affects not just the details of what's in the HTML, but when it appears and how we use it. So this PR is meant to keep these more minor questions separated from #823. It's ok to punt on this one if it's turning out to be time consuming to review.

maximlt commented 10 months ago

I went back and forth reviewing this and trying to reply to your last comments @jbednar.

I first wondered whether we should actually display that range in the repr and if we were not conflating the repr with the help too much. Because Parameter attributes are actually writable, and they're of course pretty useful, I figured out having them in the repr is alright.

Actually, None is a different type, so I suppose we really ought to be putting it into the Type column! I.e. the type isn't really Integer, it's Integer | None! I

That works okay for the Integer Parameter but in my eyes would look weird for Parameters that don't map to a Python type, e.g. FileSelector | None.

str | None was suggested In the discussion on how to display the default regex of String, which made me realize that we quite naturally fall into relying on Python type hints, and that's not bad! As indeed Python users are being more and more used to see them. However more expressive type hints have become over the last Python versions, they still can't express all the constraints offered by Param Parameters (that is not entirely true, you can actually stuff arbitrary expression in the type hint leveraging Annotated, that's for instance relied upon by Pydantic, so you could display something like Annotated[str, regex('*.com')] | None). So my suggestion for the Range/Domain column would be to display first as much as possible information relying on type hints, followed by what could not be expressed using type hints:

Here's how that could look like, the first row being the pattern to follow, followed by a couple of examples:

Type Range
:ParameterType: :Type Hints:, :Attributes:
Integer int | float | None, bounds=(0, 10)
List list[int], constant=True, bounds=(2, 2)
ClassSelector type(dict) | type(tuple)

The work needed to make a Parameter to type hints has been started by Philipp in https://github.com/holoviz/param/pull/677. Because the data displayed can be quite long, it should be truncated above a given length, the full version displayed in a tooltip.

I changed my mind about readonly/constant(instance) not showing the range data, I wouldn't be surprised if there were cases where knowing the value and Parameter type wouldn't be enough information to infer the object type (thinking of ClassSelector). With my suggestion above readonly and constant are displayed in the attributes part of the data (note the existence of typing.Final that could be leveraged to express some form of read-only attributes, so for readonly=True and constant=True when displaying the repr of an instance, however I'm not sure there's a type hint to express that a class attribute can be set in the constructor but no longer modified).

jbednar commented 10 months ago

I'd like to move this forward; it's time to ship!

I agree it would be nice to use Python 3 type info when appropriate, particularly if we can start supporting that in Param itself, which we should. But that seems daunting, and I don't want to hold this up just to get to that.

In any case, I agree with @maximlt that I've been conflating the repr with the help all this time. For the help I'd like all the info. For the repr it should be compact, and I don't actually see much reason to have type, range, or doc in the repr, since the repr is meant to convey "what is this" not "how can I configure this". I deeply do want this fully detailed HTML representation, primarily for use in the docs, but it's really help, not a repr!

So for the repr, what if we transposed it and just show key:value, to keep it compact?

image
maximlt commented 10 months ago

I have to say I'm not much enthusiastic about the last proposal, the table being transposed will make it challenging to nicely render Parameterized objects equipped with lots of Parameters. It's also just a slightly fancier version of a dict repr. No really I'm afraid I don't see much benefit of going down that route.

I largely prefer the previous version (current on this branch):

image

To which I'd suggest the following changes:

I would like to preserve Type and Range, as without those I don't see much point in having an HTML repr at all.

We've already decided that for now the HTML repr would only be available on .param, not directly on an instance. This allows us to ship it and iterate until we're satisfied enough to possibly promote it to being displayed also on an instance, even a class.


Trying not to conflate the repr and help has been tricky indeed. I assume we were driven by the idea of having a pretty HTML display of a Parameterized object we can embed into our docs. Together with @droumis we recently chatted about the API/Component pages of Panel docs and how to improve them.

The pages of the Component gallery have been manually written, which even if it's a pain to maintain, has the advantage of being more user-friendly than a pure API reference. Take for instance the gallery page of the Button widget:

image

Given our requirements, it is pretty clear to me that the default HTML repr will not be a good fit. The approach we'd take either would be:

jbednar commented 9 months ago

Given our requirements, it is pretty clear to me that the default HTML repr will not be a good fit. The approach we'd take either would be:

re-use/adapt some Sphinx directive to automatically display a Parameterized class as we want for Param to provide some API to customize its HTML view so that it can be customized enough to meet our requirements.

I strongly vote for the latter, such that our Sphinx setup can provide whatever customization that we need, but that our docs and what is available to the user at the CLI or notebook cell level (presumably through help(), assuming that can be made HTML) are at least roughly the same, so that throughout the docs we pull data directly from the objects, and simultaneously demonstrate to the user how to pull data directly from the objects. I.e. the docs should have the comprehensive info (filtered as we wish) and should consistently show the users how to get that info from whatever object they have. That way when they read our docs, the info is real and complete and up to date, while people walk away from the docs knowing how they can get that same type and level of info when they are working interactively.

maximlt commented 9 months ago

I've just pushed a number of changes:

image
jbednar commented 9 months ago

Seems like nullable matches the formatting of readonly and constant better, and is an adjective rather than a verb. A verb makes sense when defining a Parameter, but less so in this context. Anyone have a preference between nullable and _allowNone?

jbednar commented 9 months ago

Is it helpful to show regex(.*)? I'm not seeing any case above where that adds information; can we just omit that? Same for (-Inf, Inf); can't we just omit that? In both cases that's the default and what people should be expecting anyway.

maximlt commented 9 months ago

I have no preference between nullable and allow_None.

Indeed the default regex and (-inf, inf) are not useful.

jbednar commented 9 months ago

Implemented those two suggestions and changed the alignment, both of which I think improve readability:

image image image
jbednar commented 9 months ago

Ok, last question before merging! In a289244 I switched from mathematical range notation like "[0,2)" to something closer to the Python expression you'd use to test it, like ">=0, <2". The math version is prettier but the code version is probably clearer for programmers, plus for the most common bounds case (non-negative values) it seems highly readable: "nullable >=0" or just ">=0"). I don't have a strong preference, so I'm happy for @maximlt or @philippjfr to merge it as is or to revert a289244 and then merge. Go for it!

image image image
jbednar commented 9 months ago

BTW, the formatting for embedded Parameterized objects isn't great, but I don't know if we can do much about that given how HTML details/summary works (note the stray "ClassSelector" overlapping from the next column):

image
maximlt commented 9 months ago

Thanks @jbednar this looks good to me!