holoviz / holoviews

With Holoviews, your data visualizes itself.
https://holoviews.org
BSD 3-Clause "New" or "Revised" License
2.68k stars 402 forks source link

Make Holoviews docstrings meaningful for users #5361

Open MarcSkovMadsen opened 2 years ago

MarcSkovMadsen commented 2 years ago

Request

Add docstrings that can help inform the user and ease the learning curve.

Motivation

I am a user of Panel and sometimes HoloViews. I have a very hard time understanding HoloViews. It might be my own fault as I might be able to study the documentation. But that is not the way I work. I am working in my editor VS Code and use the intelligense/ tooltips and tab completion. I expect to have useful information available in my editor.

I find HoloViews very, very difficult to understand. The docstrings are not helpful at all. They seem to be written for and by the developers. Not the users of the framework.

Code Example

I am trying to help a user here https://discourse.holoviz.org/t/cannot-delete-boxedit-after-adding/3985.

import skimage
import numpy as np
import holoviews as hv

from holoviews import opts
from holoviews import streams
hv.extension('bokeh')
data = skimage.data.brain()
ds = hv.Dataset(
    (np.arange(256), np.arange(256), np.arange(10), data),
    ["x", "y", "num"],
    "count",
)
polys = hv.Polygons([])
box_stream = streams.BoxEdit(source=polys)

def stats(data):
    if not data or not any(len(d) for d in data.values()):
        return hv.Text(128, 128, "N/A")

    x0, x1, y0, y1 = data['x0'], data['x1'], data['y0'], data['y1']
    return hv.Text(128, 128, f"x0={x0}\nx1={x1}\ny0={y0}\ny1={y1}")

dmap = hv.DynamicMap(stats, streams=[box_stream])

im = ds.to(hv.Image, ['x', 'y'], dynamic=True)
ds.to()

plot = (im * polys + dmap).opts(
    opts.Curve(width=400, framewise=True), 
    opts.Polygons(fill_alpha=0.2, line_color='red'), 
    opts.VLine(color='black')
)

hv.Dataset

The docstring is simply incomprehensible. It does not answer the most simple questions like

It does not provide a reference example or a quick link to the documentation. I cannot use this for anything. And the hv.Dataset is the starting point for most things in HoloViews !!!

image

hv.Polygons

Wtf. It gives me the exact same docstring !!!

image

hv.DynamicMap

Wtf. It gives me the exact same docstring !!!

image

BoxEdit

There is no description of what I can use BoxEdit for. There is no description of any of the arguments I can see !!!

image

Dataset.to

What is this? How do I use it?

image

opts

This is the most useful docstring in the example. But it would be nice if it assumed even a bit less of the user. What is an option? And why are options separate?

image

opts.Curve

Thanks for nothing :-). When I see something so empty I am in doubt if I am looking at a class that is not important/ should not be used.

image

holoviews/ hv

This is the entry point. This is where all users will start. It has NO docstring!!!

What is this? How can it help me? What does a simple example look like? Where do I find more information?

image

MarcSkovMadsen commented 2 years ago

Another example is the Dataset.range. The docstring is there but needs more love for non-super users of the framework.

For example the dimension is this a number or a string name? More text and type annotations would be super useful. An example would be super useful.

If I have to find lookup things in the documentation, it gets so slow and gives me mental overhead.

image

jbednar commented 2 years ago

Adding docstrings more consistently to both HoloViews and Panel would be an excellent project for someone to take on. It's a big job, though; it would be nice to have some idea of the priority items that would help the most. The above seems like a few scattered items; is there some way to list the level where you have the most questions or the specific set of classes or files or functions that need the most love? Without that it's hard to get started.

It gives me the exact same docstring

I assume that means the docstring is missing and is inherited from a superclass.

MarcSkovMadsen commented 2 years ago

I spent maybe two 2 days on adding docstrings to Panel. And at least for me the experience of using Panel totally changed. Much easier to use. Much less mental overhead. So much more fun.

image

image

image

Having an intro, a reference example and a link to the extended documentation makes things so much easier.

I don't think this is an overwhelming effort to do for HoloViews. The documentation is probably there in the documentation notebooks, in other docstrings.

jbednar commented 2 years ago

Yes, for the concrete classes like Curve the docs are in notebooks, and I'd welcome PRs that express basic concepts from the notebooks as docstrings. The notebooks won't document abstract classes, but I guess there can't be that many abstract classes, so maybe @jlstevens could take a pass on those.

MarcSkovMadsen commented 2 years ago

The end user would not care about the documentation for the abstract classes. What matters is the documentation of the classes they use.

jbednar commented 2 years ago

Well, your other issues today were about abstract class docstrings, so...

MarcSkovMadsen commented 2 years ago

Nope :-) The problem is that concrete classes don't have specific, useful docstrings. Instead they shared the docstring of an abstract base class.

jbednar commented 2 years ago

Dataset (above) is an example of a base class that you're requesting docs for that needs an internal maintainer like @jlstevens to document; the documentation scattered in existing notebooks won't directly cover what it does.

MarcSkovMadsen commented 2 years ago

I've made a few example PRs to illustrate how I would do this and how useful it is. Please give me some feedback there. Thanks.

maximlt commented 2 years ago

As far as I know Dataset isn't an abstract class.

I agree that having better docstrings is always a good idea. What you did with Panel docstrings, i.e. embedding a link to the Reference Gallery page could also be done for HoloViews, I've found that useful.

It might be my own fault as I might be able to study the documentation. But that is not the way I work. I am working in my editor VS Code and use the intelligense/ tooltips and tab completion. I expect to have useful information available in my editor.

However, I might be wrong but you seem to say that the way you work is never to rely on any documentation website but only on what you can get in VS Code. I believe this is very unusual, and even if tooltips can prove practical in an editor like VSCode, there is no way you can get as much information in a tooltip compare to with a proper website (tooltips can't be that long otherwise they're unpractical, you can't search tooltips, etc.). Also, HoloViews is a library for visually exploring data, and documenting that is best done on a website (maybe one day you'll be able to get interactive Bokeh plot in a VSCode tooltip, not yet though). Lastly I have no data to back up this but my guess is that most users of HoloViews are in a Jupyter Notebook, which may not be the case for Panel. So docstrings changes made to HoloViews shouldn't make the experience of notebooks users worse.

I'd be interesting to investigate what docstring experience you get in VSCode with other viz libraries like seaborn and altair (7M downloads/month on PyPi).

jbednar commented 2 years ago

Right; Dataset is a base class, but not an abstract base class. It's useful as a concrete class that encapsulates certain functionality, and also important as the base class that implements that functionality for Elements. Thus many people need to know about it, and it's mentioned in the docs in various places, but if someone needs to know all about it, they go to the docstring and find almost nothing, and there's no reference manual notebook about it since it's not an Element or Layout. So it falls in a hole between our notebooks (all covering user-relevant objects) and abstract base classes (not needing notebooks since most people don't need to know about them).

I agree that the website is an important way to explore a graphical system like HoloViews, but I do sympathize with Marc's quest to have information right at his fingertips in an editor, just as I'd like to have such information right at my fingertips in Jupyter. I think having an informative docstring addresses both needs. It's true that in Jupyter we can probably handle much longer and richer docstrings, but no matter what, the current 3-sentence docstring is insufficient for such an important class.

jlstevens commented 2 years ago

Looking at Marc's examples, I do think the first Dataset example will always be a little confusing to explain due to its generality. That doesn't mean I don't think the docstring can't be improved, it can definitely be made more user friendly (and less developer focused).

I definitely think Polygons and other elements should have nice, consistent and user friendly docstrings. opts can be improved as you point out but things like opts.Curve are more tricky as they are dynamically generated based on the available elements.

In short, I think there are some easy wins in terms of docstring improvements in those examples screenshots (should be a relatively easy PR!), but some are a definitely trickier to implement.

maximlt commented 2 years ago

An easy win is having a link to the reference page in the docstring, this has been added by Marc to Panel and has proven to me pretty useful, Simon can confirm.

MarcSkovMadsen commented 2 years ago

I would be really happy to contribute links in docstrings to reference notebooks. For me they are a game changer in productivity when using Panel. Its such a joy to write some code and quickly jump to the reference example for more info.

MarcSkovMadsen commented 2 years ago

Here is another example where I trying to help a user https://discourse.holoviz.org/t/best-practices-for-interactive-layouts/4015

This use case would be similar to me reviewing or having to work with code created by colleague where I am not familiar with the framework or code used. In my work life we don't have the time to go read some web lengthy web documentation if we are just trying to change something slightly or fix a small bug.

The docstrings and type annotations of Holoviews does almost nothing to help me.

Dataset

Hmm. What is a dataset?

image

.select

Its relatively well documented. There is no example for categorical/ text data. That would have been nice. But ok

image

.to

My guess before seeing the docstring was that I will get ahv.Curve with kdims=_time and vdims=sig_value. But after reading the tooltip I'm not so sure any more.

image

Please don't return private objects. Please make docstrings useful for the end user and not the developer.

.overlay and .collate

A tooltip would maybe have told me what this does in seconds. Now I have to spend 1-10 minutes googling and looking up documentation.

image

image

Additional Comment

It is very, very important for productivity that you are able to easily read and understand the code.

Code is read more often than it is written. Guido Van Rossum.

MarcSkovMadsen commented 2 years ago

I tried looking at some Plotly examples. The deeply nested code I tested did have useful docstrings and type annotations.

MarcSkovMadsen commented 2 years ago

I looked at Altair. Its docstrings is of higher quality than Holoviews.

But it also has issues as it does not document its arguments in docstrings. And its type annotations are not perfect - Their api should be simple to annotate and provide tooltips for as they very often return self or self.copy(deep=True).

mycrickets commented 2 years ago

I would like to assist with this ticket, if possible. thanks.

MarcSkovMadsen commented 2 years ago

Hi @CMRicketts

Welcome to HoloViz. Sounds great. While waiting for suggestions from @philippjfr my suggestion would be that

But its up to you :-) Find something that you are really motivated to improve and start working on it. Do a small PR initially to get feedback and alignment. Then you can make a bigger PR.

Feel free to connect on LinkedIn https://www.linkedin.com/in/marcskovmadsen/?originalSubdomain=dk or Twitter https://twitter.com/home