holoviz / holoviews

With Holoviews, your data visualizes itself.
https://holoviews.org
BSD 3-Clause "New" or "Revised" License

Proposal replacement for opts magics #3095

Closed jlstevens closed 5 years ago

jlstevens commented 5 years ago

Over the years we've had many proposals for how to specify options with the options system: many that we discussed and never implemented, and many that we did implement. Now I am adding one more proposal that I think gets us 99% of where we want to be. :-)

We started with the line and cell magics, with multiple internal formats to represent the nesting; then we added __call__, then we renamed __call__ to opts, then we added .options (flat namespace).

In addition we have the hv.opts utility, which operates as a pure-Python replacement for the magics. It is safe to say that this is all quite confusing, and we want one good way that we can recommend and document (and eventually deprecate the other ways).

This all comes down to a fundamental problem with specifying nested structures to style nested, compositional objects. Even though we now encourage the use of the .options method, we don't have a solution that satisfies all these goals:

  1. Pure Python syntax. Obviously magics fail at this test.
  2. Separation of presentation from content. Magics are good at this, with the options always found at the top of the cell. Currently, .options often is not, as you need to refactor your code to apply .options at the right level.
  3. Tab completion. Magics are good at this, .options is not.
  4. Ideally something that lets us easily map old magics to the new approach (related to goal 2).

The closest thing to a recommendation right now is to declare a set of dictionaries at the top of the cell and then apply .options appropriately. This satisfies goal 1 but fails at goal 3. Goal 4 fails because goal 2 is only partially satisfied: you still need to factor out your code appropriately. On top of that, I consider it all quite ugly!

The current proposal is to extend hv.opts so it can be more of a direct substitute for the magics. The aim is to improve our current recommendation (dictionaries + the .options method) with tab completion and compositionality so that you can just apply .options to a deeply nested object without having to refactor your code. In short, the proposal is based on these two ideas:

  1. You can use hv.opts to get an object that enhances the dictionary approach (i.e. for use in .options).
  2. This object is queryable and compositional (could be the return value from the opts parser) and can be used by the .options method.

In its simplest form, instead of:

curveopts = dict(color='green')
imageopts = dict(cmap='fire')
Image(...).options(**imageopts) * Curve(...).options(**curveopts) 

you could use:

curveopts = hv.opts(color='green')
imageopts = hv.opts(cmap='hsv')
Image(...).options(**imageopts) * Curve(...).options(**curveopts) 

(Note that this is only technically possible as the use of keywords is a distinct signature from what hv.opts currently supports.)

Why do this?

  1. A dictionary could be used for anything, declaring hv.opts signals intent.
  2. Tab-completion of all keywords across all elements.

This style still suffers from a few problems: 1) a large flat namespace for tab-completion, and 2) you still need to refactor your code into pieces to apply the option sets.

What if you want better tab-completion and also have a complicated nested object you want to apply options to, i.e. with specificity for element, group and label? What is the equivalent of the following?

%%opts Image.ORMap (cmap='hsv') Curve.Trajectory.Ball (color='green')
Image(...) * Curve(...)

In this proposal you could do something like:

curveopts = hv.opts.Curve('Trajectory', 'Ball', color='green')
imageopts = hv.opts.Image('ORMap', cmap='hsv')
(Image(...) * Curve(...)).options(curveopts | imageopts)

This style satisfies the first three goals, including tab-completion and avoiding having to apply .options piecewise. Note the use of | (inspired by the set union operator).
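
To make the `|` composition concrete, here is a minimal sketch of how such option objects could combine. Everything in it is an assumption made for illustration: the `OptionSet` class, its `for_element` constructor and the dotted spec strings are hypothetical and are not the HoloViews implementation.

```python
class OptionSet:
    """Hypothetical stand-in for the object hv.opts.Curve(...) might return."""

    def __init__(self, entries=None):
        # Maps a spec such as 'Curve.Trajectory.Ball' to its keyword options.
        self.entries = dict(entries or {})

    @classmethod
    def for_element(cls, element, *group_label, **kwargs):
        # Join element, optional group and optional label into one dotted spec.
        spec = ".".join((element,) + group_label)
        return cls({spec: kwargs})

    def __or__(self, other):
        if not isinstance(other, OptionSet):
            return NotImplemented
        # Copy so that neither operand is mutated by the union.
        merged = {spec: dict(kw) for spec, kw in self.entries.items()}
        for spec, kw in other.entries.items():
            merged.setdefault(spec, {}).update(kw)
        return OptionSet(merged)


curveopts = OptionSet.for_element("Curve", "Trajectory", "Ball", color="green")
imageopts = OptionSet.for_element("Image", "ORMap", cmap="hsv")
combined = curveopts | imageopts
print(combined.entries)
```

Because the union returns a new object, the individual handles stay reusable after they have been combined.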

Although this style looks good to me for writing new code, there is the issue of naming your handles (i.e. curveopts and imageopts above). Note this is a problem we already face with the currently recommended dictionary + .options style of doing things.

The reason these handles are a pain is twofold:

  1. You may not want to have temporary handles that you don't have semantically meaningful names for (or you can't be bothered to think of any).
  2. If we want to automatically convert the cell magics to this new style, an automated procedure won't be able to find good handle names that don't clash.

For this reason, I'm considering the idea that we could also support a method chaining approach that is equivalent to the | operator. This would reduce the need for n potential handles down to just one (called options below):

options = hv.opts.Image(cmap='hsv').Curve(color='green')
obj.options(options)

Here only the Image portion could tab complete correctly (but this wouldn't matter for an automated conversion from the magic syntax).
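
One way such chaining could be implemented is with dynamic attribute lookup. The sketch below assumes a hypothetical `OptsBuilder` class (it is not the HoloViews implementation) in which every element name is resolved through `__getattr__`.

```python
class OptsBuilder:
    """Hypothetical chainable builder, written only to illustrate the idea."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})

    def __getattr__(self, element):
        # Only reached for names not found normally, e.g. .Image or .Curve.
        if element.startswith("_"):
            raise AttributeError(element)

        def add(*group_label, **kwargs):
            spec = ".".join((element,) + group_label)
            new_entries = dict(self.entries)
            new_entries[spec] = kwargs
            # Return a new builder so that the chain can continue.
            return OptsBuilder(new_entries)

        return add


opts = OptsBuilder()
options = opts.Image(cmap="hsv").Curve(color="green")
print(options.entries)
```

In this sketch the element names are not real attributes, so a completer that relies on `dir()` would have nothing to offer unless `__dir__` were also defined.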

It is important to emphasise that the only reason to propose more ways of doing things is to slowly deprecate all the other ways (i.e. document what we think is best). I really want new users to see fewer ways of doing things.

As far as I can tell, this proposal allows for backwards compatibility, tab-completion and fewer constraints on the code structure for applying options. My main doubts are about naming the handles and whether the method chaining approach is worth supporting for this reason.

jlstevens commented 5 years ago

I'll also mention that this all works because we've standardized on the flat namespace for .options (ignoring the namespaces for plot, style, norm). This proposal could be extended to keep these namespaces e.g:

options = hv.opts.Image.style(cmap='hsv').hv.opts.Image.plot(invert_yaxis=True)

I do not propose we do this anytime soon as this extra level of nesting is currently not needed for as long as we use .options and it works correctly (and you would need to switch to using .opts anyhow). I suspect that the difference between this and earlier proposals is that we have decided to ignore one possible level of nesting by flattening between plot and style.

jbednar commented 5 years ago

Jean-Luc and I worked this proposal out on the whiteboard, and I think we're both happy that it satisfies all the goals we had. It seems to me that it offers a way that we could (semi) automatically transform our existing docs into pure Python (to avoid confusing new users) while giving even pure-Python users the ability to have tab completion. Hopefully @philippjfr agrees and we can finally address these issues!

jlstevens commented 5 years ago

One last thing is that I would personally like to move to a future where we don't need to keep qualifying opts with hv. In other words, I would like this style:

from holoviews import opts
curveopts = opts.Curve('Trajectory', 'Ball', color='green')
imageopts = opts.Image('ORMap', cmap='hsv')
(Image(...) * Curve(...)).options(curveopts | imageopts)

This is not something that would be possible when converting old notebooks (opts is a common variable name right now), so I am happy qualifying the namespace for the time being. There would be no difference in this system between holoviews and geoviews, so there would be no issue between hv and gv, for instance.

jbednar commented 5 years ago

I usually prefer to leave "hv." in my calls so that it's clear even from a tiny snippet where things like "opts" come from, but I think other people are optimizing for different goals and so have a different opinion there.

(Image(...) * Curve(...)).options(curveopts | imageopts)

Presumably this call is the same as the following, which I would think is clearer?

(Image(...) * Curve(...)).options(curveopts).options(imageopts)

Personally, here I'd rather just pass them all in to .options() in such a case:

(Image(...) * Curve(...)).options(curveopts,imageopts)

Seems more self-evident than | does. | seems helpful mainly at the top, though I'd argue for + so that one could do:

import holoviews as hv
opts  = hv.opts.Curve('Trajectory', 'Ball', color='green')
opts += hv.opts.Image('ORMap', cmap='hsv')
(Image(...) * Curve(...)).options(opts)

jlstevens commented 5 years ago

(Image(...) * Curve(...)).options(curveopts).options(imageopts)

I think this should work anyway, but I strongly dislike this style.

(Image(...) * Curve(...)).options(curveopts,imageopts)

I'm not hugely fond of this style but I see some advantages (e.g. it makes it easier to partially apply styles when you get round to calling .options). The main issue is that you still have to name various handles.

opts  = hv.opts.Curve('Trajectory', 'Ball', color='green')
opts += hv.opts.Image('ORMap', cmap='hsv')

Now this suggestion I like, as it only involves one handle. That said, |= also exists, even if it is less familiar to people! :-)
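
As a small illustration of the single-handle style: in Python, an object that defines only `__or__` already supports `|=`, because the augmented assignment falls back to `__or__` when `__ior__` is absent and rebinds the name (and likewise `+=` falls back to `__add__`). The `OptionSet` class below is a hypothetical toy, not the HoloViews implementation.

```python
class OptionSet:
    """Hypothetical toy option set keyed by dotted spec strings."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def __or__(self, other):
        if not isinstance(other, OptionSet):
            return NotImplemented
        # Build a new object; the operands are left untouched.
        return OptionSet({**self.entries, **other.entries})


opts = OptionSet({"Curve.Trajectory.Ball": {"color": "green"}})
first = opts  # keep a reference to the original object
opts |= OptionSet({"Image.ORMap": {"cmap": "hsv"}})  # falls back to __or__

print(sorted(opts.entries))
print(opts is first)
```

Since no `__ior__` is defined here, `opts |= ...` creates a new object rather than mutating the original one.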

philippjfr commented 5 years ago

The core of this proposal sounds reasonable to me, but personally I strongly prefer this spelling:

(Image(...) * Curve(...)).options(curveopts, imageopts)

I've become a lot more skeptical about operator overloading and think it should be used very sparingly. If I had to vote between + and | I'd also vote for + but at the same time I think it invites confusion with the layout composition operator.

jlstevens commented 5 years ago

(Image(...) * Curve(...)).options(curveopts, imageopts)

I would be ok with this if you didn't have to come up with temporary handle names for all the components.

jlstevens commented 5 years ago

Just brainstorming. How about:

opts  = [hv.opts.Curve('Trajectory', 'Ball', color='green'),
         hv.opts.Image('ORMap', cmap='hsv')]
(Image(...) * Curve(...)).options(opts)

The idea is that .options can take opts, which is either a single one of these option set objects or a collection of them. I would probably prefer to make it a set as the order doesn't matter, but people might then think it is a dict.

jbednar commented 5 years ago

If .options(curveopts, imageopts) is supported, that approach would work already if you spell it as .options(*opts). I'm not sure it's worth adding support for providing lists of option objects, given all the other formats supported...

jlstevens commented 5 years ago

I feel our default recommendation to people should avoid the use of * and ** and we don't have any list formats to confuse it with in .options.

More importantly, if the signature is based on *args then the base assumption is that you are passing in N positional arguments (i.e. handles, as we want the declaration at the top). I think it isn't a bad approach to support any collection of these objects (except dictionaries, which would take more handling to disambiguate).
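
A minimal sketch of how a method could accept positional handles and collections of them alike, while rejecting dictionaries. The `OptionSet` class and the `collect_option_sets` helper are hypothetical names used for illustration only; they are not part of HoloViews.

```python
from collections.abc import Iterable


class OptionSet:
    """Hypothetical option set for one element spec."""

    def __init__(self, spec, **kwargs):
        self.spec = spec
        self.kwargs = kwargs


def collect_option_sets(*args):
    """Flatten handles and collections of handles into one list."""
    collected = []
    for arg in args:
        if isinstance(arg, OptionSet):
            collected.append(arg)
        elif isinstance(arg, dict):
            # Dictionaries are excluded: they would need extra disambiguation.
            raise TypeError("dictionaries are not accepted here")
        elif isinstance(arg, Iterable) and not isinstance(arg, str):
            # Lists, tuples and sets of option sets are flattened recursively.
            collected.extend(collect_option_sets(*arg))
        else:
            raise TypeError(f"unsupported argument: {arg!r}")
    return collected


curveopts = OptionSet("Curve.Trajectory.Ball", color="green")
imageopts = OptionSet("Image.ORMap", cmap="hsv")

as_handles = collect_option_sets(curveopts, imageopts)  # .options(a, b) style
as_list = collect_option_sets([curveopts, imageopts])   # .options(opts) style
print([o.spec for o in as_handles], [o.spec for o in as_list])
```

With this kind of normalization, the handle style and the list style reach the same internal representation, so neither needs `*opts` at the call site.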

philippjfr commented 5 years ago

I have no strong opinions on supporting a list of these objects, although I feel like I would generally opt for giving them each a handle and passing them in as args.

jlstevens commented 5 years ago

.. although I feel like I would generally opt for giving them each a handle and passing them in as args.

These aren't exclusive options; I think we can still allow that style. The advantage of the list is for a more automated conversion of our old material, where automatically generating handles is problematic. This list approach helps avoid that problem, and I still don't think we should be using *opts much in our docs.

jbednar commented 5 years ago

Fine by me; the list approach is succinct and doesn't require line continuation characters or intermediate handles, and if it can be opts instead of *opts without being ambiguous, then great!

philippjfr commented 5 years ago

This has now been implemented in https://github.com/ioam/holoviews/pull/3173.