holoviz / hvplot

A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
https://hvplot.holoviz.org
BSD 3-Clause "New" or "Revised" License

Simple method for column transformations before plotting. #497

Open conodeen opened 4 years ago

conodeen commented 4 years ago

Hello,

I often need to plot slightly different versions of my data depending on who’s requesting data from me. I was hoping for a simple way of transforming columns such that I don’t have to keep altering my dataframes for every plot.

Here's an example use case:

Original Plot: probability as a decimal

demo_df = pd.DataFrame({'value':np.random.randn(50),'probability':np.random.rand(50)})
demo_df.hvplot.scatter(x='value',y='probability')

image


Desired: plotting the probability as a percentage instead. Something like this might be nice:

percent = hv.dim('probability')*100
demo_df.hvplot.scatter(x='value',y=percent,ylabel='probability_as_percentage')

image

jbednar commented 4 years ago

We're working on a general solution (see https://github.com/pydata/xarray/issues/3709), but that's not ready yet. @philippjfr may have something to suggest in the meantime, usually some wizardry with .apply()...

philippjfr commented 4 years ago

I guess I hadn't thought of using apply here, but I think hvPlot could actually explicitly support this by evaluating the dim transforms automatically. Here's what the apply approach looks like:

demo_df = pd.DataFrame({'value':np.random.randn(50),'probability':np.random.rand(50)})
percent = hv.dim('probability')*100
demo_df.hvplot.scatter(x='value',y='probability').apply.transform(probability=percent)

conodeen commented 4 years ago

Yes, that works. Wonderful! Thank you.

philippjfr commented 4 years ago

The feature request is still valid.

conodeen commented 4 years ago

@philippjfr Is it much different to apply dim transforms that involve multiple columns in hvPlot?

The following is a use case which throws a DataError... I'm trying to take a ratio of two columns p1/p2 (on y-axis) before plotting against the expected value (on the x-axis).

demo_df = pd.DataFrame({'value':np.random.randn(50),'p1':np.random.rand(50),'p2':np.random.rand(50)})
ratio = hv.dim('p1')/hv.dim('p2')
demo_df.hvplot.scatter(x='value',y='ratio').apply.transform(ratio=ratio)

I know the following works correctly in returning an array with the ratio correctly calculated...

ratio.apply(hv.Dataset(demo_df))

Just not sure how to feed it back to hvplot.

lhoupert commented 3 years ago

Hi @philippjfr and @conodeen, I am trying to do the same for a contourf plot. I want to reverse the y axis of this plot:

datatoplot['Data'].hvplot.contourf(z='V', x='refdist', y='depth')

Screenshot 2020-12-15 at 09 32 39

I followed your approach:

reversedepth =  hv.dim('depth')*(-1)
datatoplot['Data'].hvplot.contourf(z='V', x='refdist', y='depth').apply.transform(depth=reversedepth)

But I got this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-38-aee0d029292f> in <module>
----> 1 datatoplot['Data'].hvplot.contourf(z='V', x='refdist', y='depth').apply.transform(depth=reversedepth)

~/opt/anaconda3/envs/analysis_eel_data/lib/python3.9/site-packages/holoviews/core/accessors.py in transform(self, *args, **kwargs)
    288         kwargs['_method_args'] = args
    289         kwargs['per_element'] = True
--> 290         return self.__call__('transform', **kwargs)
    291 
    292 

~/opt/anaconda3/envs/analysis_eel_data/lib/python3.9/site-packages/holoviews/core/accessors.py in pipelined_call(*args, **kwargs)
     43 
     44             try:
---> 45                 result = __call__(*args, **kwargs)
     46 
     47                 if not in_method:

~/opt/anaconda3/envs/analysis_eel_data/lib/python3.9/site-packages/holoviews/core/accessors.py in __call__(self, apply_function, streams, link_inputs, link_dataset, dynamic, per_element, **kwargs)
    202             if hasattr(apply_function, 'dynamic'):
    203                 inner_kwargs['dynamic'] = False
--> 204             new_obj = apply_function(self._obj, **inner_kwargs)
    205             if (link_dataset and isinstance(self._obj, Dataset) and
    206                 isinstance(new_obj, Dataset) and new_obj._dataset is None):

~/opt/anaconda3/envs/analysis_eel_data/lib/python3.9/site-packages/holoviews/core/accessors.py in apply_function(object, **kwargs)
    169                                          'method exists on the object.' %
    170                                          method_name)
--> 171                 return method(*args, **kwargs)
    172 
    173         if 'panel' in sys.modules:

~/opt/anaconda3/envs/analysis_eel_data/lib/python3.9/site-packages/holoviews/core/data/__init__.py in pipelined_fn(*args, **kwargs)
    199 
    200             try:
--> 201                 result = method_fn(*args, **kwargs)
    202                 if PipelineMeta.disable:
    203                     return result

~/opt/anaconda3/envs/analysis_eel_data/lib/python3.9/site-packages/holoviews/core/data/__init__.py in transform(self, *args, **kwargs)
   1059         else:
   1060             new_data = OrderedDict([(dimension_name(d), values) for d, values in new_data.items()])
-> 1061             data = ds.interface.assign(ds, new_data)
   1062             data, drop = data if isinstance(data, tuple) else (data, [])
   1063             kdims = [kd for kd in self.kdims if kd.name not in drop]

AttributeError: type object 'MultiInterface' has no attribute 'assign'

lhoupert commented 3 years ago

I added a minimal reproducible example below:

import numpy as np
import xarray as xr
import holoviews as hv
import hvplot.xarray  # registers the .hvplot accessor on xarray objects

data = np.random.rand(3, 4)
refdist = np.array(range(100,400,100))
depth = np.array(range(5,25,5))

ds = xr.Dataset(
    {   "V": (["refdist", "depth"], data),
    },
    coords={
        "refdist": (refdist),
        "depth": (depth),
    },
)

ds['V'].hvplot.contourf(z='V', x='refdist', y='depth')
# reverse the y axis (this call raises the AttributeError shown above)
reversedepth = hv.dim('depth')*(-1)
ds['V'].hvplot.contourf(z='V', x='refdist', y='depth').apply.transform(depth=reversedepth)
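
As a possible workaround that avoids `.apply.transform` altogether, the coordinate can be negated in xarray before plotting. This is only a sketch using xarray's `assign_coords`; the name `flipped` is illustrative, and the hvPlot call is left as a comment so the snippet runs without hvPlot installed.

```python
import numpy as np
import xarray as xr

data = np.random.rand(3, 4)
ds = xr.Dataset(
    {"V": (["refdist", "depth"], data)},
    coords={"refdist": np.arange(100, 400, 100),
            "depth": np.arange(5, 25, 5)},
)

# assign_coords returns a new object; ds itself is not modified.
flipped = ds.assign_coords(depth=-ds["depth"])

# With `import hvplot.xarray` loaded, the plot call is unchanged:
# flipped["V"].hvplot.contourf(z="V", x="refdist", y="depth")
```

The negation happens on the gridded data before any contours are computed, so it does not depend on what the contour element's data interface supports.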

lhoupert commented 3 years ago

I just found that the new implementation of 'transforms' by @philippjfr fixes the problem (https://github.com/holoviz/hvplot/pull/526):

In my case:

reversedepth =  hv.dim('depth')*(-1)
ds['V'].hvplot.contourf(z='V', x='refdist', y='depth', transforms=dict(depth=reversedepth))

When reading the documentation it was not really clear to me how to use the transforms argument.

Do you think it is worth adding a subsection about this to the user guide (https://hvplot.holoviz.org/user_guide/index.html) and the HoloViz tutorial (https://holoviz.org/tutorial/Basic_Plotting.html)?

philippjfr commented 3 years ago

Yes, I'd love to see a new section in the hvPlot user guide about this.

lhoupert commented 3 years ago

Great, I will think of a simple example.