Open jbednar opened 2 years ago
The inconsistencies disappear if you define `hm = hv.HeatMap(data).sort(by='y')`.
Is calling `sort()` expected to yield the same result?
But if we don't sort `hm`, there are some inconsistencies with the matplotlib backend.
The graphic and values are correct, but the letters on the y-axis are incorrect. Besides, the `shared_axes` option is not taken into account when displaying `layout` in the following code (this may be tagged as another bug).
import numpy as np, holoviews as hv
import pandas as pd
hv.extension('bokeh', 'matplotlib')
data = [(i, chr(97+j), i*j) for i in range(5) for j in range(5) if i!=j]
hm = hv.HeatMap(data)
df = pd.DataFrame(data,columns = ['x','y','val']).set_index(['y','x']).sort_index()
hm2 = hv.HeatMap(df,['x','y'])
df.unstack().sort_index(ascending=False)
The data in the dataframe `df` (unstacked, as displayed by the last line above) is:
val (y \ x) | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
e | 0.0 | 4.0 | 8.0 | 12.0 | NaN |
d | 0.0 | 3.0 | 6.0 | NaN | 12.0 |
c | 0.0 | 2.0 | NaN | 6.0 | 8.0 |
b | 0.0 | NaN | 2.0 | 3.0 | 4.0 |
a | NaN | 0.0 | 0.0 | 0.0 | 0.0 |
The Bokeh graph with `a` at the top, which looked suspicious, is consistent with the fact that `hm['y']` is:
(array(['b', 'c', 'd', 'e', 'a', 'c', 'd', 'e', 'a', 'b', 'd', 'e', 'a',
'b', 'c', 'e', 'a', 'b', 'c', 'd'], dtype=object),
layout = (hm+hm2).opts(shared_axes = False)
hv.output(layout,backend = 'bokeh')
hv.output(layout,backend = 'matplotlib')
hv.output(hm2,backend = 'matplotlib')
By the time element.py calls `get_data(self, element, ranges, style)`, the error is already present: the call returns `data` and `yfactors` that are out of sync, with `yfactors` being `['b', 'c', 'd', 'e', 'a']`.
I'll try to understand and fix this (maybe my first commit :))
OK, the problem is that the data is accessed through `hm.gridded.data`, i.e.
{'x': array([0, 1, 2, 3, 4]),
'y': array(['a', 'b', 'c', 'd', 'e'], dtype=object),
'z': array([[nan, 0., 0., 0., 0.],
[ 0., nan, 2., 3., 4.],
[ 0., 2., nan, 6., 8.],
[ 0., 3., 6., nan, 12.],
[ 0., 4., 8., 12., nan]])}
and the y-axis is built from `hm.dimension_values('y', expanded=False)`, which is `array(['b', 'c', 'd', 'e', 'a'], dtype=object)` (notice that it is not in the same order as the `y` values of `hm.gridded.data`).
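The ordering mismatch can be reproduced without HoloViews. The sketch below is plain Python, not the library's code: it assumes the axis labels follow the order in which each `y` value first appears in the data (which matches the array quoted above), while the gridded rows are sorted, and it pairs the two up position by position. The names `first_appearance`, `axis_labels` and `grid_rows` are made up for illustration.

```python
# Same data as in the report: (x, y, val) with the diagonal removed.
data = [(i, chr(97 + j), i * j) for i in range(5) for j in range(5) if i != j]


def first_appearance(values):
    """Return the unique values in the order they first occur."""
    # dicts preserve insertion order, so this de-duplicates without sorting
    return list(dict.fromkeys(values))


y_column = [y for _, y, _ in data]

# Assumption: the axis labels follow first-appearance order ...
axis_labels = first_appearance(y_column)
# ... while the rows of the gridded data are in sorted order.
grid_rows = sorted(set(y_column))

# Pairing the two lists position by position shows which row gets which label.
mislabelled = {row: label for row, label in zip(grid_rows, axis_labels) if row != label}

print(axis_labels)
print(grid_rows)
print(mislabelled)
```

With this data every row ends up under the wrong letter, which is why the values look right while the y-axis letters do not.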
Here is a smaller and troubling example of the discrepancy.
data_ok = [(1,'b',0),(0,'c',1),(1,'a',2)]
data_nok = [(0,'b',0),(0,'c',1),(1,'a',2)]
h_ok = hv.HeatMap(data_ok)
h_nok = hv.HeatMap(data_nok)
h_ok+h_nok
Strangely, `h_ok.gridded.data['y']` is `['b','c','a']` and `h_nok.gridded.data['y']` is `['a','b','c']` (which results in shuffled data, since the correct indexing is not taken into account).
I see two options to fix this:
1) Making changes in `hm.gridded` so that it enforces that `hm.gridded.data['y']` and `hm.dimension_values('y', expanded=False)` are identical (this is the implicit hypothesis that was broken)
2) Making changes in `get_data` so that it extracts the data correctly (making no implicit hypothesis about `hm.gridded.data`)
Any opinion on the way to go?
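To make the second option more concrete, here is a minimal sketch in plain Python of what reordering the grid rows to follow the axis factors could look like. It is not HoloViews code: `align_rows` is a hypothetical helper, and the inputs are copied from the outputs shown earlier in this thread.

```python
nan = float("nan")

# Copied from hm.gridded.data as printed above (rows follow grid_y).
grid_y = ['a', 'b', 'c', 'd', 'e']
grid_z = [
    [nan, 0.0, 0.0, 0.0, 0.0],
    [0.0, nan, 2.0, 3.0, 4.0],
    [0.0, 2.0, nan, 6.0, 8.0],
    [0.0, 3.0, 6.0, nan, 12.0],
    [0.0, 4.0, 8.0, 12.0, nan],
]
# Copied from hm.dimension_values('y', expanded=False) as printed above.
yfactors = ['b', 'c', 'd', 'e', 'a']


def align_rows(labels, rows, factors):
    """Return `rows` reordered so that row k belongs to factors[k]."""
    if sorted(labels) != sorted(factors):
        raise ValueError("grid labels and axis factors do not match")
    # Look rows up by label instead of assuming both lists share one order.
    row_of = {label: index for index, label in enumerate(labels)}
    return [rows[row_of[label]] for label in factors]


aligned = align_rows(grid_y, grid_z, yfactors)
for label, row in zip(yfactors, aligned):
    print(label, row)
```

The first option would instead make the two orderings identical at the source, so that no lookup of this kind is needed downstream.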
HoloViews 1.14.8 (current release) shows different output depending on the backend:
The Bokeh version looks suspicious to me. Is it an off-by-one error in indexing the y dimension?