mwaskom / seaborn

Statistical data visualization in Python
https://seaborn.pydata.org
BSD 3-Clause "New" or "Revised" License

swarmplot fails when the numeric variable has an object dtype #873

Closed mwaskom closed 5 years ago

mwaskom commented 8 years ago
tips["tip"] = tips.tip.astype(np.object)
sns.swarmplot(x="day", y="tip", data=tips)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-185-711d8fde1561> in <module>()
      1 tips["tip"] = tips.tip.astype(np.object)
----> 2 sns.swarmplot(x="day", y="tip", data=tips)

/Users/mwaskom/anaconda/lib/python2.7/site-packages/seaborn/categorical.pyc in swarmplot(x, y, hue, data, order, hue_order, split, orient, color, palette, size, edgecolor, linewidth, ax, **kwargs)
   2691                        linewidth=linewidth))
   2692 
-> 2693     plotter.plot(ax, kwargs)
   2694     return ax
   2695 

/Users/mwaskom/anaconda/lib/python2.7/site-packages/seaborn/categorical.pyc in plot(self, ax, kws)
   1391     def plot(self, ax, kws):
   1392         """Make the full plot."""
-> 1393         self.draw_swarmplot(ax, kws)
   1394         self.add_legend_data(ax)
   1395         self.annotate_axes(ax)

/Users/mwaskom/anaconda/lib/python2.7/site-packages/seaborn/categorical.pyc in draw_swarmplot(self, ax, kws)
   1387         for center, swarm in zip(centers, swarms):
   1388             if swarm.get_offsets().size:
-> 1389                 self.swarm_points(ax, swarm, center, width, s, **kws)
   1390 
   1391     def plot(self, ax, kws):

/Users/mwaskom/anaconda/lib/python2.7/site-packages/seaborn/categorical.pyc in swarm_points(self, ax, points, center, width, s, **kws)
   1289         # We'll figure out the swarm positions in the latter
   1290         # and then convert back to data coordinates and replot
-> 1291         orig_xy = ax.transData.transform(points.get_offsets())
   1292 
   1293         # Order the variables so that x is the caegorical axis

/Users/mwaskom/anaconda/lib/python2.7/site-packages/matplotlib/transforms.pyc in transform(self, values)
   1308 
   1309         # Transform the values
-> 1310         res = self.transform_affine(self.transform_non_affine(values))
   1311 
   1312         # Convert the result back to the shape of the input values.

/Users/mwaskom/anaconda/lib/python2.7/site-packages/matplotlib/transforms.pyc in transform_affine(self, points)
   2345 
   2346     def transform_affine(self, points):
-> 2347         return self.get_affine().transform(points)
   2348     transform_affine.__doc__ = Transform.transform_affine.__doc__
   2349 

/Users/mwaskom/anaconda/lib/python2.7/site-packages/matplotlib/transforms.pyc in transform(self, values)
   1660 
   1661     def transform(self, values):
-> 1662         return self.transform_affine(values)
   1663     transform.__doc__ = Transform.transform.__doc__
   1664 

/Users/mwaskom/anaconda/lib/python2.7/site-packages/matplotlib/transforms.pyc in transform_affine(self, points)
   1744             tpoints = affine_transform(points.data, mtx)
   1745             return ma.MaskedArray(tpoints, mask=ma.getmask(points))
-> 1746         return affine_transform(points, mtx)
   1747 
   1748     def transform_point(self, point):

ValueError: object too deep for desired array

Not sure exactly what's going wrong, but an internal cast to a numeric dtype should avoid it...

dannykwells commented 7 years ago

Hi @mwaskom Any update on this? It's a blocking error. I'm not quite sure your diagnosis is correct, because the "planets" example still works, and the categorical there is an object too. Could it be a data size thing?

dannykwells commented 7 years ago

Actually resolved this temporarily (by not naming pandas columns??) but I don't know what's going on.

mwaskom commented 7 years ago

Just change the dtype of the column...
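
For reference, a minimal sketch of that workaround, assuming the column really holds only numbers that happen to be stored as Python objects. The small DataFrame is made up for illustration and is not the actual tips dataset:

```python
import pandas as pd

# Hypothetical stand-in for the tips data (not the real dataset).
tips = pd.DataFrame({
    "day": ["Thur", "Fri", "Sat", "Sun"],
    "tip": [1.01, 1.66, 3.50, 3.31],
})

# Reproduce the problematic state: numeric values stored with object dtype.
# The builtin `object` is used here because the `np.object` alias
# was removed in NumPy 1.24.
tips["tip"] = tips["tip"].astype(object)
print(tips["tip"].dtype)

# The workaround: cast the column back to a numeric dtype before plotting.
tips["tip"] = tips["tip"].astype(float)
print(tips["tip"].dtype)
```

After the cast, the column can be passed to the plotting call as usual.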

mwaskom commented 7 years ago

(Title was incorrect, the issue is with the dtype of the numeric/quantitative variable, which is clear from the example).

Jack-Lin-DS-AI commented 7 years ago

I was constructing a DataFrame and the column holding the y values ended up with object dtype, which gave me the same error. After I converted the column with pd.to_numeric, the error disappeared.
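
A rough sketch of how that conversion might look when the column was built from mixed input. The DataFrame, the column names, and the choice of `errors="coerce"` are assumptions for illustration; they are not taken from the original report:

```python
import pandas as pd

# Hypothetical DataFrame whose y column mixes strings and numbers,
# so pandas stores it with object dtype.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": ["1.5", 2, "n/a", 4.0],
})
print(df["value"].dtype)

# Convert to a numeric dtype. errors="coerce" turns entries that cannot
# be parsed (here "n/a") into NaN instead of raising an exception.
df["value"] = pd.to_numeric(df["value"], errors="coerce")
print(df["value"].dtype)
print(df["value"].isna().sum())
```

Whether to coerce unparseable entries to NaN or let `pd.to_numeric` raise depends on the data; raising (the default) is the safer choice when every value is expected to be numeric.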

MaozGelbart commented 5 years ago

Trying to reproduce with the following code:

import seaborn as sns
import numpy as np

tips = sns.load_dataset("tips")

tips["tip"] = tips.tip.astype(np.object)
sns.swarmplot(x="day", y="tip", data=tips)

And it works on my machine using seaborn 0.9.0. Perhaps not an issue anymore? Other packages used: numpy 1.16.2, pandas 0.23.0 and matplotlib 3.0.3 (all installed through conda with python 3.6.8 on linux).

mwaskom commented 5 years ago

Can confirm that the initial example works fine for me now, even without the latest versions of the libraries (matplotlib 2.2 and numpy 1.15), so will close.