has2k1 / plotnine

A Grammar of Graphics for Python
https://plotnine.org
MIT License

Category accessing bug in `add_margins` #663

Closed sandypreiss closed 1 year ago

sandypreiss commented 1 year ago

I get the following traceback when using facet_grid with margins=True:

  File "/tmp/real/environment/lib/python3.7/site-packages/plotnine/ggplot.py", line 205, in draw
    self._build()
  File "/tmp/real/environment/lib/python3.7/site-packages/plotnine/ggplot.py", line 285, in _build
    layout.setup(layers, self)
  File "/tmp/real/environment/lib/python3.7/site-packages/plotnine/facets/layout.py", line 58, in setup
    self.layout = self.facet.compute_layout(data)
  File "/tmp/real/environment/lib/python3.7/site-packages/plotnine/facets/facet_grid.py", line 94, in compute_layout
    base = add_margins(base, [self.rows, self.cols], self.margins)
  File "/tmp/real/environment/lib/python3.7/site-packages/plotnine/utils.py", line 237, in add_margins
    categories[v] = col.categories
  File "/tmp/real/environment/lib/python3.7/site-packages/pandas/core/generic.py", line 5465, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'categories'

The variables used in facet_grid are pandas categoricals. Outside of plotnine, df[var].cat.categories returns the categories as expected. A pandas Series does not expose categories directly; the .cat accessor (pandas.Series.cat) is needed to get them.

It seems to me that in the add_margins function, line 237 should be categories[v] = col.cat.categories rather than categories[v] = col.categories.
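
For reference, here is a minimal sketch of the behavior described above, using only pandas. The DataFrame `df` and the column name `g` are made up for illustration and are not taken from plotnine's code. It shows that a categorical Series has no `categories` attribute of its own, while the `.cat` accessor (and the column's dtype) does.

```python
import pandas as pd

# Hypothetical data: one categorical column with an unused category "c".
df = pd.DataFrame(
    {"g": pd.Categorical(["a", "b", "a"], categories=["a", "b", "c"])}
)
col = df["g"]

# A Series does not expose `categories` directly, so `col.categories`
# raises AttributeError (the error shown in the traceback).
has_direct_attr = hasattr(col, "categories")

# The categories are reachable through the `.cat` accessor ...
via_accessor = list(col.cat.categories)

# ... and equivalently through the categorical dtype.
via_dtype = list(col.dtype.categories)

print(has_direct_attr, via_accessor, via_dtype)
```

Either `col.cat.categories` or `col.dtype.categories` would work for a categorical column; the `.cat` form matches the change suggested above.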

has2k1 commented 1 year ago

@sandypreiss, thanks for reporting. I am adding type checks to the library, and I caught this case and fixed it. The fix may already be in the main branch; if not, it is in commits that I have not yet pushed.