has2k1 / plotnine

A Grammar of Graphics for Python
https://plotnine.org
MIT License

geom_boxplot width incorrect if number of subsets varies #772

Closed hisermann closed 2 months ago

hisermann commented 2 months ago

Hi, I have a bug where the width of boxplots is wrong if the number of boxplots differs between x values (see example and image).

Here is a minimal example:

import pandas as pd
import plotnine as p9

columns = {"x":["A", "B", "B", "A", "B", "B"], "y": [*range(6)], "fill": ["a","a","b","a","a","b"]}

df = pd.DataFrame(columns)

(
    p9.ggplot(df, p9.aes(x="x", y="y", fill="fill"))
    + p9.geom_boxplot()
).draw(1)

This generates three boxplots, one at x = "A" and two at x = "B". But the boxplot at A is narrow and the two boxplots at B are wide (see image), whereas I would expect the opposite: a wide box at A and two narrow boxes at B. It works as intended if I have two boxplots at A and one at B.

[image: boxplot_example]

I use plotnine 0.13.4 on Ubuntu 22.04 with Python 3.10.12.
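
To spell out the widths I would expect from the example above, here is a rough sketch in plain Python. It does not call plotnine and is not how plotnine computes widths internally; the helper `expected_box_widths` and the total width of 0.75 are assumptions made only for illustration. It splits the available width at each x value evenly among the fill groups present there.

```python
from collections import defaultdict

# Same x and fill values as in the minimal example above.
x = ["A", "B", "B", "A", "B", "B"]
fill = ["a", "a", "b", "a", "a", "b"]

# Assumption: an illustrative total width per x position, not read from plotnine.
TOTAL_WIDTH = 0.75


def expected_box_widths(x, fill, total_width=TOTAL_WIDTH):
    """Hypothetical helper: divide the total width at each x value
    evenly among the distinct fill groups that occur at that x."""
    groups = defaultdict(set)
    for xi, fi in zip(x, fill):
        groups[xi].add(fi)
    return {xi: total_width / len(fills) for xi, fills in groups.items()}


widths = expected_box_widths(x, fill)
print(widths)  # {'A': 0.75, 'B': 0.375}
```

Under this assumption the single box at A should be twice as wide as each of the two boxes at B, which is the reverse of what the plot shows.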