Open KinzigFlyer opened 1 month ago
take
Thanks @KevsterAmp for looking into this
All good, just waiting for this issue to be triaged by a maintainer before working on it.
@KinzigFlyer - this seems to be a problem in matplotlib, not in pandas itself
I don't think so. Matplotlib does not provide stacking on its own. You create a stacked bar by passing the "bottom" parameter to the bars, so I think bottom is calculated inside pandas. See this page in the official matplotlib documentation: Stacked Bar charts
Good point, thank you
I took the official program, changed one of the Above values to 0, and added the bar label. It works correctly.
import matplotlib.pyplot as plt
import numpy as np
# data from https://allisonhorst.github.io/palmerpenguins/
species = (
    "Adelie\n $\\mu=$3700.66g",
    "Chinstrap\n $\\mu=$3733.09g",
    "Gentoo\n $\\mu=5076.02g$",
)
weight_counts = {
    "Below": np.array([70, 31, 58]),
    "Above": np.array([82, 0, 66]),
}
width = 0.5
fig, ax = plt.subplots()
bottom = np.zeros(3)
for boolean, weight_count in weight_counts.items():
    p = ax.bar(species, weight_count, width, label=boolean, bottom=bottom)
    bottom += weight_count
ax.bar_label(p, weight_counts['Above'])
ax.set_title("Number of penguins with above average body mass")
ax.legend(loc="upper right")
plt.show()
Converting the official example to a Pandas driven version shows the error:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# data from https://allisonhorst.github.io/palmerpenguins/
penguins = pd.DataFrame.from_dict([
    {'species': "Adelie", 'Below': 70.0, 'Above': 82.0},
    {'species': "Chinstrap", 'Below': 31.0, 'Above': 0.0},
    {'species': "Gentoo", 'Below': 58.0, 'Above': 66.0},
]).set_index('species')
width = 0.5
ax2 = penguins.plot.bar(stacked=True)
ax2.bar_label(ax2.containers[-1], penguins['Above'])
ax2.set_title("Number of penguins with above average body mass")
ax2.legend(loc="upper right")
plt.show()
Thanks for the report - PRs to fix are welcome!
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
If the top part of the stacked plot has a data value of 0, the bar label does not appear on top of the stack but at the bottom of the bar.
Further debugging shows that every bar with a data value of 0 has its y position set to 0.0. Each of these bars should instead use the top of the bar below it as its bottom (y).
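One way to check this, sketched under the assumption that the DataFrame matches the penguins data from this thread: read the y position of each rectangle in the top layer and compare it with the running total of the layers beneath it. The names `actual_y` and `expected_y` are illustrative only, and the value reported for the zero-height bar may differ between pandas versions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

penguins = pd.DataFrame(
    {"Below": [70.0, 31.0, 58.0], "Above": [82.0, 0.0, 66.0]},
    index=pd.Index(["Adelie", "Chinstrap", "Gentoo"], name="species"),
)

ax = penguins.plot.bar(stacked=True)

# Bottom edge (y) of every bar in the top ("Above") layer, as actually drawn.
actual_y = [rect.get_y() for rect in ax.containers[-1]]

# Where each "Above" bar should start: the sum of the layers stacked beneath it.
expected_y = (penguins.cumsum(axis=1) - penguins)["Above"].tolist()

for name, got, want in zip(penguins.index, actual_y, expected_y):
    print(f"{name}: actual y = {got}, expected y = {want}")
```

With the behavior described in this report, the Chinstrap row is the one where the two values disagree.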
Expected Behavior
Bar-Labels should be positioned on top for all stacks.
This behaviour can be produced by correcting the y positions of the defective bars.
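A minimal sketch of such a correction, applied on the user side after plotting. This is a workaround under the assumptions of this report, not a fix inside pandas; the `bottoms` table is a helper introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

penguins = pd.DataFrame(
    {"Below": [70.0, 31.0, 58.0], "Above": [82.0, 0.0, 66.0]},
    index=pd.Index(["Adelie", "Chinstrap", "Gentoo"], name="species"),
)

ax2 = penguins.plot.bar(stacked=True)

# Expected bottom edge of every bar: the sum of the columns stacked beneath it.
bottoms = penguins.cumsum(axis=1) - penguins

# Overwrite the y position of each rectangle, layer by layer, so that
# zero-height bars sit on top of the bar below instead of at 0.0.
for column, container in zip(penguins.columns, ax2.containers):
    for rect, bottom in zip(container, bottoms[column]):
        rect.set_y(bottom)

# Label the top layer; labels are anchored at the (corrected) bar edges.
labels = ax2.bar_label(
    ax2.containers[-1], labels=[f"{v:g}" for v in penguins["Above"]]
)
ax2.set_title("Number of penguins with above average body mass")
ax2.legend(loc="upper right")
```

Because `bar_label` derives the label position from each rectangle's bounding box, moving the rectangles before labelling is enough to place the Chinstrap label on top of its stack.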
Installed Versions
INSTALLED VERSIONS
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.11.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 2, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : de_DE.cp1252

pandas : 2.2.2
numpy : 2.0.1
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 65.5.0
pip : 24.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.26.0
pandas_datareader : None
adbc-driver-postgresql : None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.9.1
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.14.0
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None