pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License
43.6k stars 17.9k forks

BUG: pandas.DataFrame.plot crashes if the subplots argument receives a tuple. #59438

Closed lgi1sgm closed 2 months ago

lgi1sgm commented 2 months ago

Pandas version checks

Reproducible Example

import numpy as np
import pandas as pd

df = pd.DataFrame(
  np.random.rand(10, 3),
  columns=['A', 'B', 'C']
)

# Works:
df.plot(
  subplots=[
    ['A', 'B'],
    ['C']
  ]
)

# Crashes:
df.plot(
  subplots=[
    ('A', 'B'),
    ('C')
  ]
)

Issue Description

df.plot() crashes if the subplots argument is a list of tuples, although the documentation states:

sequence of iterables of column labels: Create a subplot for each group of columns. For example [('a', 'c'), ('b', 'd')] will create (...)

Expected Behavior

df.plot() should not crash.

Installed Versions

INSTALLED VERSIONS
------------------
commit : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python : 3.11.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : DE
LOCALE : German_Switzerland.1252

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 69.5.1
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.2.0
lxml.etree : 5.2.2
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.4
IPython : 8.25.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.3.7
dataframe-api-compat : None
fastparquet : 2024.2.0
fsspec : 2024.3.1
gcsfs : None
matplotlib : 3.8.4
numba : 0.60.0
numexpr : 2.8.7
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 14.0.2
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.13.1
sqlalchemy : None
tables : None
tabulate : None
xarray : 2024.7.0
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : 2.4.1
pyqt5 : None

fbourgey commented 2 months ago

('C') is a str, not a tuple.

The correct syntax would be:

df.plot(
  subplots=[
    ('A', 'B'),
    ('C',)
  ]
)

It works on my side.
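
For anyone who wants to check this without plotting, here is a minimal sketch in plain Python (no pandas needed). The variable names are only for illustration; it shows that the parentheses alone do not create a tuple, the trailing comma does:

```python
# Parentheses alone only group an expression; the value is still a str.
single_paren = ('C')
# The trailing comma is what creates a one-element tuple.
single_tuple = ('C',)

print(type(single_paren).__name__)  # str
print(type(single_tuple).__name__)  # tuple

# Iterating over each shows what a consumer of the "group" would see.
print(list(single_paren))  # ['C']
print(list(single_tuple))  # ['C']
```

Both iterate to the same single label here, so the difference is only visible to code that checks the type of each group.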

mroeschke commented 2 months ago

Yes, ('C') is still just a string. Closing, as this is the expected behavior.

lgi1sgm commented 2 months ago

Ah, that's very interesting, thank you! I found the corresponding part in the Python documentation:

A special problem is the construction of tuples containing 0 or 1 items: the syntax has some extra quirks to accommodate these. Empty tuples are constructed by an empty pair of parentheses; a tuple with one item is constructed by following a value with a comma (it is not sufficient to enclose a single value in parentheses). Ugly, but effective. For example:

>>> empty = ()
>>> singleton = 'hello',    # <-- note trailing comma
>>> len(empty)
0
>>> len(singleton)
1
>>> singleton
('hello',)
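
In case someone else runs into the same thing: one way to guard against the missing comma before calling df.plot() is to normalize the groups first. This is only a sketch; normalize_groups is a hypothetical helper of my own, not part of pandas, and it assumes that a bare string is always meant as a single column label:

```python
def normalize_groups(groups):
    """Hypothetical helper (not a pandas API).

    Wraps bare string labels in one-element tuples and converts every
    other group to a tuple, so each entry is a proper group of labels.
    """
    return [(g,) if isinstance(g, str) else tuple(g) for g in groups]


# ('C') is just the string 'C', so it gets wrapped into ('C',).
groups = normalize_groups([('A', 'B'), ('C')])
print(groups)  # [('A', 'B'), ('C',)]
```

The result could then be passed as the subplots argument, for example df.plot(subplots=groups).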