pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

QST: DataFrame.plot() vs Series.plot() have inconsistency wrt reuse plot #59716

Open pollo-coder opened 2 months ago

pollo-coder commented 2 months ago

Research

Link to question on StackOverflow

https://stackoverflow.com/questions/72375976/odd-behavior-of-plotting-in-pandas

Question about pandas

I asked this question on SO some time ago but got no satisfactory answer.

In essence, the question (slightly improved compared to what I asked on SO) is the following. After importing the necessary packages:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

why does this code reuse the existing figure (so I end up with just one figure)

ser = pd.Series(np.random.randn(100))
fig = plt.figure()
ser.plot()

while this one does not (so I end up with two figures)

df = pd.DataFrame(np.random.randn(100, 3))
fig = plt.figure()
df.plot()

?
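To make the difference easy to check without looking at windows, here is a minimal sketch that counts open figures with plt.get_fignums(). The helper name count_figures_after_plot is my own (hypothetical), and the Agg backend is selected only so the script runs without a display; the counts it prints depend on the installed pandas version.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def count_figures_after_plot(obj):
    """Open one empty figure, call obj.plot(), and return the number of open figures.

    Hypothetical helper for illustration only; not part of pandas.
    """
    plt.close("all")          # start from a clean state
    plt.figure()              # the figure we would like .plot() to reuse
    obj.plot()
    n = len(plt.get_fignums())
    plt.close("all")
    return n


ser = pd.Series(np.random.randn(100))
df = pd.DataFrame(np.random.randn(100, 3))

n_series = count_figures_after_plot(ser)
n_frame = count_figures_after_plot(df)
print("figures after Series.plot():   ", n_series)
print("figures after DataFrame.plot():", n_frame)
```

On my side the Series case leaves one figure and the DataFrame case leaves two, which is the inconsistency I am asking about.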

I then debugged a little and noticed that this is caused by the following two lines

if isinstance(data, ABCSeries):
    kwargs["reuse_plot"] = True

in PlotAccessor.__call__ in pandas/pandas/plotting/_core.py, around line 990.

Why do we need those two lines? What is the logic behind them? Could we just remove them, or always set kwargs["reuse_plot"] to True? I do not see the reason for the inconsistency.
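In the meantime, a possible workaround (a sketch, not a fix for the inconsistency itself) is to pass the target axes explicitly through the ax argument, which both Series.plot() and DataFrame.plot() accept. The variable names n_figures and same_figure are only there for checking the result, and the Agg backend is again an assumption so the snippet runs headless.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

plt.close("all")  # start from a clean state

df = pd.DataFrame(np.random.randn(100, 3))
fig = plt.figure()

# fig.gca() returns the figure's current axes, creating them if none exist;
# passing them via ax= tells pandas to draw there instead of opening a new figure.
ax = df.plot(ax=fig.gca())

n_figures = len(plt.get_fignums())
same_figure = ax.figure is fig
print("open figures:", n_figures)
print("plotted on the figure created above:", same_figure)
```

With the axes passed explicitly, the outcome no longer depends on whether the data is a Series or a DataFrame.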

rhshadrach commented 2 months ago

Thanks for the report. Looks like this was introduced in #27009.

@datapythonista / @jorisvandenbossche - do you know why there is different behavior here for Series vs DataFrame? Ref: https://github.com/pandas-dev/pandas/pull/27009#discussion_r299048026

pollo-coder commented 1 week ago

Hi @datapythonista @jorisvandenbossche, what's your view? I'd be very interested!