pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

area plots are causing my unit tests to fail #9161

Closed argriffing closed 8 years ago

argriffing commented 9 years ago

I'm using the latest matplotlib and pandas. If you run the following code, do the legends look right to you? The unit tests don't like them.

import string
import numpy as np
from pandas import DataFrame

df = DataFrame(
    np.random.rand(10, 3),
    index=list(string.ascii_letters[:10]))
df.plot(kind='area', subplots=True, sharex=True, legend=True)

rockg commented 9 years ago

What does the image look like? Here's what I get:

figure_1

argriffing commented 9 years ago

@rockg Thanks for posting an image! That is indeed what the unit tests expect, but the legends are screwed up for me in the following way. Each of the three subplots has two elements in the legend instead of one. They are weirdly duplicated, and I suspect my new development-branch installation of matplotlib is to blame.

If your legend in the first of your three sub-plots looks like this

+--------+
|        |
| ---  0 |
|        |
+--------+

then the one I see looks like this:

+--------+
|        |
| ---  0 |
|        |
| XXX  0 |
|        |
+--------+

Similarly for the other two subplots. I detected this problem by failing the pandas unit-tests, and I've extracted from the failing unit tests this minimal (or at least smaller) example.

tacaswell commented 9 years ago

My guess is https://github.com/matplotlib/matplotlib/pull/3303 is the cause.

argriffing commented 9 years ago

@tacaswell Just to be clear, you are guessing that https://github.com/matplotlib/matplotlib/pull/3303 caused the regression, not that it fixes the regression?

tacaswell commented 9 years ago

Correct. I suspect that pandas is working around the issue of poly collections not having a proper legend handler, and now that matplotlib has one, you are getting double legend entries.
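The suspected mechanism can be reproduced without pandas. The following is only a minimal sketch, assuming the area plot amounts to a labeled line plus a labeled `fill_between` on the same axes (an assumption about the plotting code, not something confirmed in this thread). On a matplotlib that has a legend handler for poly collections (1.5 or later), both artists are picked up and the label appears twice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [1, 3, 2, 4]

fig, ax = plt.subplots()
# Assumed shape of an area plot: a line for the outline ...
ax.plot(x, y, label="0")
# ... and a filled region (a PolyCollection) carrying the same label.
ax.fill_between(x, 0, y, alpha=0.5, label="0")

# Collect whatever the legend machinery would draw for these axes.
handles, labels = ax.get_legend_handles_labels()
print(labels)
print([type(h).__name__ for h in handles])
```

If the filled region were left unlabeled, or given a label starting with an underscore, only the line would show up in the legend.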

3303 is a major new feature :smile:

[edited to fix word-salad]

jreback commented 9 years ago

ok, so this is a reverse-bug fix, i.e. when matplotlib 1.5 is out, pandas needs to compensate?

tacaswell commented 9 years ago

Yes, but I think it should (eventually) result in simplification of your code.

Can someone point me to the relevant bit of code so I can try to provide said compensation?

jreback commented 9 years ago

@TomAugspurger cc @sinhrks

can you provide a link?

sinhrks commented 9 years ago

@tacaswell, @jreback Here it is. Maybe AreaPlot._add_legend_handle can be removed after 1.5? https://github.com/pydata/pandas/blob/master/pandas/tools/plotting.py#L1756
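Until older matplotlib releases no longer need to be supported, one option is to gate the workaround on the matplotlib version. This is a rough sketch only: `parse_version` and `needs_proxy_legend_handle` are hypothetical helper names, not part of pandas or matplotlib, and the 1.5 cutoff is taken from the discussion above.

```python
import re


def parse_version(version: str) -> tuple:
    """Return (major, minor) from a string such as '1.4.2' or '1.5.dev1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_proxy_legend_handle(mpl_version: str) -> bool:
    """Hypothetical check: True when matplotlib predates 1.5.

    Assumption from the thread: before 1.5 the filled area gets no legend
    entry of its own, so a proxy handle is required; from 1.5 on, adding
    one would duplicate the entry.
    """
    return parse_version(mpl_version) < (1, 5)


for v in ("1.4.2", "1.5.dev1", "1.5.0"):
    print(v, needs_proxy_legend_handle(v))
```

In real code the argument would be `matplotlib.__version__`. Development builds of 1.5 are treated as 1.5 here, which matches the case reported in this issue.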