pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

DOC: Plotting Backend Documentation Incorrect #58251

Open WillAyd opened 2 months ago

WillAyd commented 2 months ago

Pandas version checks

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html#plotting-backends

Documentation problem

The plotting backend documentation reads:

Some libraries implementing a backend for pandas are listed on the ecosystem page.

However, not every library on the ecosystem page implements the pandas plotting backend:

>>> df.plot(backend="seaborn")
ValueError: Could not find plotting backend 'seaborn'. Ensure that you've installed the package providing the 'seaborn' entrypoint, or that the package has a top-level `.plot` method.
>>> df.plot(backend="pygwalker")
ValueError: Could not find plotting backend 'pygwalker'. Ensure that you've installed the package providing the 'pygwalker' entrypoint, or that the package has a top-level `.plot` method.
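
For context, the error text hints at how the lookup works: pandas appears to look first for an entry point registered under the `pandas_plotting_backends` group, and otherwise imports a module with the given name and checks for a top-level `plot` attribute. Below is a rough, stdlib-only sketch of that check, not pandas' actual implementation. The helper name `backend_lookup` is hypothetical, and the lookup order is an assumption based on the error message and my reading of the pandas source.

```python
import importlib
from importlib.metadata import entry_points

# Entry-point group that pandas scans for third-party plotting backends
# (assumption: group name as used by current pandas releases).
GROUP = "pandas_plotting_backends"


def backend_lookup(name):
    """Return 'entrypoint', 'module', or None for a candidate backend name.

    Hypothetical helper: it approximates the lookup order described in the
    ValueError above and never imports pandas itself.
    """
    eps = entry_points()
    # Python 3.10+ exposes .select(); older versions return a plain dict.
    if hasattr(eps, "select"):
        group = eps.select(group=GROUP)
    else:
        group = eps.get(GROUP, [])
    if any(ep.name == name for ep in group):
        return "entrypoint"

    # Fallback: a module of that name exposing a top-level `plot`.
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return "module" if hasattr(module, "plot") else None


if __name__ == "__main__":
    for candidate in ("seaborn", "pygwalker", "plotly", "hvplot"):
        print(candidate, "->", backend_lookup(candidate))
```

A result of `None` corresponds to the case where `df.plot(backend=...)` raises the `ValueError` shown above. A non-`None` result only means the backend would be found, not that plotting with it actually works.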

Suggested fix for documentation

I think the easiest fix would be to coordinate with the ecosystem libraries to ensure they implement the pandas plotting backend. If there are technical limitations to doing so, maybe we should put the libraries that do not implement the backend into a different section? Going forward, we could also make it a requirement for libraries to implement the backend to land on the ecosystem page.

@datapythonista for thoughts

WillAyd commented 2 months ago

FWIW pandas_bokeh throws an entirely different error when you use that as a backend:

AttributeError: unexpected attribute 'plot_width' to figure, similar attributes are outer_width, width or min_width

rhshadrach commented 2 months ago

Going forward we could also make it a requirement for libraries to implement the backend to land on the ecosystem page

This sounds like a high bar for being on the ecosystem page compared to where I think the bar has been in past discussions.

What do you think of just putting (with some kind of color / bold highlighting) something like

This package implements the pandas backend and can be used with pandas plotting methods [see here]
This package does not implement the pandas backend and cannot be used with pandas plotting methods

WillAyd commented 2 months ago

Sure, that sounds good too. Open to anything that makes it clearer - right now it's pretty challenging to understand how the backend argument for plots works and which libraries should even support it.

I suppose we could also question whether we really need the backend argument for plots. I am under the impression that seaborn is one of the larger libraries, and people would just use that API directly.

Aloqeely commented 2 months ago

What do you think of just putting (with some kind of color / bold highlighting) something like

This package implements the pandas backend and can be used with pandas plotting methods [see here]
This package does not implement the pandas backend and cannot be used with pandas plotting methods

But this will increase the maintenance burden of the ecosystem page. Will we be checking at some regular interval whether each package implements the plotting backend? And will we be checking which code samples run correctly and which libraries even work? The impression I get is that the page is community maintained.

As Mr. Ayd said, we should probably remove the backend argument. We could also stop linking to the ecosystem page in the plotting-backends doc, but then how will people know what to pass to the backend argument? So the better idea is to remove it.

rhshadrach commented 2 months ago

As Mr. Ayd said, we should probably remove the backend argument.

I'm quite negative here. Being able to use the plotly backend is a good feature in my opinion, and one we should not remove.

I do not find the maintenance of specifying which packages support the backend concerning at all.

WillAyd commented 2 months ago

I didn't go through every backend, but from sampling I could only get hvplot and plotly to work as a backend argument. And as mentioned before, it seems like pandas_bokeh has that intention but is broken.

I do not find the maintenance of specifying which packages support the backend concerning at all.

It's not the most important thing to solve, but it's unfortunate we offer that without any integration testing. Maybe worth a separate initiative to add tests for at least plotly.
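
As a starting point for that kind of check, here is a minimal smoke-test sketch, not an existing pandas test. The helper name `smoke_test_backends` and the sample frame are made up for illustration. It simply calls `df.plot(backend=name)` for each candidate and records either "ok" or the type of exception raised. If pandas itself is not importable, every name is reported as "ImportError".

```python
def smoke_test_backends(names):
    """Map each backend name to 'ok' or the name of the exception raised.

    Hypothetical helper for manual checks; it does not validate the
    resulting figure, only that the call completes.
    """
    try:
        import pandas as pd
    except ImportError:
        # Without pandas nothing can be exercised; report that uniformly.
        return {name: "ImportError" for name in names}

    df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
    results = {}
    for name in names:
        try:
            df.plot(x="x", y="y", backend=name)
        except Exception as exc:  # deliberately broad: we only report
            results[name] = type(exc).__name__
        else:
            results[name] = "ok"
    return results


if __name__ == "__main__":
    print(smoke_test_backends(["plotly", "hvplot", "pandas_bokeh", "seaborn"]))
```

The same loop could presumably be turned into a parametrized test that skips when the optional package is not installed. What a real integration test should assert about the returned figure is left open here.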