pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License
43.55k stars 17.89k forks

ENH: Is there a way to plot long-format data in matplotlib without pivoting the table? #59953

Open infinity-void6 opened 2 weeks ago

infinity-void6 commented 2 weeks ago

Feature Type

Problem Description

Problem

I am currently reading the book Hands-On Data Analysis with Pandas by Stefanie Molin. The book works with data in two formats, wide and long. The author uses pandas and matplotlib to plot the wide-format data but uses the seaborn package for the long-format data. I searched the web, and this seems to be the custom. I also asked GPT, and it turns out I can plot the long-format data without seaborn too, but it seems that I have to pivot the dataset first. Is there a way around that?

Wide Data Frame Sample

date TMAX TMIN TOBS
2018-10-28 8.3 5.0 7.2
2018-10-04 22.8 11.7 11.7
2018-10-20 15.0 -0.6 10.6
2018-10-24 16.7 4.4 6.7
2018-10-23 15.6 -1.1 10.0

Long Data Frame Sample

date datatype value
2018-10-01 TMAX 21.1
2018-10-01 TMIN 8.9
2018-10-01 TOBS 13.9
2018-10-02 TMAX 23.9
2018-10-02 TMIN 13.9
2018-10-02 TOBS 17.2

Long Data Frame after pivoting

image

plot command for wide df:

ax = wide_df.plot(x='date', y=['TMAX', 'TMIN', 'TOBS'], figsize=(15, 5), title='Temperature in NYC in October 2018')

plot command for long df after pivot:

ax = long_df.pivot(index='date', columns='datatype', values='value')

and then apply a plot command similar to the one above.

plot command for long_df with seaborn:

ax = sns.lineplot(data=long_df, x='date', y='value', hue='datatype')

Why isn't there a hue parameter or something similar in pandas for long-format data? My question can also be framed this way: "Why is pandas not enough for plotting? Why do I need external packages like matplotlib and seaborn to plot a pandas data structure?"

Forgive my ignorance, but I really want to know why the features available in seaborn can't also be available in pandas.

Feature Description

Let's start with a hue feature in pandas for long-format data.

Alternative Solutions

If we want to plot using only pandas, without seaborn, we might have to pivot the table first.
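For illustration, here is a minimal sketch of two pandas-only routes, assuming a `long_df` shaped like the long sample above. The first pivots and then plots; the second skips the pivot by looping over `groupby("datatype")` and drawing each group on a shared matplotlib axes. The `Agg` backend and the variable names are choices made for this sketch, not something from the book.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the long-format sample shown earlier in this issue.
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2018-10-01"] * 3 + ["2018-10-02"] * 3),
    "datatype": ["TMAX", "TMIN", "TOBS"] * 2,
    "value": [21.1, 8.9, 13.9, 23.9, 13.9, 17.2],
})

# Route 1: pivot to wide format, then plot every column at once.
wide = long_df.pivot(index="date", columns="datatype", values="value")
ax_pivot = wide.plot(figsize=(15, 5), title="Temperature in NYC in October 2018")

# Route 2: no pivot. Draw one line per datatype on a shared axes,
# which roughly mimics what seaborn's hue= does for a line plot.
fig, ax = plt.subplots(figsize=(15, 5))
for name, group in long_df.groupby("datatype"):
    group.plot(x="date", y="value", ax=ax, label=name)
ax.set_title("Temperature in NYC in October 2018")
```

The loop is more verbose than `hue='datatype'`, but it keeps the data in long format and needs nothing beyond pandas and matplotlib.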

Additional Context

No response

rhshadrach commented 2 weeks ago

"Why is pandas not enough for plotting? Why do I need external packages like matplotlib and seaborn to plot a pandas data structure?"

pandas plotting uses matplotlib by default. You can change the backend to use different packages.

https://pandas.pydata.org/docs/dev/user_guide/visualization.html#plotting-backends
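As a rough sketch of what that looks like in practice: the backend is controlled by the `plotting.backend` option, which defaults to matplotlib. Switching to another backend only works if that third-party package is installed, so the plotly line below is an assumption and is left commented out.

```python
import pandas as pd

# Out of the box, DataFrame.plot and Series.plot draw with matplotlib.
default_backend = pd.get_option("plotting.backend")
print(default_backend)

# Switching globally requires the backend package to be installed.
# plotly is used here only as an example and is assumed, not required:
# pd.set_option("plotting.backend", "plotly")
```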

Does this answer your question?

mfebrizio commented 1 week ago

My two cents as a pandas user for ~5 years: