plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License

Support for Pandas Time spans as index col (PeriodIndex) #2764

Open mkschulze opened 3 years ago

mkschulze commented 3 years ago

Hello,

I'd like to use the pandas PeriodIndex date format to work with and calculate things in a time series way.

The manual says that Plotly auto-sets the axis type to a date format when the corresponding data are ISO-formatted date strings, a pandas date column, or a datetime NumPy array. (https://plotly.com/python/time-series/)

So I guess that, since a PeriodIndex represents time spans rather than dates, Plotly isn't able to auto-set this axis type? The error I get is: TypeError: Object of type Period is not JSON serializable

I know I can work around this by converting to another format (#df.index = df.index.astype(str)), but it would be great to have a native solution inside Plotly that understands PeriodIndex.

Here is my example code:

import pandas as pd
import plotly.express as px

### READ IN DATA ###
d = {
    'one':
    pd.Series([1., 2.],
              index=[
                  'Jan/2020', 'Feb/2020', 
              ])
}

df = pd.DataFrame(d)

print(f'\n# this is what I get from my datasource\n')
print(df.index)
print(df)

### CONVERT DATA ###
idx = df.index
df.index = pd.to_datetime(idx, format="%b/%Y").to_period(freq='M')

print(f'\n# after conversion\n')
print(df.index)

df.index.name = 'Monat'
print(df)

df = df.sort_index()

###THIS FIXES IT, BUT :/ ###
#df.index = df.index.astype(str)

fig = px.line(df, x=df.index, y=('one'), title='Testplot')
fig.show()

nicolaskruchten commented 3 years ago

This came up in the forum recently as well: https://community.plotly.com/t/plotly-doesnt-auto-set-the-axis-type-to-a-date-format/44632

Right now the underlying Javascript library only understands instants in time, not periods, so converting to strings or datetimes is the recommended approach.

nicolaskruchten commented 3 years ago

That said, we'll soon be able to display period data like this, using the upcoming xaxis.xperiod attribute, but the input format will still be full "instant in time" dates. We could add logic to convert Pandas periods to the first instant in the period, but I'm not sure if that would cause some strange behaviours downstream...

mkschulze commented 3 years ago

OK, well, I'm not sure it is much of an issue, actually. I could still do all my calculations in the application with pandas, using Pandas periods, and convert to str only for display, right? I'm still at novice level, so to say, hence my question. Or would you see a benefit in adding such logic?

mkschulze commented 3 years ago

I mean Pandas has extensive functionality for time series data, but there are other issues with Pandas. So, it might be a benefit to do such analysis tasks within Dash components, right?

Maybe this could help for inspiration: https://www.kite.com/blog/python/pandas-time-series-analysis/

gioxc88 commented 1 year ago

still open after 3 years?

FlorinAndrei commented 10 months ago

This is annoying. Matplotlib can handle this without any issues.

gioxc88 commented 1 month ago

hello please any update on this?

Coding-with-Adam commented 1 month ago

Hi @gioxc88, we are in the process of cleaning up old issues and seeing how to move forward with them. We hope to have an answer in a few weeks.