plotly / plotly.py

The interactive graphing library for Python :sparkles: This project now includes Plotly Express!
https://plotly.com/python/
MIT License
16.08k stars, 2.54k forks

Plotting DataFrame with timedelta64 (y-axis) #801

Open scls19fr opened 7 years ago

scls19fr commented 7 years ago

Similar to #799, but on the y-axis. See also https://github.com/pandas-dev/pandas/issues/16953

import pandas as pd
from io import StringIO  # pandas.compat.StringIO is no longer available in current pandas
import plotly
import plotly.graph_objs as go

dat = """c1,c2,c3
1000,2000,1500
9000,8000,1600"""

df = pd.read_csv(StringIO(dat))

df = df.apply(lambda x: pd.to_timedelta(x, unit='ms'))

print(df)
print(df.dtypes)
print(df.index)

trace1 = go.Bar(
    x=df.index,
    y=df.c1,
    name='c1'
)
trace2 = go.Bar(
    x=df.index,
    y=df.c2,
    name='c2'
)

trace3 = go.Bar(
    x=df.index,
    y=df.c3,
    name='c3'
)

data = [trace1, trace2, trace3]
layout = go.Layout(
    barmode='group'
)

plotly.offline.plot({
"data": data,
"layout": layout
})

This displays:

[Screenshot: 2017-07-22 12:35:57]

The y-axis values are not displayed correctly.

scls19fr commented 7 years ago

A workaround is to do:

for col in df.columns:
    df[col] = df[col] + pd.to_datetime('1970/01/01')

[Screenshot: 2017-07-22 12:42:25]

but it would be nice if plotly.py could handle timedelta64 on the y-axis.
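
Another possible workaround, not proposed in this thread and offered only as a sketch: plot the durations as plain numbers (seconds here) and supply the tick labels yourself through the `tickvals` and `ticktext` axis attributes. The helper name `timedelta_ticks` and the 2-second tick step are assumptions made for this illustration. The figure is built as a plain dict, in the same shape as the dict passed to `plotly.offline.plot` above, so the sketch itself needs only the standard library.

```python
from datetime import timedelta

# Same values as the CSV above, in milliseconds.
columns = {"c1": [1000, 9000], "c2": [2000, 8000], "c3": [1500, 1600]}


def timedelta_ticks(max_seconds, step_seconds):
    """Return (tickvals, ticktext) covering 0..max_seconds (hypothetical helper)."""
    tickvals = list(range(0, int(max_seconds) + step_seconds, step_seconds))
    # str(timedelta) renders as H:MM:SS, e.g. "0:00:02".
    ticktext = [str(timedelta(seconds=v)) for v in tickvals]
    return tickvals, ticktext


# Plot durations as plain seconds so the y-axis stays numeric.
seconds = {name: [ms / 1000 for ms in values] for name, values in columns.items()}
longest = max(max(values) for values in seconds.values())
tickvals, ticktext = timedelta_ticks(longest, step_seconds=2)

figure = {
    "data": [
        {"type": "bar", "x": [0, 1], "y": values, "name": name}
        for name, values in seconds.items()
    ],
    "layout": {
        "barmode": "group",
        # Fixed ticks: numeric positions with duration-style labels.
        "yaxis": {"tickmode": "array", "tickvals": tickvals, "ticktext": ticktext},
    },
}
```

Passing `figure` to `plotly.offline.plot(figure)` should then draw the grouped bars with duration labels on the y-axis. The trade-off is that the ticks are fixed in advance, so they will not adapt when zooming.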

jaladh-singhal commented 4 years ago

Is there any progress on this? I really need to use timedelta64 on the y-axis.

nhoover commented 3 years ago

That's really not much of a workaround. Showing Jan 1, 1970 at the bottom... Timedelta is really a standard feature used all the time and plots with the y-axis being timedelta are very common.

harahu commented 3 years ago

For reference, there are two other issues closely related to this one, covering the x axis (#799), and color axis (#3368). It would probably make sense to tackle all three issues holistically, rather than separately.

ThomasGl commented 2 years ago

Hey, I'm a bit late to the party, but I wrote a solution for this issue. It keeps Dash and its interactive features working with any datetime format, and it still lets the axis auto-render when there are many x sample points, so the axis does not get overcrowded.

It goes as follows:

For the x-axis, make absolutely sure that your values are a list[str] and that the strings are consistently formatted. Also, match the format of those strings in the axis tickformat; the format specifiers are the standard ones used for datetime objects. For reference, see: https://plotly.com/python/reference/layout/xaxis/#layout-xaxis-tickformat

e.g.:

mock_list = ["00:00:00", "00:00:01"]

If mock_list holds the x values of a scatter plot, for instance, adjust the axis as follows:

fig.update_xaxes( tickformat="%H:%M:%S")
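
A list like `mock_list` could be produced from duration values along the following lines. This is only a sketch: `format_hms` is a hypothetical helper name, and it assumes non-negative durations.

```python
from datetime import timedelta


def format_hms(delta):
    """Format a non-negative timedelta as zero-padded HH:MM:SS (hypothetical helper)."""
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


deltas = [
    timedelta(seconds=0),
    timedelta(seconds=1),
    timedelta(minutes=61, seconds=5),
]
# Consistently formatted strings, as the approach above requires.
x_values = [format_hms(d) for d in deltas]
print(x_values)
```

Note that a duration of 24 hours or more yields an hour field above 23 (e.g. "25:00:00"), which no longer corresponds to an hour of the day.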

dizcza commented 1 year ago

Thanks @ThomasGl. However, this works only for the last subplot in the figure.

The X axis is displayed in "%H:%M:%S". The 3rd (bottom) subplot hover X data is in "%H:%M:%S". But 1st and 2nd subplots hover X data is still in Jan 1, 1970, ...%H:%M:%S. How to make them also %H:%M:%S?

Tried

    for xaxis in range(1, 4):
        fig['layout'][f'xaxis{xaxis}']['tickformat'] = "%H:%M:%S.%f"

with no help.

ThomasGl commented 1 year ago

Hi. I'll take a look at it over the weekend, but can you share a bit more information about the issue you are having?

ThomasGl commented 1 year ago

@dizcza also note that you must adjust the axis for each subplot, as the engine that generates the graphs renders each one as a new "fig" object with default params.

dizcza commented 1 year ago

Here is the code I'm using:

import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def add_traces(fig, record_data_dict: dict):
    # only one key/value for now in this dict
    for sensor, record_data in record_data_dict.items():
        y = np.random.randn(1000, 3)
        # convert s to ms
        time_ms = (record_data.time * 1000).astype(np.int32)
        td = pd.to_timedelta(time_ms, unit='ms') + pd.Timestamp("1970/01/01")
        idx = np.arange(len(y)).astype(str)
        for dim in range(3):
            trace = go.Scatter(
                x=td,
                y=y[:, dim],
                hovertext=idx,
                name="AAA",
                legendgroup=sensor,
                showlegend=dim == 0,
                marker=dict(color=colors[sensor]),
                line=dict(color=colors[sensor]),
                opacity=0.8
            )
            fig.add_trace(trace, row=dim + 1, col=1)

def plot_fig(record_dir=DATA_DIR / "2023.02.28"):
    fig = make_subplots(rows=3, shared_xaxes=True)
    record = Record(record_dir)
    add_traces(fig, record.data)
    fig['layout']['xaxis3']['title'] = "Time, s"
    fig.update_layout(
        title=record_dir.name,
        legend_title="Sensor",
    )
    fig.update_xaxes(tickformat="%H:%M:%S.%f")

and here is the plot: [Screenshot from 2023-03-03 18-58-59]

The 1st and 2nd plots hover data is incorrect: it starts with Jan 1, 1970.

[Screenshot from 2023-03-03 18-58-16]

dizcza commented 1 year ago

@dizcza also note that you must adjust the axis for each subplot, as the engine that generates the graphs renders each one as a new "fig" object with default params.

How can I do so? In my case, I have only one figure.

ThomasGl commented 1 year ago

Each subplot is rendered by the engine as its own figure, in the sense that you have as many fig objects as you have subplots. In your case that makes 4 fig objects: one containing the subplots, plus one for each of your 3 subplots.

As for starting at Jan 1, 1970: this is the standard initial date, used when date information is missing, meaning that the element responsible for rendering did not receive a "DD:MM:YYYY"-like string. Check the documentation for the Dash plots in case this has changed or the expected format is slightly different. Such strings could be generated using a list comprehension.

As for correcting the timestamp, move the fig.update_xaxes line to the end of the add_traces function.

By the way, I can't tell whether there is an error in your data without the argument passed to plot_fig; that is, I need whatever DATA_DIR contains in order to recreate your plots.

dizcza commented 1 year ago

All right, here is fully reproducible code:

import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def add_traces(fig):
    # only one key/value for now in this dict
    y = np.random.randn(1000, 3)
    time_s = np.random.rand(len(y)).cumsum()
    time_ms = (time_s * 1000).astype(np.int32)
    td = pd.to_timedelta(time_ms, unit='ms') + pd.Timestamp("1970/01/01")
    idx = np.arange(len(y)).astype(str)
    for dim in range(3):
        trace = go.Scatter(
            x=td,
            y=y[:, dim],
            hovertext=idx,
            name="AAA",
            showlegend=dim == 0,
            opacity=0.8
        )
        fig.add_trace(trace, row=dim + 1, col=1)

def plot_fig():
    fig = make_subplots(rows=3, shared_xaxes=True)
    add_traces(fig)
    fig['layout']['xaxis3']['title'] = "Time, s"
    fig.update_xaxes(tickformat="%H:%M:%S.%f")
    fig.show()

if __name__ == '__main__':
    plot_fig()

As for starting at Jan 1, 1970: this is the standard initial date, used when date information is missing, meaning that the element responsible for rendering did not receive a "DD:MM:YYYY"-like string. Check the documentation for the Dash plots in case this has changed or the expected format is slightly different. Such strings could be generated using a list comprehension.

I understand that this is the standard initial date. But showing the date while hovering is not expected. I expect to have both the X axis and hover-on-data X values formatted to the %H:%M:%S.

As for correcting the timestamp, move the fig.update_xaxes line to the end of the add_traces function.

Tried with no luck.

dizcza commented 1 year ago

Just try running this example and hover on the 1st, 2nd, and 3rd subplots, and you'll see the difference.

ThomasGl commented 1 year ago

I see, you only want the x-axes to show up in the "%H:%M:%S" format?

I'll run it tomorrow night.

dizcza commented 1 year ago

I see, you only want the x-axes to show up in the "%H:%M:%S" format?

Correct. Not only the X axis (the bottom panel) but also X values when I hover the mouse over any subplot.

ThomasGl commented 1 year ago

import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def add_traces(fig):
    # only one key/value for now in this dict
    y = np.random.randn(1000, 3)
    time_s = np.random.rand(len(y)).cumsum()
    time_ms = (time_s * 1000).astype(np.int32)
    td = [time[len("0 days "):] for time in pd.to_timedelta(time_ms, unit='ms').astype(str)]
    idx = np.arange(len(y)).astype(str)
    for dim in range(3):
        trace = go.Scatter(
            x=td,
            y=y[:, dim],
            hovertext=idx,
            name="AAA",
            showlegend=dim == 0,
            opacity=0.8
        )
        fig.add_trace(trace, row=dim + 1, col=1)

def plot_fig():
    fig = make_subplots(rows=3, shared_xaxes=True)
    add_traces(fig)
    fig['layout']['xaxis3']['title'] = "Time, s"
    fig.update_xaxes(tickformat="%H:%M:%S.%f")
    fig.show()

if __name__ == '__main__':
    plot_fig()

ThomasGl commented 1 year ago

@dizcza I'm pretty sure this is what you were looking for. I didn't quite understand why you were adding pd.Timestamp("1970/01/01"). Also, be aware of the type Dash expects for this to work: it needs a List[str] object whose strings are already formatted, e.g. for "03:45:10" the expected format is "%H:%M:%S".

dizcza commented 1 year ago

@ThomasGl thanks, this is promising, but the X axis labeling looks weird and less intuitive than in my original example. I mean, it's much easier to look at [Screenshot from 2023-03-05 08-08-16] than [Screenshot from 2023-03-05 08-08-48].

I didn't quite understand why you were adding pd.Timestamp("1970/01/01")

Because if I don't, I'm getting this:

[Screenshot from 2023-03-05 08-10-14]

Just like the author of this issue reported. He added pd.Timestamp("1970/01/01") to work around this, and so did I.

Thanks for the effort though. I'm not sure which version I'll use: the one with "1970/01/01" upfront confusing the users, or the one with confusing X axis string labels for each point.

ThomasGl commented 1 year ago

Hmm. I mean, yes, it does look a bit overcrowded, but that's because of the density of your data. If you zoom in, you will see that it fits better. Again, I suggest you look in the Plotly documentation at the behavior of update_xaxes(); it might have options for adjusting the precision of the x labels. I'm not sure how, as I haven't had to do it in my own projects.

I hope this helped you understand a bit more and that you can carry on from here.

nicolaskruchten commented 1 year ago

The core challenge here is that Plotly's date/time axes can today only represent specific absolute instants in time (e.g. March 5, 2023 at 8:13am UTC), and hence are incompatible with relative timedelta representations. Adding an absolute instant to such objects converts them to absolute instants, and by forcing the axis/hover displays to include only day-of-month/hour-of-day/minute etc. information, you can hide the underlying absoluteness of the data point to an extent, but this has limits. For example, if you add January 1, 1970 and your delta represents 32 days, then the "days" portion will be incorrectly displayed as 2 (i.e. February 2). More generally, you will not be able to display times in formats like "200 minutes" or "26 hours and 4 minutes".
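
The limits of the epoch-shift trick can be seen without Plotly at all. The sketch below uses Python's own `strftime` as a stand-in for the axis formatting (an assumption made for illustration; Plotly formats dates in the browser), and the 26 h 4 min and 40-day durations are arbitrary example values.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# A 26 h 4 min duration, shifted onto the epoch as in the workaround.
delta = timedelta(hours=26, minutes=4)
shifted = EPOCH + delta

# Formatting only the clock part silently drops the extra day.
clock_only = shifted.strftime("%H:%M")

# Formatting the duration itself keeps the full 26 hours.
total_minutes = int(delta.total_seconds()) // 60
hours, minutes = divmod(total_minutes, 60)
as_duration = f"{hours} hours and {minutes} minutes"

# A 40-day duration lands on a day-of-month that is not the day count.
day_of_month = (EPOCH + timedelta(days=40)).day

print(clock_only, "|", as_duration, "|", day_of_month)
# prints: 02:04 | 26 hours and 4 minutes | 10
```

The shifted value reads as 02:04 on January 2, and the 40-day duration reads as the 10th (of February), which is why a date axis cannot stand in for a true duration axis once values exceed a day or a month.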

We are aware of these limitations in the library and would certainly undertake the development required to add relative time axes to the underlying Plotly.js library, but this would require external sponsorship.

nicolaskruchten commented 1 year ago

If you are mostly concerned with the hoverlabel, you can use the following single line to set the hovertemplate for all your traces to only include the h/m/s portion of the X value: fig.update_traces(hovertemplate="%{x|%H:%M:%S.%f}, %{y}")

dizcza commented 1 year ago

If you are mostly concerned with the hoverlabel, you can use the following single line to set the hovertemplate for all your traces to only include the h/m/s portion of the X value: fig.update_traces(hovertemplate="%{x|%H:%M:%S.%f}, %{y}")

Thanks @nicolaskruchten. I had trouble with the hovertemplate language in the past, which is why I had been avoiding templates until you showed me how to use them. Your solution works like a charm.

With these two hacks in mind, adding pd.Timestamp("1970/01/01") and hovertemplate="%{x|%H:%M:%S.%f}, %{y}", I was able to achieve what I want. At least from the user's perspective, all looks nice and shiny.

dizcza commented 6 months ago

@juandering sure, here it is

    td = pd.to_timedelta(time_ms, unit='ms') + pd.Timestamp("1970/01/01")
    trace = go.Scatter(x=td, y=...)
    fig = make_subplots(rows=3, shared_xaxes=True, vertical_spacing=0.03)
    fig.add_trace(trace, row=1, col=1)
    fig.update_xaxes(tickformat="%H:%M:%S.%f")
    hovertemplate = "%{x|%H:%M:%S.%f}, %{y}<br>point=%{hovertext}"
    fig.update_traces(hovertemplate=hovertemplate)

juandering commented 6 months ago

Many thanks @dizcza.