Closed by @iandanforth 2 months ago.
Hi @iandanforth , thanks for the report and the reproducible example. This will be really helpful for future performance optimization.
Constructing `Figure` objects with lots of frames is probably always going to be a bit slower in 3.0 compared to 2.7 because of all of the extra validation and defensive copying that's happening now, but I expect we'll be able to improve this a fair amount with some profiling.
Here's what I recommend. Use objects from the `graph_objs` hierarchy when you're first building up the logic to construct your figure. This way you'll have the validation feedback while you're iterating on your design. Once you have it working the way you want, and you want to scale to a lot more frames, replace all of the `graph_objs` objects with plain `dict`s and set the `validate` param of `plot`/`iplot` to `False`. This way you will bypass all property validation and things will run much faster.
When I make these changes in your example, it runs in under 2 seconds.
```python
%%time
import numpy as np
import plotly.graph_objs as go
import colorlover as cl
from plotly.offline import plot

sim_duration = 40.0
time_inc = 0.1
motor_unit_count = 120
times = np.arange(0.0, sim_duration, time_inc)

# Setting colors for plot.
potvin_scheme = [
    'rgb(115, 0, 0)',
    'rgb(252, 33, 23)',
    'rgb(230, 185, 43)',
    'rgb(107, 211, 100)',
    'rgb(52, 211, 240)',
    'rgb(36, 81, 252)',
    'rgb(0, 6, 130)'
]

# It's hacky but also sorta cool.
c = cl.to_rgb(cl.interp(potvin_scheme, motor_unit_count))
c = [val.replace('rgb', 'rgba') for val in c]
c = [val.replace(')', ',{})') for val in c]

def get_color(trace_index: int) -> str:
    # The first and every 20th trace should be full opacity
    alpha = 0.2
    if trace_index == 0 or ((trace_index + 1) % 20 == 0):
        alpha = 1.0
    color = c[trace_index].format(alpha)
    return color

# Per Motor Unit Force
all_array = np.ones((120, len(times)))
data = []
annotations = []
anno_offsets = {
    0: 20,
    19: 30,
    39: 40,
    59: 45,
    79: 17,
    99: 56,
    119: 170
}
max_y = np.amax(all_array)
for i, t in enumerate(all_array):
    trace = dict(
        x=times[:1],
        y=t[:1],
        name=i + 1,
        marker=dict(
            color=get_color(i)
        )
    )
    data.append(trace)

frames = []
for i in range(1, len(times), 10):
    frame_data = []
    for j, t in enumerate(all_array):
        trace = dict(
            x=times[:i],
            y=t[:i],
            name=j + 1,
            marker=dict(
                color=get_color(j)
            )
        )
        frame_data.append(trace)
    frame = dict(
        data=frame_data
    )
    frames.append(frame)

layout = dict(
    title='Motor Unit Forces by Time',
    yaxis=dict(
        title='Motor unit force (relative to MU1 tetanus)',
        range=[0, max_y],
        autorange=False
    ),
    xaxis=dict(
        title='Time (s)',
        range=[0, sim_duration],
        autorange=False
    ),
    updatemenus=[{
        'type': 'buttons',
        'buttons': [{
            'args': [
                None,
                {'frame': {'duration': 200, 'redraw': False},
                 'fromcurrent': True,
                 'transition': {'duration': 200, 'easing': 'linear'}
                 }
            ],
            'label': 'Play',
            'method': 'animate'
        }]
    }]
)
layout['annotations'] = annotations

fig = dict(
    data=data,
    layout=layout,
    frames=frames
)
plot(fig, filename='regression.html', validate=False, auto_open=False)
```
I hope that's helpful.
@jonmmease Thanks for taking the time to look into this! This fix is very helpful for me. My remaining concern is one of optics. If someone tries to do much-data-many-frames-wow! and finds plotly slow, it may deter them from using the library.
To address that I recommend creating a ticket to modify the animation documentation to include exactly your suggestions. I knew to use `redraw=False` only because of those docs. (Which are very useful and well put together!)
Thanks again!
I'm glad this fix is working for you and thanks for the helpful suggestion regarding the documentation.
I'm going to leave this issue open until I have a chance to profile your example and see what can be done to make the situation a bit less painful.
+1 for this issue. Rather than submit another ticket, I have attached my own example of poor 3.0.0 performance. On my machine, the difference was 0.013 seconds vs 30 seconds, so ~2000 times slower. As per above, I got out of trouble by extracting the data as lists of dicts using the `_data` property and operating on them in that form, re-creating a modified figure later.
```python
import plotly
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as ff
import numpy as np

plotly.tools.set_credentials_file(username='xxxxxx', api_key='xxxxx')
plotly.tools.set_config_file(world_readable=False, sharing='private')
plotly.offline.init_notebook_mode(connected=True)

z = np.random.uniform(0, 1, size=(5000, 8))
myDendrogram = ff.create_dendrogram(z, orientation='bottom')
plotly.offline.iplot(myDendrogram)

def placeAndScaleDendrogram(figure):
    data = figure['data']
    for col in data:
        col['xaxis'] = 'x'
        col['yaxis'] = 'y2'
        col['x'] = (col['x'] / 10.0) + 0.5  # rescale the dendrogram to match the x axis

get_ipython().run_line_magic('prun', 'placeAndScaleDendrogram(myDendrogram)')

def placeAndScaleDendrogramRaw(figure):
    data = figure._data
    for col in data:
        col['xaxis'] = 'x'
        col['yaxis'] = 'y2'
        col['x'] = (col['x'] / 10.0) + 0.5  # rescale the dendrogram to match the x axis

get_ipython().run_line_magic('prun', 'placeAndScaleDendrogramRaw(myDendrogram)')
```
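For readers outside a notebook, the same measurement can be made with the stdlib `cProfile` module instead of the `%prun` magic. This is a self-contained sketch using a stand-in `rescale` function over plain-dict traces; the names and data sizes here are illustrative, not taken from the dendrogram above:

```python
import cProfile
import io
import pstats

def rescale(data):
    # Stand-in for placeAndScaleDendrogram: move each trace to alternate
    # axes and rescale its x values (plain lists instead of numpy arrays)
    for col in data:
        col['xaxis'] = 'x'
        col['yaxis'] = 'y2'
        col['x'] = [(x / 10.0) + 0.5 for x in col['x']]

# Illustrative workload: 50 traces of 100 points each
data = [{'x': list(range(100))} for _ in range(50)]

profiler = cProfile.Profile()
profiler.enable()
rescale(data)
profiler.disable()

# Print the five most expensive functions by internal time
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats('tottime').print_stats(5)
print(out.getvalue())
```

Profiling the `figure['data']` and `figure._data` variants this way produces the same kind of per-function breakdown shown below.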
These are the performance figures for the first call:

```
3334357 function calls in 30.405 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    79984   20.477    0.000   20.477    0.000 basedatatypes.py:2157(<listcomp>)
    94982    3.435    0.000    3.589    0.000 basevalidators.py:251(present)
    69986    2.289    0.000   24.440    0.000 basedatatypes.py:1198(_get_child_props)
```
Note that most of the time is spent here, in `plotly/basedatatypes.py`:

```python
@staticmethod
def _index_is(iterable, val):
    """
    Return the index of a value in an iterable using object identity
    (not object equality as is the case for list.index)
    """
    index_list = [
        i for i, curr_val in enumerate(iterable) if curr_val is val  # <<<-------HERE!
    ]
    if not index_list:
        raise ValueError('Invalid value')
    return index_list[0]
```
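As an aside, that list comprehension scans the whole iterable even after a match is found. A minimal sketch of an early-exit variant (purely illustrative; not necessarily what the actual fix in plotly does):

```python
def index_is(iterable, val):
    """Return the index of `val` in `iterable` using object identity,
    stopping at the first match instead of building a full list."""
    for i, curr_val in enumerate(iterable):
        if curr_val is val:
            return i
    raise ValueError('Invalid value')
```

On average this halves the work per lookup, though the overall cost is still linear in the number of traces, which is why repeated lookups over ~5000 traces add up.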
Thanks for the example and the profiling information @JavascriptMick, I'll see what I can come up with...
Alright, I've made some progress in #1061 that will be included in version 3.0.2
@iandanforth Your initial example has gone from ~110s to ~42s
@JavascriptMick The creation of the dendrogram has gone from ~110s to ~20s, and the scaling that you were timing has gone from ~20s to ~2s.
That said, these are still far slower than version 2.7, where the `Figure` and `graph_objs` objects were basically just simple `dict` subclasses. We're not going to be able to get all of the performance back with the new feature set, but we should be able to keep making incremental progress.

Each of your use cases has ~5000 traces in total, and for cases like these I think we'll need to encourage people to work with `dict` objects rather than `graph_objs`, and work through whether there's anything we should do in the API to make that easier.
Thanks @jonmmease. Agreed: with very large numbers of traces, lists and `dict`s are the way to go. The `_data` property that exposes the raw data currently supports this, so I guess make sure that these remain in the API.
The only other thing I can think of is providing a more elegant way of combining multiple plots. This was ultimately my goal when I discovered the issue (creating a clustermap visualisation from a heatmap and 2 dendrograms). All of this spinning and placing things on alternate axes feels clumsy. I understand that figure factories are the recommended answer to this, and it's on my to-do list to create a ClusterMap figure factory and raise a PR, but it's unclear to me how much work this will be... probably a lot.
Hi @JavascriptMick ,
I'd rather people not get in the habit of using the `_data` property for performance reasons because that will mess up the synchronization of the new `FigureWidget` class.

What do you think of the proposal in #1079? If this were implemented you could request that the figure factory return a `dict`, and then you could make modifications to the `dict` before converting it to a `Figure` (or even not converting it to a `Figure` at all if you use `plot`/`iplot` with `validate=False`).
Also, the dendrogram itself with that many points is a bit of a mess visually. I'm thinking we should expose the scipy `truncate_mode` property (see http://lagrange.univ-lyon1.fr/docs/scipy/0.17.1/generated/scipy.cluster.hierarchy.dendrogram.html) to have the option of limiting how many leaves are displayed.
Here's one thought for making the construction of clustermap easier (short of creating a dedicated figure factory). For figure factories that only deal with a single axis (like dendrogram), it would be nice if you could pass in a `Figure` object that was initialized with `tools.make_subplots`, along with subplot `row` and `col` parameters. In this case the figure factory would add all of its traces to the appropriate axes of the existing figure. (In practice I'd prefer to refactor the internal logic a bit and have a separate function for adding traces to an existing figure, but that's the basic idea.)
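The idea of merging a single-axis figure into a larger layout can be sketched with plain dicts. Note that `add_to_subplot` and its parameters are invented for illustration; no such helper exists in plotly:

```python
def add_to_subplot(target, source, xaxis='x2', yaxis='y2'):
    """Copy traces from a single-axis figure dict onto the given axes
    of an existing figure dict (hypothetical helper, not plotly API)."""
    for trace in source['data']:
        trace = dict(trace)  # shallow copy so the source figure is untouched
        trace['xaxis'] = xaxis
        trace['yaxis'] = yaxis
        target['data'].append(trace)
    return target

# A heatmap on the primary axes plus a dendrogram routed to secondary axes
heatmap_fig = {'data': [{'type': 'heatmap', 'z': [[1, 2], [3, 4]]}]}
dendro_fig = {'data': [{'type': 'scatter', 'x': [0, 1], 'y': [0, 1]}]}
add_to_subplot(heatmap_fig, dendro_fig)
```

A real implementation would also need to map `row`/`col` to the right axis ids and merge axis layout, which is the part the figure factory would handle.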
It seems I can reproduce the same issue with this code:
```python
import plotly
import time
import plotly.plotly as py
import plotly.figure_factory as FF

print('plot version', plotly.__version__)
t1 = time.time()
z = []
for i in range(0, 100):
    z.append([j for j in range(0, 100)])
figure = FF.create_annotated_heatmap(z)
print('Timing', time.time() - t1)
```
With plotly version 3.10.0, I get around 24 seconds, versus 2.9 seconds for version 2.7.0.

Is this the right place to raise the issue?
Hi - we are currently trying to tidy up Plotly's public repositories to help us focus our efforts on things that will help users most. Since this issue has been sitting for several years, I'm going to close it. If it's still a concern, we'd be grateful if you could open a new issue (with a short reproducible example if appropriate) so that we can add it to our backlog. Thanks for your help - @gvwilson
OS: OSX 10.13.2 Python: 3.6.4
I'm creating the following animated chart using the offline API.
Using 2.7.0, generating the HTML file (30MB) takes ~11 seconds (0m11.044s). Using 3.0.0, generating the HTML file takes 4 MINUTES (4m20.393s).
There is probably a much more efficient way of making this kind of animated scatter (please tell me if there is), but in 2.7 this was at least workable; in 3.0, not so much.
The following example reproduces the behavior. The size of the data is the same as in the gif above, the actual values are not.