zauberzeug / nicegui

Create web-based user interfaces with Python. The nice way.
https://nicegui.io
MIT License

Updating Plotly chart with a large dataset causes the UI to hang #3340

Open TsaiTung-Chen opened 1 month ago

TsaiTung-Chen commented 1 month ago

Description

I'm trying to plot some audio data with a sample rate of 48000 Hz. Updating the Plotly chart with the large dataset causes the whole NiceGUI app to hang. I'm not quite sure how to reproduce this issue reliably, because the same code sometimes works and sometimes does not.

Minimal code

Here is my minimal code, hope someone knows what's going on.

import random
import asyncio
from nicegui import ui, native
import plotly.graph_objects as go

fig = go.Figure(
    data={"type": 'scatter'},
    layout={"margin": dict(l=20, r=20, t=20, b=20)}
)
plot = ui.plotly(fig)

async def update_data():
    update_bt.props('loading')  # update button
    try:
        # Clear data
        fig.update_traces(x=[], y=[])
        plot.update()

        await asyncio.sleep(1)  # refresh UI

        # Update plot data
        fig.update_traces(x=x, y=y)
        plot.update()
    finally:
        update_bt.props(remove='loading')  # update button
        print('The Plotly chart should have been refreshed')  # in practice, the Plotly chart is sometimes not refreshed at this point

update_bt = ui.button('Update data', on_click=update_data)
ui.button('Force reload', on_click=ui.navigate.reload)  # force reload always works

# 50-sec audio data with 48000 Hz sample rate
x = [ float(n) for n in range(48000*50) ]  # using numpy arrays might avoid ui hanging
y = [ random.uniform(-i, i) for i in x ]  # using numpy arrays might avoid ui hanging

ui.run(port=native.find_open_port(), native=True)

I've added a force reload button at the bottom. If the Plotly chart is not refreshed after update_data returns, clicking the force reload button always refreshes all UI elements.

I've tried using a dict instead of plotly.graph_objects, and the results are the same.

My hardware

M2 Pro Mac Mini macOS 14.5

Python packages

python 3.10 nicegui==1.4.29 pywebview==5.1 plotly==5.22.0

Thank you for any help you can provide.

falkoschindler commented 1 month ago

Hi @TsaiTung-Chen,

A total of 48000 * 50 data points results in a message of 60MB, which causes the browser to struggle quite a bit. Have you tried creating a plain Plotly example (without NiceGUI) with that many points? Does it render more smoothly?
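As a rough sanity check on the message size, the sketch below estimates how many bytes the x/y lists from the original example (48000 * 50 points) would occupy as compact JSON. It uses Python's standard json module as a stand-in; the serializer NiceGUI actually uses and the exact message layout are not shown in this thread, so treat the result as an order-of-magnitude estimate only. The function name estimate_payload_bytes and the sampling approach are illustrative assumptions.

```python
import json
import random


def estimate_payload_bytes(n_points: int, sample_size: int = 10_000) -> int:
    """Roughly estimate the compact-JSON size of x/y float lists with n_points entries."""
    if n_points <= 0:
        return 0
    step = max(1, n_points // sample_size)
    # Sample x values across the whole range so their digit counts are representative.
    x = [float(i) for i in range(0, n_points, step)]
    y = [random.uniform(-i, i) for i in x]
    # Assumption: plain JSON text; the real wire format may differ.
    sample_bytes = len(json.dumps({"x": x, "y": y}, separators=(",", ":")))
    # Scale the sample size up to the full number of points.
    return int(sample_bytes * n_points / len(x))


if __name__ == "__main__":
    n = 48000 * 50
    print(f"{n} points: about {estimate_payload_bytes(n) / 1e6:.0f} MB of JSON")
```

Serializing only a sample keeps the check fast; most of the size comes from the long decimal representation of each float in y.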

falkoschindler commented 1 month ago

I just tested with the following HTML code. It is killing the browser. 😄

<html>
  <head>
    <script src="https://cdn.plot.ly/plotly-latest.min.js"></script>
  </head>
  <body>
    <div id="plot"></div>
    <script>
      const n = 48000 * 50;
      let x = new Array(n);
      let y = new Array(n);
      for (let i = 0; i < n; i++) {
        x[i] = Math.random();
        y[i] = Math.random();
      }
      Plotly.newPlot("plot", [{ x: x, y: y, mode: "markers", type: "scatter" }]);
    </script>
  </body>
</html>

TsaiTung-Chen commented 1 month ago

Hi @falkoschindler,

Thank you for your immediate reply. I know a dataset of 48000 * 50 samples is really large. But updating the chart with a pair of NumPy arrays instead of lists, with the same number of samples, works as expected (it does not cause the UI elements to hang). In addition, even when I use a pair of lists, clicking the force reload button refreshes all UI elements, and both the loading button and the stuck chart recover. So can I say that the framework is able to handle this large dataset, and it is only the refresh that does not work properly?

Regarding the plain Plotly

I've tried with the following code, and it works well.

import random
import plotly.graph_objects as go

x = [ float(i) for i in range(48000*50) ]
y = [ random.uniform(-i, i) for i in range(48000*50) ]

fig = go.Figure(go.Scatter(x=x, y=y))
fig.show()

python-and-fiction commented 1 month ago

pandas works better than a plain list.

import random
import asyncio
import pandas as pd
from nicegui import ui, native
import plotly.graph_objects as go

data = pd.DataFrame({
    'x':[ float(n) for n in range(48000*50)],
    'y':[ random.uniform(-i, i) for i in range(48000*50) ]
    })
x = data['x']  # using numpy arrays might avoid ui hanging
y = data['y'] # using numpy arrays might avoid ui hanging
fig = go.Figure(
    data={"type": 'scatter'},
    layout={"margin": dict(l=20, r=20, t=20, b=20)}
)
plot = ui.plotly(fig)

async def update_data():
    update_bt.props('loading')  # update button
    try:
        # Clear data
        fig.update_traces(x=[], y=[])
        plot.update()

        await asyncio.sleep(1)  # refresh UI

        # Update plot data
        fig.update_traces(x=x, y=y)
        plot.update()
    finally:
        update_bt.props(remove='loading')  # update button
        print('The Plotly chart should have been refreshed')  # in practice, the Plotly chart is sometimes not refreshed at this point

update_bt = ui.button('Update data', on_click=update_data)
ui.button('Force reload', on_click=ui.navigate.reload)  # force reload always works

ui.run(port=native.find_open_port(), native=True)

falkoschindler commented 1 month ago

Ok, I think I finally understand:

I tested with the following code:

import asyncio
import numpy as np
import plotly.graph_objects as go
from nicegui import ui

fig = go.Figure(data={'type': 'scatter'})
plot = ui.plotly(fig)

async def update_data():
    fig.update_traces(x=[], y=[])
    plot.update()
    await asyncio.sleep(0.1)
    x = np.arange(48000*int(n.value))
    y = np.random.uniform(-100, 100, 48000*int(n.value))
    fig.update_traces(x=x.tolist(), y=y.tolist())
    # fig.update_traces(x=x, y=y)
    plot.update()

n = ui.number(value=50)
ui.button('Update data', on_click=update_data)

ui.run()

Does anyone have an idea what is going on?

mohsenminaei commented 1 month ago

@falkoschindler I have an observation that may be related to this. In the code below, if you set n_samples to 1000, the click event works fine. If you set it to 100000, the UI crashes and loses connection. My guess is that ui.plotly needs work in handling data transfer.

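import numpy
import plotly.graph_objects as go
from nicegui import ui

n_samples = 1000  # with 100000, clicking the chart crashes the UI
data = numpy.random.rand(n_samples)
histogram = go.Figure(data=[go.Histogram(x=data)])
histogram_ui = ui.plotly(histogram)
histogram_ui.on('plotly_click', ui.notify)

ui.run()

falkoschindler commented 3 weeks ago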

@mohsenminaei Using your code, I can't reproduce a problem with 100_000 samples. Even 1_000_000 samples are displayed after loading for a few seconds.

mohsenminaei commented 3 weeks ago

@falkoschindler Sorry, I was not clear in my response. As you said, the data samples are displayed with no problem. The crash happens when I start clicking on the chart. With 100k data points, the app usually crashes after 5-6 clicks. I recorded my page in the video below with this code (whenever I move the mouse cursor in a zigzag, I'm clicking on the chart):

import numpy as np
from nicegui import ui
from plotly import graph_objects as go

n_samples = 100000
data = np.random.rand(n_samples)
histogram = go.Figure(data=[go.Histogram(x=data)])
histogram_ui = ui.plotly(histogram)
histogram_ui.on('plotly_click', ui.notify)

ui.run()

https://github.com/user-attachments/assets/b0b869d3-72d7-4b55-9515-71a67763a84f

krashdifferent commented 3 weeks ago

@falkoschindler - An issue I've been experiencing with Plotly events in NiceGUI brought me here and may be related. The returned event object contains a complete copy of all the data in the plot for each point returned, at least for click and hover events. If this complete copy is serialized to text on the JavaScript side, maybe that is the bottleneck for large datasets? Selection events that contain multiple points appear even worse, since each selected point contains its own complete copy of the plot data, giving n x m scaling for the serialized object size. The only exploration I've done so far is logging to the console to inspect the returned object (for example, replacing ui.notify with lambda x: print(x.args) in @mohsenminaei's code above).
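The n x m scaling described above can be illustrated with a small stand-alone sketch. It builds a mock event in which every selected point embeds the full trace data and measures the serialized size. The dictionary layout (points, pointNumber, data) and the function name simulated_event_size are simplifications for illustration, not the exact structure NiceGUI receives from Plotly.

```python
import json


def simulated_event_size(n_data: int, m_selected: int) -> int:
    """Bytes of a mock event where each selected point carries a full copy of the trace."""
    trace = {"x": list(range(n_data))}  # stand-in for the plot data (n values)
    event = {
        "points": [
            {"pointNumber": i, "data": trace}  # every point repeats the whole trace
            for i in range(m_selected)
        ]
    }
    return len(json.dumps(event))


if __name__ == "__main__":
    for m in (1, 10, 100):
        print(f"n=100000, m={m}: {simulated_event_size(100_000, m)} bytes")
```

In this mock, the serialized size grows roughly in proportion to both the number of data values and the number of selected points, which is why trimming the event arguments matters for large plots.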

falkoschindler commented 3 weeks ago

@mohsenminaei @krashdifferent Yes, point click events can cause a huge amount of data to be sent back and forth. You can limit it by specifying the event arguments like so:

histogram_ui.on('plotly_click', ui.notify, args=['event'])

I think the original issue is unrelated to user events. It's still unclear to me why updating traces with lists rather than NumPy arrays causes the UI to hang.