mckinsey / vizro

Vizro is a toolkit for creating modular data visualization applications.
https://vizro.readthedocs.io/en/stable/
Apache License 2.0

live update #532

Open vks2 opened 1 week ago

vks2 commented 1 week ago

Question

Hello. I need a chart with auto-updating lines, for example a ticker chart that updates once per minute. Here's an example of how I do it in Dash: https://dash.plotly.com/live-updates. Is it possible to achieve the same effect without writing a custom component? Maybe there's a wrapper around a Plotly Express graph that has a similar feature? Thanks very much.

petar-qb commented 1 week ago

Hi @vks2 and thanks for the great question. ⭐

I haven't found a quick and easy way to update chart data live without either creating a thin custom component wrapper around dcc.Interval or injecting dcc.Interval directly into the Vizro dashboard. Here's an example (with some explanation in the comments) that uses a custom component as a thin wrapper around dcc.Interval. The data in this example is updated every 2 seconds:

"""Example to show dashboard configuration."""

from typing import Literal

import vizro.models as vm
import vizro.plotly.express as px
from dash import Input, Output, callback, dcc
from vizro import Vizro
from vizro.managers import data_manager

# This function is used to get live data. You can implement your own function to get data from a database or API.
def get_live_data():
    """Get live data."""
    return px.data.gapminder().sample(30)

# Beware of assigning the function without calling it -> `get_live_data` instead of `get_live_data()`.
data_manager["live_data"] = get_live_data

# Custom MyInterval component
class MyInterval(vm.VizroBaseModel):
    type: Literal["my_interval"] = "my_interval"

    id: str
    interval: int

    def build(self):
        return dcc.Interval(id=self.id, interval=self.interval, n_intervals=0)

# Explicitly enable MyInterval to be used in the Vizro Page.
vm.Page.add_type("components", MyInterval)

page = vm.Page(
    title="Vizro live data update",
    components=[
        vm.Graph(figure=px.scatter(data_frame="live_data", x="gdpPercap", y="lifeExp", color="continent")),
        MyInterval(id="my_interval", interval=1000 * 2),  # Refresh data every 2 seconds
    ],
    controls=[vm.Filter(column="continent"), vm.Filter(column="year")],
)

dashboard = vm.Dashboard(pages=[page])

# Callbacks <---------------------------------------------------------------------

# "on_page_load_action_trigger_Vizro live data update" is the id of the component that triggers the page loading mechanism.
# Vizro is going to release predefined actions to do this, but for now we need to use this "hack".
@callback(
    Output("on_page_load_action_trigger_Vizro live data update", "data"),
    Input("my_interval", "n_intervals"),
)
def update_data(n_intervals):
    """Update data."""
    return n_intervals

if __name__ == "__main__":
    Vizro().build(dashboard).run()

https://github.com/mckinsey/vizro/assets/108530920/f9da9b52-f82c-4c57-8dbf-5870bbde8f1c

If creating a custom component doesn't work for you, you can always add the dcc.Interval directly to the dashboard in the following way (as already described in https://github.com/mckinsey/vizro/issues/533#issuecomment-2175536011):

app = Vizro().build(dashboard)
app.dash.layout.children.append(dcc.Interval(id="my_interval", interval=2000, n_intervals=0))
app.run()

Also, if manually refreshing the page to fetch the updated data works for you (instead of updating live using the dcc.Interval), you can remove the callback along with the dcc.Interval component from the example and everything will work as expected.

You can find more about dynamic data in the Vizro documentation.

I hope this helps you build your desired application. 🤞

vks2 commented 1 week ago

That's indeed a good, working example. Thank you. As far as refreshing is concerned: if I add another vm.Graph with static data to the page, can we be sure that not every chart on the page will be refreshed? I've encountered that effect in other libraries. Thanks very much. (I'm talking about code like this:)

page = vm.Page(
    title="Update the chart on page refresh",
    components=[
        vm.Graph(figure=px.box("iris", x="species", y="petal_width", color="species")),
    ],
)

antonymilne commented 1 week ago

If you don't mind, let me just hijack your question briefly to do a bit of user research, because this is a feature that's on our roadmap to build natively into Vizro, but we haven't figured out the right API for it yet 😀

As a user, how would you expect to enable this feature? Currently, as in @petar-qb's example above, you enable dynamic data like this:

# This function is used to get live data. You can implement your own function to get data from a database or API.
def get_live_data():
    """Get live data."""
    return px.data.gapminder().sample(30)

# Beware of assigning the function without calling it -> `get_live_data` instead of `get_live_data()`.
data_manager["live_data"] = get_live_data

And then use it like this:

vm.Graph(figure=px.scatter(data_frame="live_data", x="gdpPercap", y="lifeExp", color="continent"))

It's also possible to set up a cache that times out like this:

data_manager["live_data"] = get_live_data
data_manager["live_data"].timeout = 10

This timeout does not actually trigger a refresh of the data in any way; it just controls the expiry of a server-side cache (see https://vizro.readthedocs.io/en/stable/pages/user-guides/data/#set-timeouts).
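
To make that distinction concrete, here is a minimal standard-library sketch of a cache with a timeout. It is not Vizro's implementation: `TimedCache` and `fake_clock` are hypothetical names used only for illustration. The point it shows is that expiry alone never calls the loader; data is reloaded only when something asks for it after the timeout has passed.

```python
import time
from typing import Any, Callable, Optional


class TimedCache:
    """Hypothetical cache: reloads only on access, and only after `timeout` seconds."""

    def __init__(
        self,
        loader: Callable[[], Any],
        timeout: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._timeout = timeout
        self._clock = clock
        self._value: Any = None
        self._loaded_at: Optional[float] = None
        self.load_count = 0  # how many times the loader has actually run

    def get(self) -> Any:
        now = self._clock()
        expired = self._loaded_at is None or now - self._loaded_at >= self._timeout
        if expired:
            self._value = self._loader()
            self._loaded_at = now
            self.load_count += 1
        return self._value


# A fake clock keeps the demonstration deterministic.
now = [0.0]


def fake_clock() -> float:
    return now[0]


cache = TimedCache(loader=lambda: "fresh data", timeout=10, clock=fake_clock)

cache.get()    # first access: the loader runs
now[0] = 5.0
cache.get()    # within the timeout: served from the cache
now[0] = 60.0  # the timeout has passed, but nothing is requested, so nothing reloads
loads_before_request = cache.load_count
cache.get()    # only this request triggers the reload
loads_after_request = cache.load_count
```

Something still has to make that request, which is the role the interval plays in the example above.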

Now if you wanted to enable live updates, how would you expect to do that? Would you like to be able to set different graphs on the page to update at different rates? Or refresh the whole page every N seconds? Where would you expect this setting to live? For example, it could be associated with data_manager, with vm.Graph or with vm.Page. Is there anything else you'd like to configure here beyond the refresh interval?

vks2 commented 1 week ago

Sorry for being a maximalist here, but the optimal way would be to mimic the ipyvizzu library. Look at the beauty of the chart updates there. Being a realist, it would be good if we could set a timeframe argument in the vm.Graph constructor without affecting the whole page, because of the complex data sources associated with it. PS: please note that we get our data from pd.psql, pd.read_html or pd.read_json (third-party APIs or our own DWH).

antonymilne commented 1 week ago

Got it, thank you! I'd never seen ipyvizzu before but those are indeed very beautiful chart updates!

Realistically, I think we will never get to something quite like that, because Vizro plots are all based on plotly, which does support some animation but is definitely not designed to do it natively as sleekly as ipyvizzu does.

So what you'll find with @petar-qb's example above is that the interval triggers a refresh of the whole page. Actually, it's not exactly a full page refresh but is equivalent to what happens if you click on a different page in the navbar, which is in one sense a much lighter request than a full page refresh. In practice this means that when the refresh is triggered, all graphs on the page undergo the same process:

  1. fetch data
  2. apply parameters/filters from the control panel
  3. draw whole graph again from scratch

Typically it is step 1 that is the expensive one (as with your complex data sources) and the one you don't wish to rerun, and this is where data caching can really help you.

It's probably not too hard to come up with some ways to optimise steps 2 and 3 above and also to avoid triggering a refresh of all the graphs on the screen (especially if you are willing to compromise on other things, e.g. you don't need your static graph to respond to filters). But definitely the simplest way right now is to let the whole page refresh and rely on caching to avoid repeated heavy data loading. This should work pretty much as well as refreshing only the graphs that you want; the only difference is that the static graph will flicker on page refresh even though it doesn't change.
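
The trade-off described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Vizro's internals: `load_data`, `apply_filters`, `draw`, `refresh_page` and the graph names are all hypothetical, the rows are synthetic, and `functools.lru_cache` stands in for a real server-side cache. Every graph goes through all three steps on each refresh, but the expensive step 1 runs only once.

```python
from functools import lru_cache

load_calls = 0  # counts how often the expensive loader actually runs


@lru_cache(maxsize=None)
def load_data():
    """Step 1: stand-in for a slow database or API query (synthetic rows)."""
    global load_calls
    load_calls += 1
    return (
        {"continent": "Europe", "value": 1},
        {"continent": "Asia", "value": 2},
        {"continent": "Europe", "value": 3},
    )


def apply_filters(rows, continent):
    """Step 2: apply the selection from the control panel."""
    return [row for row in rows if row["continent"] == continent]


def draw(graph_id, rows):
    """Step 3: stand-in for rebuilding the whole figure from scratch."""
    return f"{graph_id}: {len(rows)} points"


def refresh_page(graph_ids, continent):
    """Run all three steps for every graph on the page."""
    return [draw(graph_id, apply_filters(load_data(), continent)) for graph_id in graph_ids]


graphs = ["live_graph", "static_graph"]
first = refresh_page(graphs, "Europe")
second = refresh_page(graphs, "Europe")  # e.g. triggered by the interval
```

Here `lru_cache` never expires, so this sketch would never pick up new data. A real setup would pair the cache with a timeout so that the live data is still reloaded periodically.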

antonymilne commented 1 week ago

I'd be very interested in what sort of page you're trying to make here, by the way, if you don't mind sharing a bit more. For example, would you ultimately like lots of vm.Graph components on one page, each with its own separate refresh_time value? Or do you really just want a single graph on the page to refresh while all the others remain static? Would these graphs all share the same data source in your case or come from different places? What sort of timeframes are relevant for you? Updating once per second, or once per hour?

If we can understand the use case a bit better then it will help inform how we might implement the solution when it comes to developing the feature 🙂