emilhe / dash-extensions

The dash-extensions package is a collection of utility functions, syntax extensions, and Dash components that aim to improve the Dash development experience
https://www.dash-extensions.com/
MIT License

[Error] ServersideOutputTransform doesn't work as intended when running document example #215

Open DeKhaos opened 1 year ago

DeKhaos commented 1 year ago

Hi, I was trying to run the first ServersideOutputTransform example from the documentation, but an error pops up:

AttributeError: 'FileSystemStore' object has no attribute '_safe_stream_open'


My own sample code also fails. The idea was to cut the load time of a Dash callback that reads a large dataframe (>10 MB) from a dcc.Store, by using ServersideOutputTransform to keep the dataframe cached on my laptop. Steps to reproduce: click "Get data", then "Show". The callback then fails to read the State of the dcc.Store and raises:

_pickle.PicklingError: args[0] from newobj args has the wrong class

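To make the intent concrete: the idea of keeping the dataframe on the server and handing the browser only a small reference can be sketched without Dash. This is a minimal illustration using only the standard library. TinyFileStore and its put/get methods are hypothetical names, not the dash-extensions FileSystemStore API, and the real implementation may well work differently.

```python
import pickle
import tempfile
import uuid
from pathlib import Path


class TinyFileStore:
    """Hypothetical file-backed store (not the dash-extensions API)."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def put(self, value):
        # Persist the object on the server and return a short key.
        key = uuid.uuid4().hex
        (self.cache_dir / key).write_bytes(pickle.dumps(value))
        return key

    def get(self, key):
        # Resolve the key back into the original object on the server.
        return pickle.loads((self.cache_dir / key).read_bytes())


store = TinyFileStore(tempfile.mkdtemp())
big_value = {"rows": list(range(100_000))}  # stand-in for a large dataframe
key = store.put(big_value)   # only this short string would go to the browser
restored = store.get(key)    # a later callback reads the object back
```

The point of the pattern is that the large object never has to be serialized to JSON and shipped to the client; only the key does.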

My example code:

from dash import ctx,dash_table
from dash_extensions.enrich import DashProxy, Output, Input, State,\
    ServersideOutput, html, dcc, ServersideOutputTransform,FileSystemStore
from dash import DiskcacheManager
import dash_bootstrap_components as dbc
import pandas as pd
import diskcache
import time

my_backend = FileSystemStore(cache_dir="./output_cache")
cache = diskcache.Cache("./cache")
background_callback_manager = DiskcacheManager(cache)

app = DashProxy(__name__,
                suppress_callback_exceptions=True,
                external_stylesheets=[dbc.themes.CYBORG,
                                      dbc.icons.BOOTSTRAP],
                background_callback_manager=background_callback_manager,
                transforms=[ServersideOutputTransform()])
server = app.server
app.layout = html.Div(
    [dcc.Store(id="stored_data"),
     html.Br(),
     dbc.Collapse([dbc.Row([dbc.Spinner(),
                            html.P("Processing...")]),
                    ],
                  id="collapse_1",
                  is_open=False),
      dbc.Button("Show",id="show_button"),
      dbc.Button("Clear",id="clear_button"),
      dbc.Button("Get data",id="test_button"),
      dash_table.DataTable(id='result')
      ])
@app.callback(ServersideOutput("stored_data","data",backend=my_backend),
              Input("test_button","n_clicks"),
              prevent_initial_call=True)
def get_data(click):
    df = pd.read_csv('https://covid19.who.int/WHO-COVID-19-global-data.csv')
    return df

@app.callback([Output("result","data"),
               Output("result","columns")],
              [Input("show_button","n_clicks"),
               Input("clear_button","n_clicks"),
               State("stored_data","data")],
              background=True,
              running=[(Output("collapse_1","is_open"),True,False)],
              prevent_initial_call=True
              )
def show_data_test(button1,button2,df):
    if ctx.triggered_id=="show_button":
        data = df.to_dict('records')
        columns = [{"name": i, "id": i} for i in df.columns]
        return [data,columns]
    else:
        time.sleep(1)
        return [None,None]

if __name__ == '__main__':
    app.run(debug=True)

DeKhaos commented 1 year ago

Update: the first problem was apparently on my side. My old version of flask-caching did not have the _safe_stream_open attribute.

After playing around a little, I think something is wrong with dash-extensions' support for background=True: when I comment out background_callback_manager in DashProxy and background=True in the callback, it works fine.

It would be nice to have some examples of using ServersideOutputTransform with background callbacks, especially with DiskcacheManager.

Testing environment: dash 2.6.2, dash-extensions 0.1.6, flask-caching 2.0.1, Python 3.9.12
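Since the first error came down to a version mismatch, a quick way to report the installed versions may help anyone reproducing this. The sketch below uses only importlib.metadata from the standard library. The helper names installed_versions and version_tuple are my own, and version_tuple only handles plain dotted integers such as 2.0.1.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear on PyPI.
PACKAGES = ["dash", "dash-extensions", "flask-caching"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


def version_tuple(text):
    """Turn '2.0.1' into (2, 0, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


for name, found in installed_versions(PACKAGES).items():
    print(f"{name}: {found or 'not installed'}")
```

Comparing version_tuple results avoids string-comparison pitfalls, for example "2.10.0" sorting before "2.9.0" as text.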

emilhe commented 1 year ago

What version were you running? I have previously seen (introduced) issues with flask-caching. That's why the version is currently pinned to 2.0.0. But I am considering upgrading to 2.0.1 for better compatibility with new Dash versions. As I understand, you haven't seen any issues with 2.0.1?

What callback manager are you using? There are a few known compatibility fixes needed for the CeleryManager:

https://www.dash-extensions.com/getting-started/enrich#a-celerymanager

DeKhaos commented 1 year ago

> What version were you running? I have previously seen (introduced) issues with flask-caching. That's why the version is currently pinned to 2.0.0. But I am considering upgrading to 2.0.1 for better compatibility with new Dash versions. As I understand, you haven't seen any issues with 2.0.1?

I currently see no errors with flask-caching 2.0.1; the only issue I'm having is with background callbacks.

> What callback manager are you using? There are a few known compatibility fixes needed for the CeleryManager:
>
> https://www.dash-extensions.com/getting-started/enrich#a-celerymanager

Since I don't use Redis, I am trying DiskcacheManager from dash as the callback manager, to dump the cache directly to disk. You can check out my example above.

DeKhaos commented 1 year ago

I still can't fix the error that occurs when setting background=True in @app.callback with DiskcacheManager as the callback manager, but I found a workaround.

I use a second callback with a dcc.Interval to poll the running status of the show_data_test function and display the spinner while that callback is still running. The result should be almost the same as using background=True, but it may get bulky if more Outputs need the running argument.

Fixed code:

import dash
from dash import ctx,dash_table
from dash_extensions.enrich import DashProxy, Output, Input, State,\
    ServersideOutput, html, dcc, ServersideOutputTransform,FileSystemStore
from dash import DiskcacheManager
import dash_bootstrap_components as dbc
import pandas as pd
import diskcache
import time

my_backend = FileSystemStore(cache_dir="./output_cache")

cache = diskcache.Cache("./cache")
background_callback_manager = DiskcacheManager(cache)

app = DashProxy(__name__,
                suppress_callback_exceptions=True,
                external_stylesheets=[dbc.themes.CYBORG,
                                      dbc.icons.BOOTSTRAP],
                background_callback_manager=background_callback_manager,
                transforms=[ServersideOutputTransform(backend=my_backend)])
server = app.server

app.layout = html.Div(
    [dcc.Store(id="stored_data"),
     dcc.Store(data=False,id="download_signal"),
     dcc.Store(data=False,id="show_signal"),
     dcc.Store(id="clicked_button"),
     html.Br(),
     dbc.Collapse([dbc.Row([dbc.Spinner(),
                            html.P("Processing...")]),
                    ],
                  id="collapse_1",
                  is_open=False),
      dbc.Button("Show",id="show_button"),
      dbc.Button("Clear",id="clear_button"),
      dbc.Button("Get data",id="test_button"),
      dcc.Interval(id="interval",disabled=True,interval=500),
      dash_table.DataTable(id='result')
      ])

@app.callback(ServersideOutput("stored_data","data",backend=my_backend),
              Output("download_signal","data"),
              Input("test_button","n_clicks"),
              State("stored_data","data"),
              memoize=True,
              prevent_initial_call=True)
def get_data(click,state):
    try:
        if state is None:
            df = pd.read_csv('https://covid19.who.int/WHO-COVID-19-global-data.csv')
            return df,True
        else:
            # data is already cached, nothing to do
            return dash.no_update,dash.no_update
    except Exception:
        # download failed; leave the stores untouched
        return dash.no_update,dash.no_update

@app.callback(Output("collapse_1","is_open"),
              Output("interval","disabled"),
              Output("clicked_button","data"),
              Input("show_button","n_clicks"),
              Input("clear_button","n_clicks"),
              Input("test_button","n_clicks"),
              Input("interval","n_intervals"),
              State("show_signal","data"),
              State("download_signal","data"),
              #use to check which previous button started the dcc.Interval
              State("clicked_button","data"),
              prevent_initial_call=True)
def process_status(button1,button2,button3,interval,
                   show_signal,cached_signal,clicked_button):
    if ctx.triggered_id =="show_button":
        return True,False,"show_button"
    elif ctx.triggered_id =="clear_button":
        return True,False,"clear_button"
    elif ctx.triggered_id =="test_button":
        return True,False,"test_button"
    else:
        if clicked_button=="show_button":
            if not show_signal:
                return dash.no_update,dash.no_update,dash.no_update
            else:
                return False,True,dash.no_update
        elif clicked_button=="clear_button":
            if show_signal:
                return dash.no_update,dash.no_update,dash.no_update
            else:
                return False,True,dash.no_update
        else:
            if not cached_signal:
                return dash.no_update,dash.no_update,dash.no_update
            else:
                return False,True,dash.no_update

@app.callback([Output("result","data"),
               Output("result","columns"),
               Output("show_signal","data")],
              [Input("show_button","n_clicks"),
               Input("clear_button","n_clicks"),
               State("stored_data","data")],
               # background=True,
               # running=[(Output("collapse_1","is_open"),True,False)],
              prevent_initial_call=True
              )
def show_data_test(button1,button2,df):
    if ctx.triggered_id=="show_button":
        data = df.to_dict('records')
        columns = [{"name": i, "id": i} for i in df.columns]
        return [data,columns,True]
    else:
        time.sleep(1)
        return [None,None,False]

if __name__ == '__main__':
    app.run(debug=True)
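
The branching in process_status is the bulky part of this workaround, so here is a condensed sketch of the same decision logic as plain functions that can be tested without Dash. NO_UPDATE is a stand-in for dash.no_update, is_finished is a helper name I made up, and triggered_id is passed in explicitly instead of being read from ctx. Treat it as an illustration of the polling idea, not a drop-in replacement.

```python
NO_UPDATE = object()  # stand-in for dash.no_update

BUTTONS = {"show_button", "clear_button", "test_button"}


def is_finished(clicked_button, show_signal, download_signal):
    """Decide whether the work started by the remembered button is done."""
    if clicked_button == "show_button":
        return bool(show_signal)       # table has been filled
    if clicked_button == "clear_button":
        return not show_signal         # table has been cleared
    return bool(download_signal)       # "Get data" has cached the dataframe


def process_status(triggered_id, clicked_button, show_signal, download_signal):
    """Return (collapse_is_open, interval_disabled, clicked_button)."""
    if triggered_id in BUTTONS:
        # A button click opens the spinner, starts polling, remembers the button.
        return True, False, triggered_id
    if is_finished(clicked_button, show_signal, download_signal):
        # An interval tick found the work done: hide the spinner, stop polling.
        return False, True, NO_UPDATE
    # Still running: leave everything as it is.
    return NO_UPDATE, NO_UPDATE, NO_UPDATE
```

In the app, the callback body would call this with ctx.triggered_id and return dash.no_update wherever NO_UPDATE appears.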