plotly / dash

Data Apps & Dashboards for Python. No JavaScript Required.
https://plotly.com/dash
MIT License

cannot pickle 'SSLContext' with background callback #2827

Open fxstempfelals opened 5 months ago

fxstempfelals commented 5 months ago

Describe your context

Windows 11, Python 3.9

dash                      2.16.1
dash-bootstrap-components 1.4.1
dash-core-components      2.0.0
dash-html-components      2.0.0
dash-table                5.0.0

Occurs with Edge & Chrome.

Describe the bug

I use a background callback in a multi-page app, and it throws a "cannot pickle 'SSLContext'" error when called.

Here's how the callback is defined:

import dash
import diskcache

# Local disk cache used as the background callback manager
BACKGROUND_CALLBACK_MANAGER = dash.DiskcacheManager(diskcache.Cache("./cache"))

@dash.callback(
    [...]
    prevent_initial_call=True,
    background=True,
    manager=BACKGROUND_CALLBACK_MANAGER,
)

The callback involves an object that has a SQLAlchemy engine as an attribute. The connection is made through SSL, so I guess this is the object that fails to be pickled. However, I can serialize this object successfully with dill.dumps, so I'm not sure...

This may be related to https://github.com/uqfoundation/dill/issues/308. Until that issue is fixed, is there a workaround?
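For reference, here is a minimal sketch of the underlying failure using only the standard library. It assumes plain `pickle` rather than dill or multiprocess, which may behave differently, and `pickle_error` is a hypothetical helper, not part of Dash:

```python
import pickle
import ssl


def pickle_error(obj):
    """Return the pickling error message for obj, or None if it pickles."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# A bare SSL context, similar to what an SSL-enabled client holds internally.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

print(pickle_error({"timeout": 30}))  # plain data pickles without error
print(pickle_error(ctx))              # the SSL context does not
```

Any object that holds such a context, directly or through nested attributes, would likely fail the same way under standard pickle.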

Expected behavior

I expect the callback to run without error.

Screenshots

[Screenshot of the error stack trace, not included here]

T4rk1n commented 5 months ago

Are you using a SQLAlchemy engine defined globally elsewhere? multiprocess will try to transfer it when it is used, but it is not a valid object to pickle or send between processes. You may need to create the engine inside the callback for that to work, or switch to Celery.
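A rough sketch of the "create it inside the callback" idea, using a plain `ssl.SSLContext` as a stand-in for an engine. The names `SETTINGS`, `open_connection` and `background_job` are hypothetical, and this only illustrates the pattern with standard pickle, not Dash's actual transfer mechanism:

```python
import pickle
import ssl

# Plain, picklable settings are what cross the process boundary.
SETTINGS = {"check_hostname": True}


def open_connection(settings):
    """Hypothetical stand-in for creating an engine or client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = settings["check_hostname"]
    return ctx


def background_job(settings):
    """Hypothetical callback body: build the resource here, return plain data."""
    ctx = open_connection(settings)
    return {"check_hostname": ctx.check_hostname}


payload = pickle.dumps(SETTINGS)                # the input pickles fine
result = background_job(pickle.loads(payload))  # resource lives only in the job
pickle.dumps(result)                            # and so does the output
print(result)
```

The unpicklable object never exists outside the function, so only simple data has to be serialized.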

fxstempfelals commented 5 months ago

The engine is not a global object but a property of an object created in the callback. Roughly:

class DBHandle:
    def __init__(self, url, schema):
        connect_args = {...}
        self.engine = sqlalchemy.create_engine(
            url,
            echo=True,
            pool_pre_ping=True,
            connect_args=connect_args,
            execution_options={"schema_translate_map": {None: schema}},
        )

@dash.callback(
    [...]
    prevent_initial_call=True,
    background=True,
    manager=BACKGROUND_CALLBACK_MANAGER,
)
def _callback():
    dbh = DBHandle(url, schema)

I'll try Celery if I can't find a solution, but it would be nice, and would make development easier, if this could work with a local cache.

T4rk1n commented 5 months ago

Could you try running your app with:

app.run(debug=True, dev_tools_prune_errors=False)

With that option, the stack trace in the screenshot should include more information to help identify which variable is causing the issue.

fxstempfelals commented 5 months ago

Thanks for the suggestion. After some digging, I realized the SSLContext instance is a nested property of botocore.client.S3. This S3 client is not used in the background callback that is the source of the error, but elsewhere in the app. I guess dash serializes the whole execution environment?
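In case it helps others with the digging, here is a sketch of a helper that walks nested attributes and reports where pickling fails. It uses standard `pickle` as an approximation of what multiprocess does, and `find_unpicklable` and `Client` are hypothetical names (the latter is a stand-in, not the real botocore class):

```python
import pickle
import ssl


def find_unpicklable(obj, path="obj", max_depth=4, _seen=None):
    """Return attribute paths under obj whose values fail pickle.dumps."""
    _seen = set() if _seen is None else _seen
    if id(obj) in _seen or max_depth < 0:
        return []
    _seen.add(id(obj))
    try:
        pickle.dumps(obj)
        return []  # this whole subtree pickles fine
    except Exception:
        pass
    # Descend into dict values or instance attributes, if there are any.
    children = {}
    if isinstance(obj, dict):
        children = {f"{path}[{k!r}]": v for k, v in obj.items()}
    elif hasattr(obj, "__dict__"):
        children = {f"{path}.{k}": v for k, v in vars(obj).items()}
    found = []
    for child_path, child in children.items():
        found.extend(find_unpicklable(child, child_path, max_depth - 1, _seen))
    # If no child explains the failure, report this object itself.
    return found or [f"{path} ({type(obj).__name__})"]


class Client:
    """Hypothetical stand-in for a client holding a nested SSL context."""

    def __init__(self):
        self.name = "s3"
        self.http = {"ssl_context": ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)}


report = find_unpicklable(Client(), path="client")
print(report)
```

Running it over suspicious module-level objects (for example, `find_unpicklable(vars(some_module))`) may point at the offending attribute faster than reading the stack trace.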

T4rk1n commented 5 months ago

"I guess dash serializes the whole execution environment?"

Yes, we use multiprocess to spawn the background process and that will try to transfer all the global variables.

DJ2695 commented 3 months ago

Any updates, @fxstempfelals? I'm hitting exactly the error you describe when using a background callback. I'm using Python 3.11 and dash==2.17.0.

fxstempfelals commented 3 months ago

@DJ2695 Nothing new on my side. I have been thinking about redesigning my app but haven't had time to do so yet. I'd be glad to know how you end up addressing this issue!