Closed hkoppen closed 3 years ago
Yes, each composite (the base for a tab) simply appends a number to self.name, e.g.
class ImportancesComposite(ExplainerComponent):
    def __init__(self, explainer, title="Feature Importances", name=None,
                 hide_importances=False,
                 hide_selector=True, **kwargs):
        """Overview tab of feature importances

        Can show both permutation importances and mean absolute shap values.

        Args:
            explainer (Explainer): explainer object constructed with either
                ClassifierExplainer() or RegressionExplainer()
            title (str, optional): Title of tab or page. Defaults to
                "Feature Importances".
            name (str, optional): unique name to add to Component elements.
                If None then random uuid is generated to make sure
                it's unique. Defaults to None.
            hide_importances (bool, optional): hide the ImportancesComponent
            hide_selector (bool, optional): hide the post label selector.
                Defaults to True.
        """
        super().__init__(explainer, title, name)

        self.importances = ImportancesComponent(
            explainer, name=self.name+"0", hide_selector=hide_selector, **kwargs)

    def layout(self):
        return html.Div([
            dbc.Row([
                make_hideable(
                    dbc.Col([
                        self.importances.layout(),
                    ]), hide=self.hide_importances),
            ], style=dict(margin=25))
        ])
Then ExplainerDashboard instantiates all the tabs using ExplainerTabsLayout, which has the line
self.tabs = [instantiate_component(tab, explainer, name=str(i+1), **kwargs) for i, tab in enumerate(tabs)]
So each tab gets the name "1", "2", "3", etc. And then each subcomponent gets the name "11", "12", etc.
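The naming scheme can be illustrated with a small stand-alone sketch. The `Component`, `Composite` and `build_tabs` names below are hypothetical stand-ins, not explainerdashboard classes; they only mimic how an index-based name is handed to each tab and then extended with a suffix for its children. Following the `self.name+"0"` suffix in the composite quoted above, children are numbered from 0 here, which is an assumption about the general pattern.

```python
class Component:
    """Hypothetical stand-in for a dashboard component with a .name."""

    def __init__(self, name):
        self.name = name


class Composite(Component):
    """Hypothetical composite: children extend the parent's name with an index."""

    def __init__(self, name, n_children=2):
        super().__init__(name)
        self.children = [Component(self.name + str(i)) for i in range(n_children)]


def build_tabs(n_tabs):
    # Mirrors name=str(i+1): the first tab is "1", the second "2", and so on.
    return [Composite(name=str(i + 1)) for i in range(n_tabs)]


tabs = build_tabs(2)
names = [(tab.name, [child.name for child in tab.children]) for tab in tabs]
print(names)  # [('1', ['10', '11']), ('2', ['20', '21'])]
```

Because every name is derived from a position rather than from a random source, building the same tabs in another process yields the same names.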
Are you defining custom components? Or defining them before you add them to ExplainerDashboard?
E.g.
tab = ImportancesComposite(explainer)
ExplainerDashboard(explainer, tab).run()
Would result in a random uuid name for tab
Hi. We tried running the dashboard on a single container, and that works. But running on multiple containers / a swarm causes the problem.
So in a swarm it starts generating uuid names but in a single container it doesn't?
That seems super strange... Again, the only thing I can think of is old versions of explainerdashboard in a cached docker layer.
I'm gonna see if I can build some diagnostic functionality that makes it easier to see the whole component tree, including .name properties, and would also give a warning when it detects any uuid .name ...
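A rough idea of what such a diagnostic could look like, as a stand-alone sketch. The `Node` class and both helper functions are hypothetical and not part of explainerdashboard; the only detail taken from the library is that generated names start with the prefix "uuid".

```python
import warnings


class Node:
    """Hypothetical component node: a name plus child components."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def iter_names(node, depth=0):
    """Yield (depth, name) for every node in the tree, depth first."""
    yield depth, node.name
    for child in node.children:
        yield from iter_names(child, depth + 1)


def find_random_names(root, prefix="uuid"):
    """Return names that look randomly generated and warn about each one."""
    flagged = [name for _, name in iter_names(root) if name.startswith(prefix)]
    for name in flagged:
        warnings.warn(f"component name {name!r} looks randomly generated")
    return flagged


# Example tree: one deterministic child and one subtree with generated names.
tree = Node("1", [Node("10"), Node("uuidAb3xZ", [Node("uuidAb3xZ0")])])
for depth, name in iter_names(tree):
    print("  " * depth + name)  # indented dump of the component tree
flagged = find_random_names(tree)
```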
It also does generate uuid names with a single container. But it seems that callback names are being mixed up when running on more than one container.
ah, okay, that at least is an easier to understand problem. So the example I gave you didn't give uuid names right?
Is there any code you can share on how you generate the dashboard? Because you have to be doing something custom otherwise it would just work out of the box.
dashboard.yaml
dashboard:
  explainerfile: data/processed/explainer.joblib
  params:
    title: Fastholdelses model
    hide_header: false
    hide_shapsummary: false
    header_hide_title: false
    header_hide_selector: false
    block_selector_callbacks: false
    pos_label: null
    fluid: true
    mode: dash
    width: 1000
    height: 800
    external_stylesheets: null
    # server: true
    # url_base_pathname: null
    responsive: true
    logins: null
    port: 8050
    tabs:
      #- importances
      #- model_summary
      - contributions
      - whatif
      - shap_dependence
      #- shap_interaction
      #- decision_trees
__init__.py
import logging
from pathlib import Path

from flask import Flask
from for_p_afgang_dashboard.extensions import setup_extensions
from explainerdashboard import ClassifierExplainer, ExplainerDashboard
import yaml

# Metadata for the package
# fmt: off
__version__ = "0.1.0"
__url__ = "https://lspgitlab01.alm.brand.dk/advanced-analytics/for_p_afgang_dashboard"
__description__ = "explainer dashboard for for_p_afgang model performance"
__author__ = "Niels Møller-Hansen"
__email__ = "abnimo@almbrand.dk"
# fmt: on

logger = logging.getLogger("api_logger")
file_path = Path(__file__)


def create_app(config):
    logger.info("Starting app...")
    logger.debug(f"Using config {config}")
    app = Flask("for_p_afgang_dashboard")
    app.config.from_object(config)

    @app.route("/health")
    def healthcheck():
        return "Healthy", 200

    setup_extensions(app)

    dashboard_yaml_path = file_path.parent.joinpath("dashboard.yaml")
    explainerfile = str(file_path.parent.joinpath("data").joinpath("explainer.joblib"))
    logger.info(explainerfile)

    config = yaml.safe_load(open(dashboard_yaml_path, "r"))
    params = config["dashboard"]["params"]

    explainer = ClassifierExplainer.from_file(explainerfile)
    print("X:", len(explainer.X))
    logger.info(f"Explainer contains {len(explainer.X)} samples")

    dashboard = ExplainerDashboard(
        explainer, server=app, url_base_pathname="/", **params
    )
    print(list(dashboard.app.callback_map.values()))

    @app.route("/")
    def return_dashboard():
        return dashboard.app.index()

    logger.info("Explainer dashboard loaded")
    return app
Ah, I think I got it!
In the yaml I see:
tabs:
  #- importances
  #- model_summary
  - contributions
  - whatif
  - shap_dependence
  #- shap_interaction
  #- decision_trees
So that equates to ExplainerDashboard(explainer, ["contributions", "whatif", "shap_dependence"]).
The string tab indicators get converted by
def _convert_str_tabs(self, component):
    if isinstance(component, str):
        if component == 'importances':
            return ImportancesTab
        elif component == 'model_summary':
            return ModelSummaryTab
        elif component == 'contributions':
            return ContributionsTab
        elif component == 'whatif':
            return WhatIfTab
        elif component == 'shap_dependence':
            return ShapDependenceTab
        elif component == 'shap_interaction':
            return ShapInteractionsTab
        elif component == 'decision_trees':
            return DecisionTreesTab
    return component
These ImportancesTab, ModelSummaryTab, etc. have actually been deprecated in favor of ImportancesComposite, etc. They are only there for backward compatibility reasons, but I had not adjusted this helper method. So I will fix this in the next release, but in the meantime, I think it should work if you change dashboard.yaml to:
dashboard:
  explainerfile: data/processed/explainer.joblib
  params:
    title: Fastholdelses model
    hide_header: false
    hide_shapsummary: false
    header_hide_title: false
    header_hide_selector: false
    block_selector_callbacks: false
    pos_label: null
    fluid: true
    mode: dash
    width: 1000
    height: 800
    external_stylesheets: null
    # server: true
    # url_base_pathname: null
    responsive: true
    logins: null
    port: 8050
    importances: false
    model_summary: false
    shap_interaction: false
    decision_trees: false
So this is the equivalent of passing booleans to switch off tabs: ExplainerDashboard(explainer, importances=False, model_summary=False, shap_interaction=False, decision_trees=False)
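To make the mapping between the two styles concrete, here is a minimal sketch of a converter from a list of selected tabs to the boolean switches. `tabs_to_flags` is a hypothetical helper written for illustration, not a function in explainerdashboard; the tab names are the ones handled by the helper method quoted earlier in this thread.

```python
# Tab names as listed in the _convert_str_tabs method quoted earlier.
ALL_TABS = [
    "importances", "model_summary", "contributions", "whatif",
    "shap_dependence", "shap_interaction", "decision_trees",
]


def tabs_to_flags(selected):
    """Hypothetical helper: return {tab: False} for every tab not selected."""
    unknown = set(selected) - set(ALL_TABS)
    if unknown:
        raise ValueError(f"unknown tab names: {sorted(unknown)}")
    return {tab: False for tab in ALL_TABS if tab not in selected}


flags = tabs_to_flags(["contributions", "whatif", "shap_dependence"])
print(flags)
# {'importances': False, 'model_summary': False,
#  'shap_interaction': False, 'decision_trees': False}
```

The resulting dict holds exactly the four keys added to the params block of the yaml above.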
Just released https://github.com/oegedijk/explainerdashboard/releases/tag/v0.2.20 which should fix this issue...
I think you can also simplify the loading of the dashboard:
def create_app(config):
    logger.info("Starting app...")
    logger.debug(f"Using config {config}")
    app = Flask("for_p_afgang_dashboard")
    app.config.from_object(config)

    @app.route("/health")
    def healthcheck():
        return "Healthy", 200

    setup_extensions(app)

    explainerfile = str(file_path.parent.joinpath("data").joinpath("explainer.joblib"))
    dashboard_yaml_path = file_path.parent.joinpath("dashboard.yaml")
    logger.info(explainerfile)

    dashboard = ExplainerDashboard.from_config(
        explainerfile, dashboard_yaml_path, server=app, url_base_pathname="/")
    logger.info(f"Explainer contains {len(dashboard.explainer)} samples")
    print(list(dashboard.app.callback_map.values()))

    @app.route("/")
    def return_dashboard():
        return dashboard.app.index()

    logger.info("Explainer dashboard loaded")
    return app
I updated to the latest version and also altered the .yaml file. That leaves me with this error (not having touched anything else):
Traceback (most recent call last):
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/flask/cli.py", line 184, in find_app_by_string
app = call_factory(script_info, attr, args)
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/flask/cli.py", line 115, in call_factory
return app_factory(*arguments)
File "/home/niels/projektmappe/for_p_afgang_dashboard/app/for_p_afgang_dashboard/__init__.py", line 43, in create_app
explainer, server=app, url_base_pathname="/", **params
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/explainerdashboard/dashboards.py", line 465, in __init__
fluid=fluid))
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/explainerdashboard/dashboards.py", line 88, in __init__
self.tabs = [instantiate_component(tab, explainer, name=str(i+1), **kwargs) for i, tab in enumerate(tabs)]
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/explainerdashboard/dashboards.py", line 88, in <listcomp>
self.tabs = [instantiate_component(tab, explainer, name=str(i+1), **kwargs) for i, tab in enumerate(tabs)]
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/explainerdashboard/dashboard_methods.py", line 431, in instantiate_component
component = component(explainer, name=name, **kwargs)
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/explainerdashboard/dashboard_components/composites.py", line 271, in __init__
hide_selector=hide_selector, **kwargs)
File "/home/niels/.pyenv/versions/3.7.9/envs/for_p_afgang_dashboard/lib/python3.7/site-packages/explainerdashboard/dashboard_components/shap_components.py", line 1027, in __init__
if not self.explainer.onehot_cols:
AttributeError: 'XGBClassifierExplainer' object has no attribute 'onehot_cols'
Ah, yeah, you have to rebuild the explainer with the new version: I made some breaking changes to how categorical features and one-hot encoded features are handled internally in order to support categorical features. (On the plus side: categorical features are supported now!)
Is there a reason why you are using UUIDs in the first place? I'm thinking you could just set a seed and do the randomization with numbers to get deterministic names.
E.g. line 177 in dashboard_methods.py:
if not hasattr(self, "name") or self.name is None: self.name = name or "uuid"+shortuuid.ShortUUID().random(length=5)
Original goal was to generate a unique name that is both short and url-friendly (planning on adding querystring support at some point). But I guess that could be done simpler and without the shortuuid dependency, e.g.: https://proinsias.github.io/til/Python-UUID-generate-random-but-reproducible-with-seed/
Got a code suggestion?
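One possible shape for this, as a sketch using only the standard library and following the seeded-UUID idea from the linked post. `make_name_factory` is a hypothetical name, and the "uuid" prefix and length of 5 are carried over from the line quoted above; this is not how explainerdashboard implements naming.

```python
import random
import uuid


def make_name_factory(seed, length=5):
    """Return a function producing reproducible short names for a given seed."""
    rng = random.Random(seed)  # private generator; global random state untouched

    def next_name():
        # Build a version-4 style UUID from seeded bits instead of os.urandom.
        value = uuid.UUID(int=rng.getrandbits(128), version=4)
        return "uuid" + value.hex[:length]  # hex digits are url-friendly

    return next_name


# Two independent "workers" seeded identically derive identical names.
worker_a = make_name_factory(seed=42)
worker_b = make_name_factory(seed=42)
names_a = [worker_a() for _ in range(3)]
names_b = [worker_b() for _ in range(3)]
print(names_a == names_b)  # True
```

The names only line up across processes if every process creates its components in the same order, since each call advances the generator by one step.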
Is it working now? Shall I close the issue?
This seems to be working now! Ran with several workers on gunicorn and also saved the callback id names, which all match.
Awesome!
I just tried to deploy my app on Heroku by directly importing the github project. However, I did not manage to "add the buildpack" correctly - I'm still generating a slug larger than 500MB. I did