String categorical values from Lightgbm

Guidosalimbeni commented 2 years ago

Hello, great tool and library! Wonder if you can point me in the right direction to solve an issue?

With LightGBM we can list the categorical columns, and it is fine if their values are strings.
Once in ExplainerDashboard, the code crashes because it cannot handle string values (at least that is my understanding).

What would be a solution in this case? Thanks.

oegedijk commented 2 years ago

Hi @Guidosalimbeni,

If the model is able to handle categorical values, then ExplainerDashboard should handle them as well. It does at least for CatBoost, so I assume it should work for LightGBM as well.

Do you have some runnable example code that shows the crash or wrong output?

oegedijk commented 2 years ago

Hi @Guidosalimbeni,

Would you be able to provide any examples of where this broke?

Dekermanjian commented 1 year ago

Hello, I am running into this issue. Here is the error message that I get: TypeError: '<' not supported between instances of 'float' and 'str'

And here is a reproducible example:

from lightgbm import LGBMClassifier
from sklearn.model_selection import train_test_split

from explainerdashboard import ClassifierExplainer, ExplainerDashboard
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv")
df[df.select_dtypes("O").columns] = df.select_dtypes("O").astype("category")
df = df[["Survived", "Age", "Sex", "Embarked"]]
y = df.pop("Survived")
X = df
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LGBMClassifier()
model.fit(X_train, y_train, feature_name='auto', categorical_feature='auto')
explainer = ClassifierExplainer(
                model, X_test, y_test,
                labels=['Not survived', 'Survived'])

db = ExplainerDashboard(explainer, title="Titanic Explainer",
                    whatif=False,
                    shap_interaction=False,
                    decision_trees=False)
db.run(port=8051)
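
A message of this form is what Python raises when a sort compares a float with a string. One plausible trigger here, which is an assumption and not something confirmed in this thread, is a categorical column that contains missing values (NaN is a float) next to string labels. The list below is a made-up stand-in for such a column, not data taken from the example above:

```python
import math

# Hypothetical column values: string labels plus a missing value (NaN is a float).
values = ["S", float("nan"), "C", "Q"]

try:
    sorted(values)  # comparing a float with a str is not defined
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Dropping the NaN entries first makes the sort well defined again.
clean = sorted(
    v for v in values if not (isinstance(v, float) and math.isnan(v))
)
print(clean)
```

If this is the cause, checking the categorical columns of X_test for missing values would be a reasonable first diagnostic step.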

ghost commented 1 year ago

I am having the same problem, is there a solution?

Guidosalimbeni commented 1 year ago

Yes great, I am still having the same issue.

galievaz commented 1 year ago

Hi,

I have the same issue due to string values in the data. I'd like to create a dashboard for a fitted TabularPredictor model from the AutoGluon library. Is there any solution or update related to this issue?

fjpa121197 commented 1 year ago

I think this issue is more related to the data and how LightGBM is coded.

I stumbled upon this error, but it came from LightGBM, not explainerdashboard.

Try the following:

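# Replace the characters [, ] and < in the column names
df.columns = df.columns.str.translate(str.maketrans({"[": "{", "]": "}", "<": "^"}))
# List any column names that still contain [, ] or < (should come back empty)
df.columns[df.columns.str.contains(r"[\[\]<]")]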

This replaces the special characters in the column names and is meant to target the error: TypeError: '<' not supported between instances of 'float' and 'str'

Do let me know if that solves your issue.
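
If renaming the columns does not help, another thing to try is making sure the categorical columns contain no missing values, so that no float NaN ends up mixed with string labels. The following is only a sketch of that idea, not a confirmed fix for this issue. The helper name fill_missing_categories and the "missing" placeholder are made up for illustration; the pandas calls used are select_dtypes, cat.add_categories and fillna:

```python
import pandas as pd


def fill_missing_categories(df, placeholder="missing"):
    """Return a copy of df where NaN in category columns becomes a label.

    Hypothetical helper: the name and the placeholder are illustrative only.
    """
    out = df.copy()
    for col in out.select_dtypes("category").columns:
        if out[col].isna().any():
            # The placeholder must be a known category before fillna can use it.
            if placeholder not in out[col].cat.categories:
                out[col] = out[col].cat.add_categories([placeholder])
            out[col] = out[col].fillna(placeholder)
    return out


# Small made-up frame that mimics the columns used in the example above.
demo = pd.DataFrame(
    {
        "Age": [22.0, None, 35.0],
        "Embarked": pd.Categorical(["S", None, "C"]),
    }
)
filled = fill_missing_categories(demo)
print(filled["Embarked"].tolist())
```

The helper would need to be applied to the data before the train/test split, so that the model and the explainer see the same categories. Numeric columns such as Age are left untouched.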

oegedijk / explainerdashboard

String categorical values from Lightgbm #198