industrial-data / predictor-explainer

AutoML and ExplainableAI for JMP (+Python)
BSD 3-Clause "New" or "Revised" License

Duplicate columns raise an error in LightGBM #1

Closed. franktoffel closed this issue 1 year ago

franktoffel commented 1 year ago

When differences are activated, two things happen:

1) JMP 16 creates new Diff columns. This behavior has been fixed in JMP 17.
2) The Python code changes the column names by removing characters.

If the user runs Predictor Explainer several times, two identical columns may end up as predictors in LightGBM.

This raises an error that needs to be corrected.

JMP 17 might fix this problem, but we can also delete duplicates in Python.
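A minimal sketch of what deleting duplicates on the Python side could look like, assuming the duplicates are detected by their cleaned column name. The `sanitize_name` helper is a hypothetical stand-in for whatever renaming the notebook actually performs, and the column names are made up for illustration. This is not the fix that shipped in the notebook.

```python
import re


def sanitize_name(name):
    """Hypothetical stand-in for the notebook's renaming step.

    Keeps only letters, digits and underscores, so names that differ
    only in special characters collapse to the same string.
    """
    return re.sub(r"[^0-9A-Za-z_]", "", name)


def unique_column_positions(names):
    """Return positions of columns whose cleaned name has not been seen yet."""
    seen = set()
    keep = []
    for pos, name in enumerate(names):
        clean = sanitize_name(name)
        if clean not in seen:
            seen.add(clean)
            keep.append(pos)
    return keep


# Made-up column names: the first two collide after cleaning.
columns = ["Diff[Temp]", "Diff(Temp)", "Pressure"]
keep = unique_column_positions(columns)
clean_columns = [sanitize_name(columns[i]) for i in keep]
```

With a pandas DataFrame `df`, the kept positions could be applied as `df.iloc[:, keep]` before fitting. Note that this compares names only. It does not check whether the column values are identical.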

As a quick workaround, first run Predictor Explainer with differences activated but Python deactivated. This already gives you the differences worth including in SHAP. Then recreate those differences manually as new columns in JMP and run Predictor Explainer again, this time with Python activated and differences deactivated.

franktoffel commented 1 year ago

The notebook from version 22.12.11 fixes the issue.