mmschlk / TreeSHAP-IQ

Supplement Material for research project

Implementing TreeSHAP-IQ Method #1

Open Shuva105 opened 4 days ago

Shuva105 commented 4 days ago

I am trying to use TreeShapIQ with an XGBoost model, but I keep encountering a ValueError indicating that the tree model must be either a dictionary or a TreeModel object. Despite passing both a booster object and transformed JSON-like structures, the error persists.

  1. My goal: to use the method on my own IDS dataset, which is tabular.
  2. The model: XGBoost
  3. Here is the full traceback of the error:
    Traceback (most recent call last):
      File "/home/vm2/Documents/TreeSHAP-IQ/validate_xgboost_classification.py", line 62, in <module>
        explainer = TreeShapIQ(booster)
                    ^^^^^^^^^^^^^^^^^^^
      File "/home/vm2/Documents/TreeSHAP-IQ/tree_shap_iq/base.py", line 33, in __init__
        raise ValueError("The tree model must be either a dictionary or a TreeModel object.")
    ValueError: The tree model must be either a dictionary or a TreeModel object.
  4. Steps to Reproduce: I trained an XGBoost model and obtained the booster using model.get_booster().

I attempted to pass the booster directly to TreeShapIQ as shown in the code below:

booster = model.get_booster()

# Pass the booster object to TreeShapIQ
explainer = TreeShapIQ(booster)

However, this results in the ValueError mentioned above.

I also tried dumping the booster into JSON format, transforming it to the expected dictionary structure, and passing that to TreeShapIQ, but it didn’t resolve the issue either.
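One possible reason the JSON route still fails: `Booster.get_dump(dump_format="json")` returns a list of JSON strings (one per tree), not a dictionary. The sketch below mimics the kind of type check the error message implies. The `TreeModel` class and `check_tree_model` function here are hypothetical stand-ins inferred from the traceback, not the repository's actual code.

```python
import json


class TreeModel:
    """Hypothetical stand-in for the repository's TreeModel class."""


def check_tree_model(tree_model):
    # Inferred from the error message; not copied from tree_shap_iq/base.py.
    if not isinstance(tree_model, (dict, TreeModel)):
        raise ValueError(
            "The tree model must be either a dictionary or a TreeModel object."
        )
    return tree_model


def accepted(candidate):
    """Return True if the candidate passes the type check, False otherwise."""
    try:
        check_tree_model(candidate)
        return True
    except ValueError:
        return False


# Same shape as the output of Booster.get_dump(dump_format="json"):
# a list with one JSON string per tree (toy single-leaf trees here).
dumped = ['{"nodeid": 0, "leaf": 0.5}', '{"nodeid": 0, "leaf": -0.5}']

results = {
    "list_of_strings": accepted(dumped),
    "single_string": accepted(dumped[0]),
    "parsed_dict": accepted(json.loads(dumped[0])),
}
print(results)
```

Under this assumed check, only the parsed dictionary gets through. Passing the type check does not mean the dictionary has the keys the algorithm needs, so a parsed dump could still fail later.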

  5. Additional Information: When I printed the dumped JSON model, here’s the structure I received for one of the trees:
{
    "nodeid": 0,
    "depth": 0,
    "split": "f71",
    "split_condition": -9.99999997e-07,
    "yes": 1,
    "no": 2,
    "missing": 1,
    "children": [
        {
            "nodeid": 1,
            "leaf": -0.599961817
        },
        {
            "nodeid": 2,
            "leaf": 0.599984646
        }
    ]
}

I tried converting this structure into a format that TreeShapIQ might accept (by adding children_left and children_right keys), but this did not fix the issue.
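For reference, flattening the nested dump into parallel arrays can be sketched as follows. The output key names (`children_left`, `children_right`, `children_missing`, `features`, `thresholds`, `values`) and the `-1`/`-2` sentinels follow the scikit-learn convention and are assumptions, not a confirmed input format for TreeShapIQ. The helper `flatten_xgb_tree` is hypothetical.

```python
import json


def flatten_xgb_tree(root):
    """Flatten one nested XGBoost JSON tree into arrays indexed by nodeid."""
    # Collect every node first so the arrays can be sized by the largest nodeid.
    nodes = {}
    stack = [root]
    while stack:
        node = stack.pop()
        nodes[node["nodeid"]] = node
        stack.extend(node.get("children", []))

    n = max(nodes) + 1
    flat = {
        "children_left": [-1] * n,     # -1 marks "no child" (leaf)
        "children_right": [-1] * n,
        "children_missing": [-1] * n,  # branch taken when the feature is missing
        "features": [-2] * n,          # -2 marks "no split feature" (leaf)
        "thresholds": [float("nan")] * n,
        "values": [0.0] * n,
    }
    for nid, node in nodes.items():
        if "leaf" in node:
            flat["values"][nid] = node["leaf"]
        else:
            # XGBoost follows "yes" when feature value < split_condition.
            flat["children_left"][nid] = node["yes"]
            flat["children_right"][nid] = node["no"]
            flat["children_missing"][nid] = node["missing"]
            # Assumes default feature names of the form "f<index>", e.g. "f71".
            flat["features"][nid] = int(node["split"][1:])
            flat["thresholds"][nid] = node["split_condition"]
    return flat


tree = json.loads("""
{"nodeid": 0, "depth": 0, "split": "f71", "split_condition": -9.99999997e-07,
 "yes": 1, "no": 2, "missing": 1,
 "children": [{"nodeid": 1, "leaf": -0.599961817},
              {"nodeid": 2, "leaf": 0.599984646}]}
""")
flat = flatten_xgb_tree(tree)
print(flat["children_left"])   # [1, -1, -1]
print(flat["children_right"])  # [2, -1, -1]
print(flat["features"])        # [71, -2, -2]
```

Note that XGBoost splits on a strict `<` comparison, while scikit-learn trees use `<=`, so thresholds may need care if the consumer assumes the scikit-learn rule. A dictionary like this would still need to match whatever keys the target class actually reads.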

  6. Request: Could you please clarify the following:

What exact structure does the TreeShapIQ class expect as input?

Should the XGBoost booster object work directly, or do I need to transform it into a specific format before passing it?

If transformations are necessary, could you provide guidance or examples of the correct format?

Any help or advice would be greatly appreciated!

treeSHAP-IQ errors

mmschlk commented 4 days ago

Hi Shuva105, thank you for your interest in TreeSHAP-IQ. To use it with your XGBoost model, you first need to pass the model to the correct converter and then pass the resulting TreeModel object to the TreeSHAP-IQ algorithm.

Having said that, this repository is not necessarily fit for conducting experiments outside the scope of the original research paper. This is why we launched the shapiq project with a follow-up paper here. shapiq is a software package that brings together a bunch of Shapley value and Shapley interaction algorithms, including TreeSHAP-IQ in its TreeExplainer explainer class. If you are interested in TreeSHAP-IQ, you should definitely check out the implementation there. If you encounter problems with shapiq, please don't hesitate to open an issue there! :)