parrt / dtreeviz

A python library for decision tree visualization and model interpretation.
MIT License

Color keyword argument - Value error #288

Open Seeth92 opened 1 year ago

Seeth92 commented 1 year ago

I am working on a binary classification problem using LightGBM. The model was trained on 42 features. The training dataset has shape (78000, 42), i.e. 78000 observations across 42 features. The test dataset has shape (25220, 42).

Using dtreeviz on my trained model:

viz = dtreeviz.model(gm, tree_index=0, X_train=X_train, y_train=Y_train, feature_names=features, target_name="A", class_names=["A", "B"])

When I execute viz.view(), I get the following error: ValueError: The 'color' keyword argument must have one color per dataset, but 1 datasets and 0 colors were provided

Any thoughts on how to go about this?

tlapusan commented 1 year ago

Is it similar to https://github.com/parrt/dtreeviz/issues/280?

baligoyem commented 1 year ago

I am facing the same error.

Here is the image which contains some details:

image

Seeth92 commented 1 year ago

@tlapusan Yes, the error description is the same as the one posted by @baligoyem

baligoyem commented 1 year ago

Have you ever encountered the AttributeError with the message 'Rectangle' object has no attribute 'patches'?

I am asking because I have hit both errors, seemingly at random, and I believe they are related.

tlapusan commented 1 year ago

Did you try the latest version of dtreeviz?

baligoyem commented 1 year ago

Yes, I did, but it did not resolve the issue.

Seeth92 commented 1 year ago

I am using colour-0.1.5 and dtreeviz-2.2.1. No luck at all.

tlapusan commented 1 year ago

Could you provide a Google Colab notebook, or any other shareable notebook, so I can reproduce your issue?

Seeth92 commented 1 year ago

@tlapusan Sorry for responding this late. Unfortunately, I can't share the notebook because the data and features used are confidential :(

Guido-Hwang commented 1 year ago

+1

Thegongyx commented 11 months ago

+1

leonswl commented 10 months ago

+1

andylokandy commented 8 months ago

+1

windyd commented 6 months ago

In my case (dtreeviz=2.2.2), it seems to be a precision problem in the get_thresholds method. The thresholds are rounded to two decimal places, so if you have small float thresholds, samples are assigned to the wrong paths. In some cases, some nodes may end up with no samples.

class ShadowLightGBMTree(ShadowDecTree):
    ...
    def get_thresholds(self) -> np.ndarray:
        if self.thresholds is not None:
            return self.thresholds

        node_thresholds = [-1] * self.nnodes()
        for i in range(self.nnodes()):
            if self.children_left[i] != -1 and self.children_right[i] != -1:
                if self.is_categorical_split(i):
                    node_thresholds[i] = list(map(int, self.tree_nodes[i]["threshold"].split("||")))
                else:
                    ###  thresholds are ROUNDED!
                    node_thresholds[i] = round(self.tree_nodes[i]["threshold"], 2)

        self.thresholds = np.array(node_thresholds, dtype=object)
        return self.thresholds

A node with no samples has no color mapped to it, which leads to this error.
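To illustrate the rounding effect described in this comment, here is a minimal sketch using only NumPy. The feature values and the threshold are made up for the example and do not come from any real model. It assumes the usual LightGBM convention that a sample goes left when its value is less than or equal to the threshold.

```python
import numpy as np

# Hypothetical feature values and a small split threshold (illustration only).
values = np.array([0.001, 0.002, 0.003, 0.004, 0.006, 0.008])
threshold = 0.0035

# Rounding to two decimals, as in the quoted get_thresholds, collapses it to 0.0.
rounded = round(threshold, 2)

# Samples routed to the left child with the exact vs. the rounded threshold.
left_exact = values[values <= threshold]
left_rounded = values[values <= rounded]

print("exact threshold  :", threshold, "-> left samples:", left_exact.size)
print("rounded threshold:", rounded, "-> left samples:", left_rounded.size)
```

With the exact threshold, three samples reach the left child. With the rounded threshold, none do, so a plot of that node would have no data.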

leocwolter commented 4 months ago

+1 on 2.2.2, any workarounds?
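This is not a workaround, but a rough way to check whether a given tree might be affected by the rounding discussed above. The sketch walks a nested tree dictionary and lists the numeric thresholds that change when rounded to two decimals. The helper name is hypothetical. The dictionary layout (threshold, left_child, right_child, leaf_value) is assumed to match what LightGBM's Booster.dump_model() returns under tree_info[i]["tree_structure"]; the tree below is hand-written rather than taken from a real model.

```python
def thresholds_changed_by_rounding(node, ndigits=2):
    """Collect numeric thresholds whose value changes when rounded to ndigits."""
    changed = []
    if "threshold" in node:  # internal node; leaves carry no threshold
        t = node["threshold"]
        # Categorical splits store a string such as "1||2"; skip those.
        if isinstance(t, (int, float)) and round(t, ndigits) != t:
            changed.append(t)
        changed += thresholds_changed_by_rounding(node["left_child"], ndigits)
        changed += thresholds_changed_by_rounding(node["right_child"], ndigits)
    return changed


# Hand-written stand-in for one dumped tree structure (assumed layout).
tree = {
    "threshold": 0.0035,
    "left_child": {"leaf_value": 0.1},
    "right_child": {
        "threshold": 1.5,
        "left_child": {"leaf_value": -0.2},
        "right_child": {"leaf_value": 0.3},
    },
}

affected = thresholds_changed_by_rounding(tree)
print(affected)
```

A non-empty result only means rounding alters some thresholds. Whether a node actually ends up empty still depends on the data.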

CaoHaiNam commented 3 months ago

I ran into this issue and resolved it. You should ensure that the data you pass to dtreeviz for visualization is the same data you used to train the LGBM model.
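One way to act on this advice is to compare the columns used for training with the columns passed to the visualization before calling dtreeviz. The sketch below is plain Python. The helper name and the example column names are hypothetical, and it only checks column names and order, not row contents.

```python
def describe_mismatch(trained_columns, viz_columns):
    """Return human-readable differences between two lists of column names."""
    problems = []
    missing = [c for c in trained_columns if c not in viz_columns]
    extra = [c for c in viz_columns if c not in trained_columns]
    if missing:
        problems.append(f"missing from visualization data: {missing}")
    if extra:
        problems.append(f"not seen during training: {extra}")
    if not missing and not extra and list(trained_columns) != list(viz_columns):
        problems.append("same columns, but in a different order")
    return problems


# Hypothetical column names, for illustration only.
trained = ["age", "income", "score"]
print(describe_mismatch(trained, ["age", "income", "score"]))  # no differences
print(describe_mismatch(trained, ["income", "age", "score"]))  # order differs
print(describe_mismatch(trained, ["age", "income", "zip"]))    # columns differ
```

In the setup from the original post, the two arguments would be something like the features list and list(X_train.columns), assuming X_train is a pandas DataFrame. It is also worth confirming that X_train and y_train have the same number of rows as the data the model was fitted on.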