Closed: fhossfel closed this 2 years ago
CartModel has a similar problem of showing only one decision but at least the mouseover is working.
display_tree({"margin": 10, "node_x_size": 160, "node_y_size": 28, "node_x_offset": 180, "node_y_offset": 33, "font_size": 10, "edge_rounding": 20, "node_padding": 2, "show_plot_bounding_box": false}, {"value": {"type": "PROBABILITY", "distribution": [0.007407407407407408, 0.7333333333333333, 0.24444444444444444, 0.014814814814814815], "num_examples": 135.0}, "condition": {"type": "CATEGORICAL_IS_IN", "attribute": "product_Group", "mask": ["DIY"]}, "children": [{"value": {"type": "PROBABILITY", "distribution": [0.0, 0.0, 1.0, 0.0], "num_examples": 33.0}}, {"value": {"type": "PROBABILITY", "distribution": [0.00980392156862745, 0.9705882352941176, 0.0, 0.0196078431372549], "num_examples": 102.0}}]}, "#tree_plot_e7010c332612435caae222c9a1230050")
Hi, I'm not sure I correctly understand the problem just yet, but let me summarize what I think is going on.
The GradientBoostedTrees model you're building has "Number of trees: 1200", i.e. it consists of 1200 trees. You inspect the first tree of this collection using tfdf.model_plotter.plot_model(model, tree_idx=0, max_depth=10) (this is what tree_idx does). This first tree alone might not be great, but that is expected: it is the combination of all 1200 trees that gives the great performance, not any single tree.
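For reference, a minimal runnable sketch of that workflow (the toy DataFrame, its column names, and the label values below are placeholders, not data from this issue):

```python
import pandas as pd
import tensorflow_decision_forests as tfdf

# Placeholder training data, not from the issue: two features and a 4-class label.
df = pd.DataFrame({
    "length": [1.0, 2.0, 3.0, 4.0] * 25,
    "weight": [10, 20, 30, 40] * 25,
    "label": ["a", "b", "c", "d"] * 25,
})
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="label")

# A GradientBoostedTreesModel is an ensemble of many small trees; the model
# in this issue reports "Number of trees: 1200".
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)

# plot_model() renders exactly one tree of that ensemble; tree_idx selects
# which one, so tree_idx=0 shows only the first tree, not the whole model.
html_tree_0 = tfdf.model_plotter.plot_model(model, tree_idx=0, max_depth=10)
html_tree_1 = tfdf.model_plotter.plot_model(model, tree_idx=1, max_depth=10)
```

plot_model() returns the plot as an HTML string, which is where the display_tree(...) line quoted above in this thread comes from.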
For CART, there is indeed just a single tree - but for most problems, CART models do not perform as well as Random Forests or Gradient Boosted Trees.
Ahh, okay. I did not read the manual properly and misinterpreted the tree_idx parameter.
I had noticed that the class distribution bars are missing for the gradient boosted trees. Is that intentional?
Can you please clarify what you mean by "missing class distribution bars"?
Closing this as stale
I am using tfdf 0.2.4 and can successfully train a model and plot it using the plot_model() function. For my current task I get a decision tree graph consisting of two decision nodes and three outputs. The key line in the generated HTML file seems to be this one:
If I use exactly the same code but replace the RandomForestModel with a GradientBoostedTreesModel, I only get one decision and two outputs. This can't be right since the inferences of the GradientBoostedTreesModel are perfect (100% correct, thanks!) and that requires taking more features into account than just the length of the classified object.
Additionally, the model summary is below (I have replaced some sensitive feature names). I am not really an expert, but if I read the summary correctly then the decision tree should have a depth of 5 and 26 to 27 nodes. On the other hand, I would have expected more nodes to show for the RandomForestModel, too. ¯\_(ツ)_/¯
If there is any additional information I can provide, please let me know.
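Not part of the original report, but one way to cross-check the depth and node counts from the summary without relying on the plot is the model inspector. This is a sketch assuming the trained model object `model` from the snippet further up; the pos_child/neg_child attribute names follow the tfdf py_tree API as I understand it and should be verified against your tfdf version:

```python
# Assumes `model` is an already trained tfdf model (RandomForestModel or
# GradientBoostedTreesModel); the tree_stats() helper is illustrative only.
inspector = model.make_inspector()
print(inspector.model_type(), inspector.num_trees())

def tree_stats(node, depth=0):
    """Return (max_depth, node_count) of the subtree rooted at `node`."""
    children = [c for c in (getattr(node, "pos_child", None),
                            getattr(node, "neg_child", None)) if c is not None]
    if not children:  # leaf node
        return depth, 1
    child_stats = [tree_stats(c, depth + 1) for c in children]
    return (max(d for d, _ in child_stats),
            1 + sum(n for _, n in child_stats))

# The same tree that plot_model(model, tree_idx=0) renders.
tree = inspector.extract_tree(tree_idx=0)
print(tree_stats(tree.root))
```

If the numbers printed here differ from what the plot shows, that would point at a plotting problem rather than at the model itself.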