dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Add a warning when trying to plot_tree with a gblinear booster #10819

Closed thibaut-lemarchand closed 16 minutes ago

thibaut-lemarchand commented 1 week ago

When calling "xgboost.plot_tree" on a booster whose learner is gblinear, I get an error like the following from graphviz:

Error: <stdin>: syntax error in line 1 near 'bias'
Warning: syntax ambiguity - badly delimited number '2.0373e' in line 34 of <stdin> splits into two tokens
Warning: syntax ambiguity - badly delimited number '4.44954e' in line 38 of <stdin> splits into two tokens
Warning: syntax ambiguity - badly delimited number '-5.16851e' in line 94 of <stdin> splits into two tokens
Warning: syntax ambiguity - badly delimited number '-1.13883e' in line 138 of <stdin> splits into two tokens
Warning: syntax ambiguity - badly delimited number '-1.35551e' in line 147 of <stdin> splits into two tokens
Warning: syntax ambiguity - badly delimited number '9.73293e' in line 148 of <stdin> splits into two tokens
etc.

It is caused by the line Source(tree), where tree is the dump from the booster model. For a gblinear booster, that dump is not in a format graphviz can read.

We could add a check like the one in the trees_to_dataframe method to make the failure clearer:

booster = json.loads(self.save_config())["learner"]["gradient_booster"]["name"]
if booster not in {"gbtree", "dart"}:
    raise ValueError(f"This method is not defined for Booster type {booster}")
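
The same check could also be pulled into a small standalone helper so that plot_tree and similar methods share it. The following is only a rough sketch, not the library's implementation: the names TREE_BOOSTERS, booster_name and require_tree_booster are hypothetical, and the config strings are hand-written fragments that mimic only the part of the save_config() output used above.

```python
import json

# Booster types that produce tree dumps (taken from the snippet above).
TREE_BOOSTERS = {"gbtree", "dart"}


def booster_name(config_json: str) -> str:
    """Read the gradient booster name from a save_config()-style JSON string."""
    return json.loads(config_json)["learner"]["gradient_booster"]["name"]


def require_tree_booster(config_json: str, method: str = "plot_tree") -> None:
    """Raise a clear ValueError when the booster is not tree based."""
    name = booster_name(config_json)
    if name not in TREE_BOOSTERS:
        raise ValueError(f"{method} is not defined for Booster type {name}")


# Hand-written fragments for illustration only; a real config has many more fields.
tree_cfg = json.dumps({"learner": {"gradient_booster": {"name": "gbtree"}}})
linear_cfg = json.dumps({"learner": {"gradient_booster": {"name": "gblinear"}}})

require_tree_booster(tree_cfg)  # passes silently

try:
    require_tree_booster(linear_cfg)
except ValueError as err:
    message = str(err)  # names the offending booster type
```

With a real model, the argument would be the string returned by save_config(), and the check would run before the dump is handed to graphviz.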

trivialfis commented 21 hours ago

Apologies for the slow reply. I opened a PR to check the dump format.