Hello,
I'm running an XGBoost experiment where I build the model one tree at a time and evaluate it after every tree I add. However, I'm noticing that prediction takes longer as I add more trees, which makes sense, given that the ensemble is getting larger. Is there a way to call predict on the dataset without recomputing the outputs of trees I've already predicted with? E.g. if I have n trees in the forest already, have predicted with those n trees, and cached those predictions, is there a function I can write that takes the cached predictions and the new booster object and returns the predictions of the forest of n+1 trees?
Thanks in advance!