Closed nyllmak closed 1 year ago
If both policies build the same tree with a small dataset, this result is possible.
@trivialfis The actual dataset I had this issue on involves many models, each with a few hundred features and 15k unique training points. It's also quite responsive to feature and parameter changes. The maxes are set at 20 levels deep and 900 estimators (it uses all 900). That's why it's surprising to me that the numbers come out exactly the same for each model in the whole ensemble when changing grow_policy.
apologies, posted the wrong code.
Tried a few examples, and the predictions are indeed the same. But the explanation is simply that both policies build the same tree, only with the tree nodes reordered. Both policies grow trees until no new gain can be obtained by splitting a leaf, in which case the final trees are the same.
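The order-independence argument can be sketched in plain Python. Below is a toy greedy 1-D regression tree (not XGBoost's actual implementation): whether leaves are expanded shallowest-first ("depthwise") or best-gain-first ("lossguide"), exhaustive growth ends in the same partition of the data, because each node's best split depends only on that node's own points, not on the order other nodes were expanded.

```python
import heapq

def sse(ys):
    """Sum of squared errors around the mean."""
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def best_split(pts):
    """Best threshold on x by SSE gain, or None if no split improves."""
    xs = sorted({x for x, _ in pts})
    parent = sse([y for _, y in pts])
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in pts if x <= t]
        right = [y for x, y in pts if x > t]
        gain = parent - sse(left) - sse(right)
        if gain > 1e-12 and (best is None or gain > best[0]):
            best = (gain, t)
    return best

def grow(pts, policy):
    """Grow a greedy 1-D tree to exhaustion; return the final leaf partition."""
    leaves, frontier, counter = [], [], 0

    def push(node_pts, depth):
        nonlocal counter
        s = best_split(node_pts)
        if s is None:
            # no positive-gain split left: this node is a final leaf
            leaves.append(frozenset(x for x, _ in node_pts))
        else:
            # depthwise: expand shallowest node first; lossguide: best gain first
            key = depth if policy == "depthwise" else -s[0]
            heapq.heappush(frontier, (key, counter, node_pts, s[1], depth))
            counter += 1

    push(pts, 0)
    while frontier:
        _, _, node_pts, t, depth = heapq.heappop(frontier)
        push([p for p in node_pts if p[0] <= t], depth + 1)
        push([p for p in node_pts if p[0] > t], depth + 1)
    return set(leaves)

data = [(x, (x % 5) * 2.0 + (x % 3)) for x in range(32)]
print(grow(data, "depthwise") == grow(data, "lossguide"))  # True: same partition
```

Only the expansion order (the heap key) differs between the two policies; since growth runs until no leaf has positive gain, the final leaves are identical either way.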
Based on this, one can make different trees by restricting the number of leaves (which can be useful for regularization):
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
import xgboost as xgb
import numpy as np
import json
def main() -> None:
    X, y = make_regression(n_samples=2048 * 128, n_features=128, random_state=13)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=13)

    n_estimators = 128
    reg = xgb.XGBRegressor(
        n_estimators=n_estimators,
        tree_method="hist",
        grow_policy="depthwise",
        max_leaves=12,
    )
    reg.fit(X_train, y_train)
    predt_dw = reg.predict(X_test)

    reg = xgb.XGBRegressor(
        n_estimators=n_estimators,
        tree_method="hist",
        grow_policy="lossguide",
        max_leaves=12,
    )
    reg.fit(X_train, y_train)
    predt_lg = reg.predict(X_test)

    std = np.std(predt_dw - predt_lg)
    print(std, np.allclose(predt_dw, predt_lg))
    # 32.64627 False


if __name__ == "__main__":
    main()
@trivialfis Thanks for the pointers. I guess I was being a bit too generous with my allowances for how complex the trees are allowed to become.
I'll mark the issue as resolved and try to be more aggressive with my restrictions on tree growth, since it seems like user error on my end. I was expecting more divergence between the methods, but I suppose that is not the case.
Thanks for the help.
I've been messing around with different fits for my models. I tried to experiment with using lossguide for the split decisions instead of depthwise. I noticed that I'm getting the exact same predictions and fit for both options.
I'm using the scikit-learn interface, but according to the docs, whatever I am doing shouldn't be an issue. I verified that the 'grow_policy' parameter is saved within the model's JSON, which I presume means the parameter is being set and recognized (to some extent).
I've reproduced this on my end on XGBoost 1.6.1 and 1.7.5 using 'exact', 'approx', 'hist', and 'gpu_hist', getting the same result each time. I tried unbounding the leaves/depth but also got the same results. I would expect unbounding the depth to resolve the issue by giving lossguide room to grow, but it seems that isn't the case.
I am using a fixed seed for these tests, but I don't see why that should cause an issue, since the algorithms should still behave differently (correct me if I'm wrong on this).
Here is the minimum code that produced the issue for me: