VinaTsai / xgboost_notebook


Parameter Tuning in XGBoost #4

Open VinaTsai opened 3 years ago

VinaTsai commented 3 years ago

Reference: https://www.kaggle.com/general/17120

http://www.analyticsvidhya.com/blog/2016/03/complete-guide-parameter-tuning-xgboost-with-codes-python/

VinaTsai commented 3 years ago

for more details about boosted trees: https://www.analyticsvidhya.com/blog/2016/02/complete-guide-parameter-tuning-gradient-boosting-gbm-python/

VinaTsai commented 3 years ago

Official guidance:

  1. XGBoost Parameters (official guide)
  2. XGBoost Demo Codes (xgboost GitHub repository)
  3. Python API Reference (official guide)
VinaTsai commented 3 years ago

Feature (variable) importance:

  1. XGBClassifier (sklearn wrapper): the feature_importances_ attribute
  2. xgb.train: the get_fscore() method on the returned Booster
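
A minimal sketch of both access patterns, assuming a synthetic dataset (the data and parameter values are placeholders, not from the notebook):

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# 1. sklearn wrapper: per-feature scores via the feature_importances_ attribute
clf = xgb.XGBClassifier(n_estimators=50)
clf.fit(X, y)
print(clf.feature_importances_)

# 2. native API: xgb.train returns a Booster whose get_fscore() reports
#    how often each feature is used to split (keys are f0, f1, ...)
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=50)
print(booster.get_fscore())
```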
VinaTsai commented 3 years ago

Parameter tuning order

General Approach for Parameter Tuning

We will use an approach similar to that of GBM here. The steps to be performed are:

  1. Choose a relatively high learning rate. Generally a learning rate of 0.1 works, but anywhere between 0.05 and 0.3 can be appropriate for different problems. Determine the optimum number of trees for this learning rate. XGBoost has a very useful function called cv which performs cross-validation at each boosting iteration and thus returns the optimum number of trees required (see the sketch after this list).
  2. Tune the tree-specific parameters (max_depth, min_child_weight, gamma, subsample, colsample_bytree) for the chosen learning rate and number of trees. Note that several different parameters can be chosen to define a tree; an example follows below.
  3. Tune the regularization parameters (lambda, alpha), which can help reduce model complexity and enhance performance.
  4. Lower the learning rate and decide the optimal parameters again, increasing the number of trees in proportion.
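
A hedged sketch of step 1, again on a synthetic dataset (eta, max_depth, and the eval metric are illustrative values to adapt per problem):

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
dtrain = xgb.DMatrix(X, label=y)

# Fix a relatively high learning rate and let cross-validation pick the tree
# count; early_stopping_rounds stops boosting once the CV metric plateaus.
params = {"objective": "binary:logistic", "eta": 0.1, "max_depth": 5,
          "eval_metric": "auc"}
cv_results = xgb.cv(params, dtrain, num_boost_round=1000, nfold=5,
                    early_stopping_rounds=50, seed=0)

# cv() returns one row per boosting round kept, so the row count is the
# optimum number of trees at this learning rate.
print(len(cv_results), cv_results["test-auc-mean"].iloc[-1])
```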

Let us look at a more detailed step-by-step approach.
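
As one possible concretization of steps 2 and 3 (not the notebook's exact code; the grids and scoring metric are illustrative), tuning can proceed with scikit-learn's GridSearchCV over the sklearn wrapper:

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Step 2: tree-specific parameters at a fixed learning rate / tree count.
search = GridSearchCV(
    xgb.XGBClassifier(learning_rate=0.1, n_estimators=100),
    {"max_depth": [3, 5, 7], "min_child_weight": [1, 3, 5]},
    scoring="roc_auc", cv=5)
search.fit(X, y)
best_tree_params = search.best_params_

# Step 3: regularization terms, keeping the step-2 winners fixed.
# (reg_alpha / reg_lambda are the sklearn-wrapper names for alpha / lambda.)
search = GridSearchCV(
    xgb.XGBClassifier(learning_rate=0.1, n_estimators=100, **best_tree_params),
    {"reg_alpha": [0, 0.01, 0.1, 1], "reg_lambda": [0.1, 1, 10]},
    scoring="roc_auc", cv=5)
search.fit(X, y)
print(best_tree_params, search.best_params_)
```

For step 4, the same fit would be repeated with a smaller learning_rate (e.g. 0.01) and a proportionally larger n_estimators.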