Closed ahbon123 closed 7 years ago
That is right. So you put the model you want to tune on top and then a dummy (it does not matter what you put) as the level 2 model. Then you run that command and look at the cross-validation score. Then you terminate the process and change one parameter, for example from colsample_bylevel:1.0 to colsample_bylevel:0.9, re-run the same process (i.e. the same command) and take note of the performance. You keep doing this until there is no improvement, then you switch to another parameter, for example subsample.
This process is briefly explained in the last 2 minutes of this video if it helps...
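The manual loop described above can be sketched in Python. This is only an illustration under stated assumptions: `set_param`, `get_param`, `tune_one_at_a_time` and `toy_score` are hypothetical helpers, not part of StackNet. `toy_score` stands in for the real step of running the StackNet command and reading the cross-validation score yourself. Lower scores are treated as better, as with MAE.

```python
from typing import Callable, Dict, List, Tuple


def set_param(model_line: str, name: str, value: str) -> str:
    """Return a copy of a 'Model key:value ...' line with one value replaced."""
    tokens = model_line.split()
    found = False
    # tokens[0] is the model name; the rest are key:value pairs.
    for i, tok in enumerate(tokens[1:], start=1):
        key, sep, _ = tok.partition(":")  # split on the first ':' only
        if sep and key == name:
            tokens[i] = f"{name}:{value}"
            found = True
    if not found:
        raise KeyError(f"parameter {name!r} not in model line")
    return " ".join(tokens)


def get_param(model_line: str, name: str) -> str:
    """Read the current value of one key:value pair."""
    for tok in model_line.split()[1:]:
        key, sep, val = tok.partition(":")
        if sep and key == name:
            return val
    raise KeyError(name)


def tune_one_at_a_time(
    model_line: str,
    grid: Dict[str, List[str]],
    score: Callable[[str], float],
) -> Tuple[str, float]:
    """Change one parameter at a time; move on once the score stops improving."""
    best_line, best_score = model_line, score(model_line)
    for name, candidates in grid.items():
        for value in candidates:
            trial = set_param(best_line, name, value)
            trial_score = score(trial)
            if trial_score >= best_score:
                break  # no improvement: switch to the next parameter
            best_line, best_score = trial, trial_score
    return best_line, best_score


def toy_score(model_line: str) -> float:
    """Hypothetical stand-in for a cross-validation run (lower is better)."""
    col = float(get_param(model_line, "colsample_bylevel"))
    sub = float(get_param(model_line, "subsample"))
    return abs(col - 0.8) + abs(sub - 0.7)


start = "XgboostRegressor objective:reg:linear colsample_bylevel:1.0 subsample:0.8"
grid = {"colsample_bylevel": ["0.9", "0.8", "0.7"], "subsample": ["0.7", "0.6"]}
best_line, best_score = tune_one_at_a_time(start, grid, toy_score)
print(best_line, best_score)
```

In practice the `score` callable would be you: write the edited line to the params file, run the command, and note the cross-validation score it prints.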
Another question: when I tune one model, I should focus on the score of "Average of all folds model 0" (in the red circle), right? I tried tuning xgboost, changing n_estimator from 500 to 2000, and the score changed from 0.05328404590186987 to 0.05310419857868738. Does that mean performance got better? 2000 seems too large...
Hi @ahbon123, yes, performance gets better: for Mean Absolute Error, the closer to 0 the better. 2000 rounds for a booster is not that large. Good luck ;-)
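To put the size of that change in perspective, here is a small sketch that compares the two scores quoted above. The variable names are made up for illustration; the two numbers are the ones from the question. The relative gain works out to roughly a third of a percent, so it is a real but small improvement.

```python
# Scores quoted in the question (MAE: lower is better).
mae_500 = 0.05328404590186987   # 500 rounds
mae_2000 = 0.05310419857868738  # 2000 rounds

absolute_gain = mae_500 - mae_2000
relative_gain_pct = 100.0 * absolute_gain / mae_500

print(f"absolute improvement: {absolute_gain:.6f}")
print(f"relative improvement: {relative_gain_pct:.2f}%")
```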
@ahbon123 Yes. You can look at the value you highlighted. When I am in a hurry, I only look at the first 2 or so folds and then I terminate the process.
Thanks for your prompt reply. Did you try K-fold CV with K greater than 4? I tried 5 and the predictions seemed to get a worse score, so I stopped there and didn't try 10; maybe 4 is just right.
Normally 4 or 5 is what I use. I don't expect much difference between the two.
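One way to see why 4 and 5 folds behave similarly is to look at how much data each fold's model trains on. The sketch below is a generic K-fold split in plain Python, not StackNet's own splitting code (which may shuffle rows using the seed); `kfold_indices` and the row count of 1000 are assumptions for illustration.

```python
from typing import Iterator, List, Tuple


def kfold_indices(n_rows: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, valid_idx) for k contiguous, near-equal folds."""
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, valid
        start += size


n_rows = 1000  # hypothetical dataset size
train_fraction = {}
for k in (4, 5, 10):
    train, valid = next(kfold_indices(n_rows, k))
    train_fraction[k] = len(train) / n_rows
    print(f"folds={k}: each model trains on {train_fraction[k]:.0%} of the rows")
```

Going from 4 to 5 folds only moves the training share from three quarters to four fifths of the rows, at the cost of one extra model fit per level.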
Hi Marios,
Thanks for sharing StackNet, a great tool for stacking, but I'm still not clear on how to tune a single model. For example, if my parameter file is as follows:
XgboostRegressor booster:gblinear objective:reg:linear max_leaves:0 num_round:500 eta:0.1 threads:3 gamma:1 max_depth:4 colsample_bylevel:1.0 min_child_weight:4.0 max_delta_step:0.0 subsample:0.8 colsample_bytree:0.5 scale_posweight:1.0 alpha:10.0 lambda:1.0 seed:1 verbose:false
LightgbmRegressor verbose:false
What should I put in the command line? Is it the same as this one?
java -Xmx12048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=datasettwo model=model2 pred_file=pred2.csv train_file=dataset2_train.txt test_file=dataset2_test.txt test_target=false params=dataset2params.txt verbose=true threads=1 metric=mae stackdata=false seed=1 folds=4 bins=3
Thank you!