kaz-Anova / StackNet

StackNet is a computational, scalable and analytical Meta modelling framework
MIT License

Question: How to tune level 2 or level 3 model? #51

Closed (iFe1er closed 6 years ago)

iFe1er commented 6 years ago

I know how to tune a model at level 1: just put that model on the first line of param.txt and rerun once the k-fold run has finished.

However, I don't know how to tune a model at level 2. Does anyone know?

kaz-Anova commented 6 years ago

@iFe1er Apologies for the late correspondence; I am extremely busy these days. I will respond to this and other queries in more detail in the following days.

In principle you have 2 options:

  1. Add a third layer (e.g. a simple linear model, like logistic regression for classification or linear regression for regression) to see the CV performance of the 2nd layer.

  2. Because (1) is time-consuming, use output_name in the main command. This will output the 1st-level predictions for train and test. You can use them to form a new dataset and rerun StackNet with the level-1 predictions as the main dataset (but you need to append the target variable yourself).
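As a rough illustration of option 2, the sketch below appends the target to a matrix of level-1 predictions so the result can be used as a new training set. It is only a minimal sketch: the function name `build_level2_dataset` is hypothetical, and placing the target in the first column is an assumption, so adjust it to whatever column layout your run expects.

```python
import numpy as np

def build_level2_dataset(level1_preds, target):
    """Return a matrix with the target in column 0 and level-1 predictions after it.

    Hypothetical helper: the column order is an assumption, not a StackNet requirement.
    """
    preds = np.asarray(level1_preds, dtype=float)
    y = np.asarray(target, dtype=float).reshape(-1, 1)
    if preds.ndim != 2:
        raise ValueError("level1_preds must be 2-D: rows are samples, columns are models")
    if preds.shape[0] != y.shape[0]:
        raise ValueError("predictions and target must have the same number of rows")
    # Stack the target next to the predictions, one row per training sample.
    return np.hstack([y, preds])

# Tiny in-memory demo: 3 samples, 2 level-1 models.
demo_preds = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.6]]
demo_target = [1, 0, 1]
level2_train = build_level2_dataset(demo_preds, demo_target)
```

In practice the predictions would be read from the file written by the run (for example with `np.loadtxt(path, delimiter=",")`, assuming a header-less CSV) and the combined matrix written back with `np.savetxt`. The test predictions need no target column appended.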

ahbon123 commented 6 years ago

Following on from this question, may I ask how I can choose or add more models (algorithms) for layer 2 or layer 3? Thanks.

kaz-Anova commented 6 years ago

Trial and error. There is no definite answer. However, consider this:

  1. The more models you put in the previous layer, the more you can put in the next layer. I follow a 7.5-->1 rule. So if you have 7 models in the first layer, you can add 1 in the second layer. If you have 15 in the first layer, you can add 2 in the second layer, and so on.

  2. Whatever models you choose for the subsequent layers (>1), you need to make them more modest. So if your best tree depth in the first layer was 10, it won't be surprising if you need a depth of 3 or 4 in the second layer. The same goes for other models. For linear models, you might need higher regularisation when you run in the second level (because the features are more correlated with each other). Generally, the 2nd-layer models must be simpler.
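The rule of thumb in point 1 can be written as a one-line calculation. This is only a sketch: the function name is hypothetical, and rounding to the nearest integer with a floor of one model is an assumption chosen to reproduce the two examples given (7 gives 1, 15 gives 2).

```python
def suggested_next_layer_size(n_prev_models, ratio=7.5):
    """Roughly one next-layer model per `ratio` models in the previous layer.

    Assumption: round to the nearest integer and never suggest fewer than 1.
    """
    if n_prev_models < 1:
        raise ValueError("the previous layer needs at least one model")
    return max(1, round(n_prev_models / ratio))

# The two cases mentioned in the thread.
sizes = {n: suggested_next_layer_size(n) for n in (7, 15)}
```

Treat the number as a starting point for trial and error rather than a limit. It says nothing about which models to pick, only how many are likely to be worth trying.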

ahbon123 commented 6 years ago

Thank you for your kind help. Actually, I have around 20 models in the first layer, which cover almost all algorithms, and only 1 in the second layer. When I tried taking one more from the first layer, the results didn’t improve. So I gave up and kept only one in the second layer.

kaz-Anova commented 6 years ago

Will close this for now.