kaz-Anova / StackNet

StackNet is a computational, scalable and analytical Meta modelling framework
MIT License

What is a typical way to use StackNet? #15

Closed: ajing closed this issue 7 years ago

ajing commented 7 years ago

I am a Kaggler trying to improve myself. :) Thanks for the great tool!

I saw you have cases combining two StackNets, so I am wondering what the typical strategy for using StackNet is. After some data cleaning and feature engineering, do you just run StackNet? How do you do model diagnosis with the results of StackNet? How do you gradually improve the final model?

Thanks, Jing

ajing commented 7 years ago

How do you deal with missing values in StackNet?

goldentom42 commented 7 years ago

Hi ajing, I don't think StackNet is expected to deal with missing values unless you exclusively use XGBoost or LightGBM, which handle them out of the box. You would need to treat them in your favorite ML language before using StackNet.
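As a rough illustration of that kind of preprocessing, here is a minimal pandas sketch; the file names, column types and fill strategy below are only assumptions, not anything StackNet prescribes:

```python
# Minimal sketch: remove NaNs before exporting the data for StackNet.
# "train.csv" and the fill strategy are illustrative assumptions.
import pandas as pd

train = pd.read_csv("train.csv")                      # assumed input file
num_cols = train.select_dtypes("number").columns

# fill numeric gaps with the column median, categoricals with a flag value
train[num_cols] = train[num_cols].fillna(train[num_cols].median())
train = train.fillna("missing")

# write a cleaned copy with no NaNs left for StackNet to consume
train.to_csv("train_clean.csv", index=False)
```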

For the typical strategy I assume you need to clean your data and perform feature extraction before using StackNet.

I am not sure what you mean by model diagnosis, but if there is some sort of plotting involved you will have to do it in your favorite language and use the output_name parameter so that StackNet saves the predictions of each model and fold.
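For instance, here is a minimal sketch of the kind of diagnosis that becomes possible once those predictions are saved; the file name, label column and row alignment are assumptions to adapt to the files your run actually writes:

```python
# Sketch only: score the predictions StackNet saved via output_name.
# File name, label column and alignment with the training rows are assumed.
import pandas as pd
from sklearn.metrics import log_loss

y = pd.read_csv("train.csv")["target"]                 # assumed binary labels
oof = pd.read_csv("mypreds_model0.csv", header=None)   # placeholder file name

# score the saved predictions (or plot residuals / calibration curves instead)
print("logloss:", log_loss(y, oof.values.ravel()))
```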

Hope this helps, Olivier.

ajing commented 7 years ago

Thanks, Olivier!

kaz-Anova commented 7 years ago

Apologies for the late response @ajing.

If you use the sparse format, most StackNet-native algorithms treat values that are not given as zeros. As @goldentom42 pointed out, certain algorithms will treat these differently.
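As a small illustration, and assuming the sparse input follows the usual libsvm convention, entries you do not write are simply read back as zeros:

```python
# Sketch: write a tiny dataset in libsvm-style sparse format.
# Values and feature indices are made up for illustration.
from scipy.sparse import csr_matrix
from sklearn.datasets import dump_svmlight_file

# the zero cells are not stored, so they are not written to the file
# and most StackNet-native algorithms will treat them as zeros
X = csr_matrix([[1.5, 0.0, 2.0],
                [0.0, 3.0, 0.0]])
y = [0, 1]

dump_svmlight_file(X, y, "train.sparse", zero_based=True)
```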

For the time being, you are solely responsible for creating good features / feature engineering, but in the future there will be options inside StackNet too.

A good approach to building a strong StackNet is explained here, after the How to use StackNet section. You need to build it model by model: hyper-tune one model at a time and sequentially build your ensemble.

Consider having several different datasets and running different StackNets. For example, in one dataset you might one-hot encode your categorical variables, while in another you may just label encode them.

Then you can average all the results.
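A minimal sketch of that multi-view idea, with made-up file names: build a one-hot-encoded and a label-encoded copy of the data, run a StackNet on each, and average the resulting prediction files.

```python
# Sketch only: two encodings of the same data, then a simple blend.
# All file names are placeholders for whatever your runs actually produce.
import pandas as pd

train = pd.read_csv("train.csv")                        # assumed input file
cat_cols = list(train.select_dtypes("object").columns)

# view 1: one-hot encode the categoricals
pd.get_dummies(train, columns=cat_cols).to_csv("train_onehot.csv", index=False)

# view 2: label encode the same categoricals
labeled = train.copy()
for c in cat_cols:
    labeled[c] = labeled[c].astype("category").cat.codes
labeled.to_csv("train_label.csv", index=False)

# after running one StackNet per view, average the two prediction files
p1 = pd.read_csv("pred_onehot.csv", header=None)
p2 = pd.read_csv("pred_label.csv", header=None)
((p1 + p2) / 2).to_csv("pred_blend.csv", index=False, header=False)
```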

Hope it helps.

ajing commented 7 years ago

Hi Marios,

Thanks for answering my question in such detail!

You mentioned: "To tune a single model, one may choose an algorithm for the first layer and a dummy one for the second layer." How dumb should the second layer be?

Thanks, Jing

goldentom42 commented 7 years ago

Hello Jing,

The goal in this step is to tune a single model, so a linear regression / logistic regression would do. When I tune a model I usually set several versions of hyperparameters on the first level and a linear model or a random forest with small depth at the 2nd level. This way I can see how the model I want to tune behaves against the dataset.

Here is an example when searching for the best regularization parameter:

```
LSVR Type:Liblinear threads:1 usescale:True C:1.0 maxim_Iteration:1000 seed:1 verbose:false
LSVR Type:Liblinear threads:1 usescale:True C:0.1 maxim_Iteration:1000 seed:1 verbose:false
LSVR Type:Liblinear threads:1 usescale:True C:0.01 maxim_Iteration:1000 seed:1 verbose:false
LSVR Type:Liblinear threads:1 usescale:True C:0.001 maxim_Iteration:1000 seed:1 verbose:false

RandomForestRegressor bootsrap:false estimators:100 threads:3 offset:0.00001 max_depth:5 max_features:0.3 min_leaf:1.0 min_split:5.0 Objective:RMSE row_subsample:0.8 seed:1 verbose:false
```

Let me know if this answers your question. Olivier

ajing commented 7 years ago

Thanks for the explanation, Olivier!

Could I interpret this as another way to do a grid search with cross-validation?

goldentom42 commented 7 years ago

Sure you can ;-)