glouppe opened 11 years ago
Feature importance of our current stacker
Seems like one stacker feature is redundant (copy & paste error?)
Gilles, what's your current best 3-fold CV score for the stacker? I've:
0.977182795388 {'min_samples_split': 21, 'n_estimators': 500, 'learning_rate': 0.02, 'max_depth': 5, 'subsample': 1.0}
Yeah, thanks for catching the typo! One of them should be gbrt-500-old-%d-%s.txt.
Hmm, my best was around 0.9768 if I remember correctly. I did not save the settings though :s
I'll run a new grid around your setting including some stat features.
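For reference, a grid search like the one described could be sketched with scikit-learn's GridSearchCV. This is only a minimal stand-in: the stacker is assumed to be a GradientBoostingClassifier, the input is synthetic instead of the real out-of-fold base-model predictions, and the grid is much smaller than the real one for speed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the stacking matrix (13 base-model columns).
X, y = make_classification(n_samples=300, n_features=13, random_state=0)

# Reduced grid; the thread's best so far was learning_rate=0.02,
# min_samples_split=21, max_depth=5 with n_estimators=500.
param_grid = {
    "learning_rate": [0.01, 0.02],
    "min_samples_split": [21, 25],
    "max_depth": [3, 5],
}
search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid, scoring="roc_auc", cv=3,
)
search.fit(X, y)
print(search.best_score_, search.best_params_)
```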
When I include gbrt-500-old, my scores drop from 0.9771 to 0.9768.
Here's the partial dependence of the magic feature:
Grid search is over. The best stacker found is {'max_features': 12, 'min_samples_split': 25, 'learning_rate': 0.023333333333333331, 'n_estimators': 500, 'max_depth': 5}, which scores 0.9775 in validation and 0.97769 on the leaderboard. It includes stat features, but they do not seem to improve things (our current best score on the LB does not include them...). I'll remove them and try to add the my-mfcc features.
ah - forgot that those will be automatically synced :-)
Let me rename them first; they are mfcc features (plus a bunch more from yaafe) and a gbrt with the same config as our current best gbrt on the RANLP features. It does not seem to help; in fact my scores get worse... I think we should try to do feature selection on the stacking features.
Peter Prettenhofer
> I think we should try to do feature selection on the stacking features.
Yes I agree, just like it improved for the "base" models.
Current best stacker = {'max_features': 11, 'min_samples_split': 29, 'learning_rate': 0.012444444444444444, 'n_estimators': 1000, 'max_depth': 5}. It scored 0.977625019906 in (shuffled) 3-fold CV and 0.97785 on the LB (improving our best score). It includes all our stacks (my-mfcc and knn as well).
I computed the feature importances of the stacker, it outputs:
0.21619027 magic
0.05967051 adaboost-500
0.04328541 rf-1000
0.04017739 et-500
0.03638481 dbn-500-500-250
0.09748533 dbn-2000
0.05407607 my-mfcc
0.01044681 knn
0.04772712 dbn-1200-**
0.07849664 gbrt-500-**
0.05181329 gbrt-500-old-**
0.15296886 gbrt-2500-**
0.11127749 gbrt-2500-old-**
The importances seem quite different from your previous plot (?).
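A list like the one above can be produced from the fitted stacker's `feature_importances_` attribute. A small sketch, with a synthetic model and only a few of the column names as placeholders:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# A few of the stack column names, as placeholders; the real stacker has 13.
names = ["magic", "adaboost-500", "rf-1000", "et-500", "dbn-2000"]
X, y = make_classification(n_samples=200, n_features=len(names),
                           random_state=0)
model = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

# GBRT importances are normalized to sum to 1; print them sorted.
for imp, name in sorted(zip(model.feature_importances_, names), reverse=True):
    print(f"{imp:.8f} {name}")
```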
After optimizing subsample, the current best stacker is {'subsample': 0.9375, 'learning_rate': 0.012444444444444444, 'n_estimators': 1000, 'min_samples_split': 29, 'max_features': 11, 'max_depth': 5}.
It scores 0.977693013266 in validation and 0.97799 on LB when repeated 50 times.
Still 0.00259 to go!
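Reading "repeated 50 times" as averaging the predicted probabilities of several seeds of the same stacker (an assumption; the thread doesn't spell it out), the idea can be sketched like this, with synthetic data and only 5 repetitions to keep it quick:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Same stacker config, different seeds; average the probabilities.
probas = [
    GradientBoostingClassifier(n_estimators=50, subsample=0.9375,
                               random_state=s)
    .fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    for s in range(5)
]
avg = np.mean(probas, axis=0)
print(avg.shape)
```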
I tried oversampling the minority class for the stacker; it did not improve performance (same as for the original GBRT model).
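For completeness, minority-class oversampling of the kind mentioned could be done by resampling minority rows with replacement until the classes balance; a toy sketch with scikit-learn's `resample` (the exact recipe used in the experiment is not stated, so this is an assumed implementation):

```python
import numpy as np
from sklearn.utils import resample

# Toy imbalanced data: 8 negatives, 2 positives.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)

# Draw minority rows with replacement until both classes have 8 samples.
minority = y == 1
X_extra, y_extra = resample(X[minority], y[minority],
                            n_samples=(y == 0).sum() - minority.sum(),
                            random_state=0)
X_bal = np.vstack([X, X_extra])
y_bal = np.concatenate([y, y_extra])
print(np.bincount(y_bal))
```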
I will try RankSVM as a stacker, but I don't think it will beat our GBRT stacker.
I also tried another DBN, starting from your best configuration. The strange thing is: when I only use the 16Hz features (mfcc, ceps, specs) I get miserable results...
Adding the magic feature definitely helped. GBRT alone with the magic feature scored 0.97394 on the LB (a single instance, ['2200', 'gbrt', '2500', '8', '0.125', '0.01', '26', '1.0', 'gbrt-2500-xx']). In comparison, the same setting without the magic feature yields 0.97258 on the LB.
I am waiting for the other instances to finish (20 are running). Then I'll add them into the stack.