glouppe opened 11 years ago
Feature importance of our current stacker
Seems like one stacker feature is redundant (copy & paste error?)
Gilles, what's your current best 3-fold CV score for the stacker? I've:
0.977182795388 {'min_samples_split': 21, 'n_estimators': 500, 'learning_rate': 0.02, 'max_depth': 5, 'subsample': 1.0}
Yeah, thanks for catching the typo! One of them should be gbrt-500-old-%d-%s.txt.
Hmm, my best was around 0.9768 if I remember correctly. I did not save the settings though :s
I'll run a new grid around your setting including some stat features.
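For reference, a grid search like the one described could be sketched with scikit-learn's GridSearchCV. This is only a minimal stand-in: the stacker is assumed to be a GradientBoostingClassifier, the input is synthetic instead of the real out-of-fold base-model predictions, and the grid is much smaller than the real one for speed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the stacking matrix (13 base-model columns).
X, y = make_classification(n_samples=300, n_features=13, random_state=0)

# Reduced grid; the thread's best so far was learning_rate=0.02,
# min_samples_split=21, max_depth=5 with n_estimators=500.
param_grid = {
    "learning_rate": [0.01, 0.02],
    "min_samples_split": [21, 25],
    "max_depth": [3, 5],
}
search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid, scoring="roc_auc", cv=3,
)
search.fit(X, y)
print(search.best_score_, search.best_params_)
```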
When I include gbrt-500-old, my scores drop from 0.9771 to 0.9768.
Here's the partial dependence of the magic feature:
Grid search is over. The best stacker found is {'max_features': 12, 'min_samples_split': 25, 'learning_rate': 0.023333333333333331, 'n_estimators': 500, 'max_depth': 5}, which scores 0.9775 in validation and 0.97769 on the leaderboard. It includes stat features, but they do not seem to improve things (our current best score on the LB does not include them...). I'll remove them and try to add the my-mfcc features.
ah - forgot that those will be automatically synced :-)
Let me rename them first; they are mfcc features (plus a bunch more from yaafe) and a gbrt with the same config as our current best gbrt on the RANLP features. It does not seem to help; in fact my scores get worse... I think we should try to do feature selection on the stacking features.
Peter Prettenhofer
> I think we should try to do feature selection on the stacking features.
Yes I agree, just like it improved for the "base" models.
Current best stacker = {'max_features': 11, 'min_samples_split': 29, 'learning_rate': 0.012444444444444444, 'n_estimators': 1000, 'max_depth': 5}. It scored 0.977625019906 in (shuffled) 3-fold CV and 0.97785 on the LB (improving our best score). It includes all our stacks (my-mfcc and knn as well).
I computed the feature importances of the stacker, it outputs:
0.21619027 magic
0.05967051 adaboost-500
0.04328541 rf-1000
0.04017739 et-500
0.03638481 dbn-500-500-250
0.09748533 dbn-2000
0.05407607 my-mfcc
0.01044681 knn
0.04772712 dbn-1200-**
0.07849664 gbrt-500-**
0.05181329 gbrt-500-old-**
0.15296886 gbrt-2500-**
0.11127749 gbrt-2500-old-**
The importances seem quite different from your previous plot (?).
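A list like the one above can be produced from the fitted stacker's `feature_importances_` attribute. A small sketch, with a synthetic model and only a few of the column names as placeholders:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# A few of the stack column names, as placeholders; the real stacker has 13.
names = ["magic", "adaboost-500", "rf-1000", "et-500", "dbn-2000"]
X, y = make_classification(n_samples=200, n_features=len(names),
                           random_state=0)
model = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

# GBRT importances are normalized to sum to 1; print them sorted.
for imp, name in sorted(zip(model.feature_importances_, names), reverse=True):
    print(f"{imp:.8f} {name}")
```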
After optimizing subsample, the current best stacker is {'subsample': 0.9375, 'learning_rate': 0.012444444444444444, 'n_estimators': 1000, 'min_samples_split': 29, 'max_features': 11, 'max_depth': 5}.
It scores 0.977693013266 in validation and 0.97799 on LB when repeated 50 times.
Still 0.00259 to go!
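Reading "repeated 50 times" as averaging the predicted probabilities of several seeds of the same stacker (an assumption; the thread doesn't spell it out), the idea can be sketched like this, with synthetic data and only 5 repetitions to keep it quick:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Same stacker config, different seeds; average the probabilities.
probas = [
    GradientBoostingClassifier(n_estimators=50, subsample=0.9375,
                               random_state=s)
    .fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    for s in range(5)
]
avg = np.mean(probas, axis=0)
print(avg.shape)
```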
I tried oversampling the minority class for the stacker; it did not improve performance (same as for the original GBRT model).
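For completeness, minority-class oversampling of the kind mentioned could be done by resampling minority rows with replacement until the classes balance; a toy sketch with scikit-learn's `resample` (the exact recipe used in the experiment is not stated, so this is an assumed implementation):

```python
import numpy as np
from sklearn.utils import resample

# Toy imbalanced data: 8 negatives, 2 positives.
X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)

# Draw minority rows with replacement until both classes have 8 samples.
minority = y == 1
X_extra, y_extra = resample(X[minority], y[minority],
                            n_samples=(y == 0).sum() - minority.sum(),
                            random_state=0)
X_bal = np.vstack([X, X_extra])
y_bal = np.concatenate([y, y_extra])
print(np.bincount(y_bal))
```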
I will try RankSVM as a stacker, but I don't think it will beat our GBRT stacker.
I also tried another DBN, starting from your best configuration. The strange thing is: when I only use the 16Hz features (mfcc, ceps, specs) I get miserable results...
Adding the magic feature definitely helped. GBRT alone with the magic feature scored 0.97394 on the LB (a single instance, ['2200', 'gbrt', '2500', '8', '0.125', '0.01', '26', '1.0', 'gbrt-2500-xx']). In comparison, the same setting without the magic feature yields 0.97258 on the LB.
I am waiting for the other instances to finish (20 are running). Then I'll add them into the stack.