Settings
Training data should be the same for both techniques: since GridSearch uses cross-validation with stratified sampling, we need to modify our DE training data sets to match.
GridSearch:
Training data: release1+release2 (2-fold cross-validation with stratified sampling to split the data)
randomly pick 3 values for each parameter within the given range (e.g., number_estimators: [68, 90, 133])
uniformly pick 3 values for each parameter within the given range (e.g., number_estimators: [50, 100, 150]); a sketch of this setup follows below
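A minimal sketch of this GridSearch setup, assuming scikit-learn (where the estimator-count parameter is called n_estimators rather than number_estimators); the ranges and the second tuned parameter (min_samples_leaf) are illustrative assumptions, not the exact grid used in the experiments:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Stand-in for release1+release2 (the real experiment uses the combined training releases).
X_train, y_train = make_classification(n_samples=400, weights=[0.8, 0.2], random_state=1)

rng = np.random.default_rng(1)

# "uniformly pick": 3 evenly spaced values per parameter within its range.
uniform_grid = {
    "n_estimators": [50, 100, 150],
    "min_samples_leaf": [1, 10, 20],
}
# "randomly pick": 3 values per parameter sampled at random from the same range.
random_grid = {
    "n_estimators": sorted(int(v) for v in rng.integers(50, 151, size=3)),
    "min_samples_leaf": sorted(int(v) for v in rng.integers(1, 21, size=3)),
}

# 2-fold cross-validation with stratified sampling, as described above.
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=1)
search = GridSearchCV(RandomForestClassifier(random_state=1),
                      param_grid=uniform_grid,  # or random_grid
                      scoring="roc_auc", cv=cv)
search.fit(X_train, y_train)
print(search.best_params_)
```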
DE:
Training data: release1+release2 (stratified sampling to split the data into new_train and new_tuning; see the sketch below)
Testing data: release3
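A minimal sketch of the DE data preparation, assuming scikit-learn's train_test_split for the stratified split; the 50/50 split ratio is an assumption, not stated above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Stand-in for release1+release2; release3 is held out as the test set.
X, y = make_classification(n_samples=400, weights=[0.8, 0.2], random_state=1)

# Stratified split into new_train (used to build candidate models) and
# new_tuning (used by DE to score candidate parameter settings).
X_new_train, X_new_tuning, y_new_train, y_new_tuning = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=1)
```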
Results
Scores
In total we have 17 experiments; the red numbers show the total number of highest scores achieved by DE and by GridSearch.
Running time:
Generally, DE is much faster than GridSearch, about 5 times faster given the current grid selection policy.
DE usually runs 60~100 evaluations, but GridSearch runs 2x3^4 = 162 evaluations for CART and 2x3^5 = 486 for RF (the factor of 2 comes from the 2-fold cross-validation).
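The evaluation counts above follow from folds x (values per parameter)^(number of parameters); a small sketch, assuming 4 tuned parameters for CART and 5 for RF as implied by the exponents:

```python
def grid_evaluations(n_folds: int, n_values: int, n_params: int) -> int:
    """One model fit per grid point per cross-validation fold."""
    return n_folds * n_values ** n_params

print(grid_evaluations(2, 3, 4))  # CART: 2 * 3^4 = 162
print(grid_evaluations(2, 3, 5))  # RF:   2 * 3^5 = 486
```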
More details
Tuning goal: auc, GridSearch: randomly pick
Tuning goal: auc, GridSearch: uniformly pick
Tuning goal: precision, GridSearch: randomly pick
Tuning goal: precision, GridSearch: uniformly pick
Tuning goal: f1, GridSearch: randomly pick
Tuning goal: f1, GridSearch: uniformly pick
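These six GridSearch configurations are simply the cross product of the three tuning goals and the two grid selection policies; a minimal sketch enumerating them:

```python
from itertools import product

tuning_goals = ["auc", "precision", "f1"]
grid_policies = ["randomly pick", "uniformly pick"]

# 3 tuning goals x 2 grid selection policies = 6 GridSearch configurations,
# each compared against DE tuned towards the same goal.
for goal, policy in product(tuning_goals, grid_policies):
    print(f"Tuning goal: {goal}, GridSearch: {policy}")
```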
Conclusion