ShifuML / shifu

An end-to-end machine learning and data mining framework on Hadoop
https://github.com/ShifuML/shifu/wiki
Apache License 2.0

Extend Grid Search to Stats/Norm #295

Open zhangpengshan opened 7 years ago

zhangpengshan commented 7 years ago

Settings such as the number of bins (20 or 30) and the missing-value processing type should affect model performance in training.

How can such parameters be added to grid search?

zhangpengshan commented 7 years ago

For example, grid search in H2O:

hyper_parameters = {'ntrees': [50],
                    'max_depth': list(range(5, 10, 2)),
                    'sample_rate': [x * 0.1 for x in range(8, 11, 2)],
                    'nbins': list(range(20, 50, 10)),
                    'min_rows': list(range(10, 40, 10)),
                    'histogram_type': ["UniformAdaptive", "QuantilesGlobal", "RoundRobin"]
                   }
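To make the request concrete, here is a minimal sketch of what a grid spanning stats/norm settings and training settings could expand to. It is plain Python and does not use Shifu's or H2O's API. The helper `expand_grid` and the keys `num_bins`, `missing_value_type`, and `max_depth` are hypothetical names chosen for illustration only.

```python
from itertools import product


def expand_grid(hyper_parameters):
    """Return one dict per combination of the listed candidate values."""
    names = list(hyper_parameters)
    value_lists = [hyper_parameters[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Hypothetical keys: "num_bins" and "missing_value_type" stand in for
# stats/norm settings, "max_depth" stands in for a training parameter.
grid = {
    "num_bins": [20, 30],
    "missing_value_type": ["mean", "zero"],
    "max_depth": [5, 7, 9],
}

combos = expand_grid(grid)
print(len(combos))  # number of candidate configurations
print(combos[0])    # first candidate configuration
```

One consequence worth noting: a candidate that changes a binning or missing-value setting would presumably need the stats and norm steps rerun before training, so such a grid is likely more expensive than one that varies training parameters only.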