AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.
https://mlbox.readthedocs.io/en/latest/

Testing with Predicting Blood Donation challenge #2

Closed brunosez closed 7 years ago

brunosez commented 7 years ago

Hi, I am running some tests with this challenge: https://www.drivendata.org/competitions/2/warm-up-predict-blood-donations/

With minimal understanding, I rank around 700 out of 2400! I still have some questions to write up: how to get feature importances, and how to set up stacking.

Regards, Bruno Seznec

AxeldeRomblay commented 7 years ago

Hello Bruno,

Good to see it works ;) I am working on the doc at the moment, so don't worry, it will be available soon.

params = {
    # Stacking step: base estimators and number of cross-validation folds
    "stck1__base_estimators": [
        Classifier(strategy="XGBoost", n_estimators=800, max_depth=4, subsample=0.8),
        Classifier(strategy="LightGBM", n_estimators=800, learning_rate=0.02, max_depth=4),
        Classifier(strategy="RandomForest", n_estimators=800, max_depth=12, max_features=0.77),
        Classifier(strategy="ExtraTrees", n_estimators=800, max_depth=11, max_features=0.85),
        Classifier(strategy="Linear", penalty="l2", C=1.8, random_state=29),
    ],
    "stck1__n_folds": 6,

    # Final estimator, trained on top of the stacking step
    "est__strategy": "XGBoost",
    "est__n_estimators": 200,
    "est__learning_rate": 0.05,
    "est__max_depth": 3,
}

Then:

prd = Predictor()
prd.fit_predict(params, df)

Let me know if it's ok... Axel
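For readers unfamiliar with stacking, the idea behind the `stck1__...` parameters above can be sketched with scikit-learn. This is only an illustration under stated assumptions, not MLBox's own implementation: it uses scikit-learn's `StackingClassifier`, synthetic data from `make_classification` in place of the challenge dataset, and `GradientBoostingClassifier` as a stand-in for the XGBoost final estimator. The base models produce cross-validated (out-of-fold) predictions over 6 folds, and the final estimator is trained on those predictions.

```python
# Illustrative sketch with scikit-learn, NOT MLBox's internal code.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data (assumption: stands in for the real dataset).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base estimators, loosely mirroring part of the list in the params above
# (fewer trees so the sketch runs quickly).
base_estimators = [
    ("rf", RandomForestClassifier(n_estimators=50, max_depth=12, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, max_depth=11, random_state=0)),
    ("lin", LogisticRegression(C=1.8, max_iter=1000)),
]

# cv=6 plays the role of "stck1__n_folds": each base model's predictions for a
# training row come from a fold in which that row was held out.
stack = StackingClassifier(
    estimators=base_estimators,
    final_estimator=GradientBoostingClassifier(
        n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0
    ),
    cv=6,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

# Predicted probability of the positive class for each held-out row.
proba = stack.predict_proba(X_test)[:, 1]
```

Using out-of-fold predictions matters because the final estimator would otherwise be trained on predictions the base models made for rows they had already seen, which tends to overfit.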

AxeldeRomblay commented 7 years ago

Here is the doc: https://github.com/AxeldeRomblay/MLBox/blob/master/docs/documentation.md

brunosez commented 7 years ago

Hi Axel

In fact, with the stacking, I got:

Traceback (most recent call last):
  File "mlbox1.py", line 74, in <module>
    prd.fit_predict(params,df)
  File "/home/consult/anaconda/lib/python2.7/site-packages/mlbox-0.2.2-py2.7.egg/mlbox/prediction/predictor.py", line 339, in fit_predict
    raise ValueError("Can not predict")
ValueError: Can not predict

mlbox1.py.txt

AxeldeRomblay commented 7 years ago

OK, so I've just fixed it and tested it on your challenge, and now it works... You can re-download/clone the master (or dev) branch. Thank you for reporting this error, and enjoy your competition!

brunosez commented 7 years ago

Hi Axel,

Thanks, it is corrected. With stacking, my rank improved by around 100! You can close the issue.

It is still not clear to me how to get feature importance.

Note: I will present some pointers to your framework at the Kaggle Paris Meetup on the 4th of July. Are you available on that date?

AxeldeRomblay commented 7 years ago

Great :)

brunosez commented 7 years ago

OK, it will be announced in the coming days. The location is Equancy, near the Trocadero. Thanks for participating. Bruno