ldocao opened 5 years ago
A random forest without grid search gives 0.89.
Logistic regression at the patient level reaches 0.828.
XGBoost: 0.847.
With the new prediction_tiles.pkl, an L2 logistic regression reaches a patient-level AUC of 0.86 (mean over 4 runs of 5-fold cross-validation).
OK, I need to check two things:
1/ tile level
2/ patient level
Adding SMOTE improves the tile-level score:
In [6]: run predict_tile.py
Counter({0.0: 7533, 1.0: 566})
without smote AUC 0.9462551384559786
Counter({0.0: 7533, 1.0: 7533})
with smote AUC 0.964535242655584
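The thread does not show how SMOTE is applied (presumably imblearn's SMOTE in predict_tile.py). As a minimal numpy-only sketch of the core idea, new minority samples are synthesized by interpolating between a minority point and one of its k nearest minority neighbours; imblearn does this with considerably more care:

```python
import numpy as np

def smote_oversample(X, y, minority=1, k=5, seed=0):
    """Minimal SMOTE sketch: synthesize minority samples by
    interpolating between a minority point and one of its k
    nearest minority neighbours. Illustration only."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_needed = (y != minority).sum() - len(X_min)
    synthetic = []
    for _ in range(n_needed):
        i = rng.integers(len(X_min))
        # k nearest minority neighbours of X_min[i], excluding itself
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        j = rng.choice(np.argsort(d)[1:k + 1])
        gap = rng.random()  # random point on the segment between the two
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    X_new = np.vstack([X, synthetic])
    y_new = np.concatenate([y, np.full(n_needed, minority)])
    return X_new, y_new

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.array([0] * 90 + [1] * 10)
X_res, y_res = smote_oversample(X, y)
print((y_res == 0).sum(), (y_res == 1).sum())  # → 90 90
```

As in the Counter output above, the classes are exactly balanced after oversampling.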
This leads to an improved cross-validation score over patients:
In [5]: run predict_patient.py
[[0.94629156 0.8797954 0.79476584 0.83746556 0.93112948]
[0.87084399 0.83631714 0.89256198 0.95592287 0.83471074]
[0.82608696 0.90792839 0.87465565 0.93526171 0.83471074]
[0.79667519 0.84015345 0.83195592 0.9214876 0.93663912]]
0.8742679644621054 #mean
0.04998966900720987 #std
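The 4x5 matrix above is 4 runs of 5-fold CV. The real predict_patient.py is not shown; a hedged sketch of how such a matrix can be produced, with make_classification standing in for the patient features (class sizes echoing the Counter below) and a reshuffled StratifiedKFold per run:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# hypothetical stand-in for the patient-level features (223 patients, ~60/40 split)
X, y = make_classification(n_samples=223, weights=[0.6], random_state=0)

clf = LogisticRegression(penalty="l2", max_iter=1000)
scores = np.array([
    cross_val_score(clf, X, y, scoring="roc_auc",
                    cv=StratifiedKFold(5, shuffle=True, random_state=run))
    for run in range(4)  # 4 runs of 5-fold CV, as in the thread
])
print(scores.shape, round(scores.mean(), 3), round(scores.std(), 3))
```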
Applying SMOTE at the patient level also improves the score by almost 1 point:
In [2]: run predict_patient.py
Counter({0: 133, 1: 90})
without smote 0.8502673796791443
Counter({1: 133, 0: 133})
with smote 0.8582887700534759
Now, I'm considering XGBoost. After grid search, the AUC is better: 0.971.
GRID = {"eta": np.logspace(-3, -1, 5),
        "learning_rate": np.logspace(-4, -2, 5),
        "max_depth": [2, 3, 4],
        "reg_lambda": np.logspace(-4, 0, 5)}
Best parameters: {'eta': 0.001, 'learning_rate': 0.01, 'max_depth': 4, 'reg_lambda': 0.1}
Done within a night with 24 CPUs and 25 GB RAM. But we got values at the limit of the grid, so it might be possible to improve further. A more refined grid search gives:
{'eta': 0.0001, 'learning_rate': 0.07943282347242814, 'max_depth': 6, 'reg_lambda': 0.1798870915128788}
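The "at the limit" check above can be automated. A sketch of a hypothetical helper (not from the thread) that widens a log-spaced grid by one decade on each side whenever the best value sits on an edge, consistent with eta moving from 0.001 to 0.0001 in the refined search:

```python
import numpy as np

def refine_if_on_edge(grid, best, n=5):
    """If a best value sits at the edge of its searched range,
    return a refined log-spaced grid re-centred on it, one decade
    wider on each side; otherwise keep the original values."""
    refined = {}
    for name, values in grid.items():
        vals = sorted(values)
        b = best[name]
        if isinstance(b, float) and b in (vals[0], vals[-1]):
            refined[name] = np.logspace(np.log10(b) - 1,
                                        np.log10(b) + 1, n)
        else:
            refined[name] = values
    return refined

GRID = {"eta": np.logspace(-3, -1, 5).tolist(),
        "reg_lambda": np.logspace(-4, 0, 5).tolist()}
best = {"eta": 0.001, "reg_lambda": 0.1}
# eta hit the lower grid edge, so its range gets re-centred;
# reg_lambda was interior, so it is left alone
print(refine_if_on_edge(GRID, best))
```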
Cross-validation with the best XGBoost params:
In [2]: run predict_tile.py
[0.95007999 0.95490939 0.93873892 0.94702885 0.95349958] 0.9488513474400027 #mean 0.0057514908681526485 #std
To be compared with LR:
In [1]: run predict_tile.py
[0.95359738 0.95672229 0.94882167 0.9521964  0.972731  ] 0.9568137479978492 #mean 0.008353510458876405 #std
At the patient level with XGBoost, current best params:
best score 0.9205303453921555 {'booster': 'gbtree', 'learning_rate': 0.5994842503189409, 'max_depth': 1, 'n_estimators': 70, 'reg_alpha': 0.021544346900318846, 'reg_lambda': 0.0001}
best score 0.9257884235196339 {'booster': 'gbtree', 'colsample_bytree': 1.0, 'learning_rate': 0.5623413251903491, 'max_depth': 1, 'min_child_weight': 1, 'n_estimators': 90, 'reg_alpha': 1e-05, 'reg_lambda': 0.0031622776601683794, 'subsample': 0.8}
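The parameter values quoted above (e.g. learning_rate 0.5994842503189409) look like samples from a log-spaced search. The actual search code is not shown; a hedged sketch of a randomized search with a log-uniform prior, using LogisticRegression as a stand-in for XGBClassifier and synthetic data for the patient features:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# hypothetical stand-in data (223 patients, ~60/40 split, as above)
X, y = make_classification(n_samples=223, weights=[0.6], random_state=0)

search = RandomizedSearchCV(
    LogisticRegression(solver="liblinear"),
    {"C": loguniform(1e-4, 1e1),   # log-uniform prior over 5 decades
     "penalty": ["l1", "l2"]},
    n_iter=20, scoring="roc_auc", cv=5, random_state=0)
search.fit(X, y)
print(search.best_score_, search.best_params_)
```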
Cross-validation with SMOTE:
ipdb> auc
array([0.9168798 , 0.92710997, 0.85674931, 0.84435262, 0.93112948])
ipdb> auc.mean()
0.89524423495593
ipdb> auc.std()
0.03699483561764395
Without SMOTE:
ipdb> auc
array([0.92071611, 0.86061381, 0.79614325, 0.87878788, 0.90495868])
ipdb> auc.mean()
0.8722439460872383
ipdb> auc.std()
0.04333405941498701
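One caveat worth encoding when comparing SMOTE vs no-SMOTE inside cross-validation: resampling must be done on the training fold only, otherwise minority information leaks into the test fold and inflates the AUC. A sketch under that constraint, with random duplication standing in for SMOTE (imblearn's SMOTE would slot into the same place) and synthetic data for the patient features:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

# hypothetical stand-in data (223 patients, minority class = 1)
X, y = make_classification(n_samples=223, weights=[0.6], random_state=0)

aucs = []
for tr, te in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
    X_tr, y_tr = X[tr], y[tr]
    # oversample the minority class on the TRAINING fold only;
    # resampling before the split would leak test information
    rng = np.random.default_rng(0)
    minority = np.flatnonzero(y_tr == 1)
    extra = rng.choice(minority, size=(y_tr == 0).sum() - len(minority))
    X_bal = np.vstack([X_tr, X_tr[extra]])
    y_bal = np.concatenate([y_tr, y_tr[extra]])
    clf = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
    aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
print(np.round(np.mean(aucs), 3), np.round(np.std(aucs), 3))
```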
A simple logistic regression over the 2048 ResNet features gives a cross-validation score of 0.948. XGBoostClassifier gives slightly better results, 0.951, on cross-validation with default parameters.
The hyperparameter C (inverse of regularisation strength) for LR has a strong effect. A grid search finds:
In [5]: run predict_tile
best score 0.9691821414460687 {'C': 0.001, 'penalty': 'l2', 'solver': 'liblinear'}
best score 0.969676017384433 {'C': 0.000774263682681127, 'penalty': 'l2', 'solver': 'liblinear'}
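Since C acts multiplicatively, it is searched in log space. A minimal sketch of such a grid search (the real predict_tile.py is not shown; make_classification stands in for the 2048 ResNet tile features):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# hypothetical stand-in for the tile-level features
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

search = GridSearchCV(
    LogisticRegression(penalty="l2", solver="liblinear"),
    {"C": np.logspace(-4, 1, 6)},   # C has a strong effect -> search it in log space
    scoring="roc_auc", cv=5)
search.fit(X, y)
print(search.best_score_, search.best_params_)
```

A second, finer pass around the best value (as in the 0.000774... result above) follows the same pattern with a narrower logspace.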