choshin84 / learning_memo

personal learning memo

Train / Test data split: Ensure the test "Entity" is unseen in training #9

Open choshin84 opened 5 years ago

choshin84 commented 5 years ago

Tweet summary

For model robustness, make sure that an entity appearing in the test data is NOT also present in the training data set, e.g. the state feature in mobile carrier churn data, or the year feature in graduate test results. One good way to choose the entity is to pick a feature with low feature importance.
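
A minimal sketch of such an entity-based split, using only the Python standard library. The helper name `split_by_entity`, the `test_fraction` and `seed` parameters, and the sample churn rows are all hypothetical and only serve as illustration. scikit-learn's `GroupShuffleSplit` offers a comparable split if that library is available.

```python
import random


def split_by_entity(rows, entity_key, test_fraction=0.25, seed=0):
    """Split rows so that no entity value appears in both train and test.

    Hypothetical helper: whole entities are held out, not individual rows.
    """
    # Collect the distinct entity values and shuffle them reproducibly.
    entities = sorted({row[entity_key] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(entities)

    # Hold out at least one entity for the test set.
    n_test = max(1, round(len(entities) * test_fraction))
    test_entities = set(entities[:n_test])

    train = [r for r in rows if r[entity_key] not in test_entities]
    test = [r for r in rows if r[entity_key] in test_entities]
    return train, test


# Made-up churn records; "state" is the entity to keep unseen in training.
rows = [
    {"state": "CA", "minutes": 120, "churn": 0},
    {"state": "CA", "minutes": 300, "churn": 1},
    {"state": "NY", "minutes": 80, "churn": 0},
    {"state": "NY", "minutes": 210, "churn": 0},
    {"state": "TX", "minutes": 150, "churn": 1},
    {"state": "TX", "minutes": 95, "churn": 0},
    {"state": "WA", "minutes": 60, "churn": 1},
    {"state": "WA", "minutes": 400, "churn": 0},
]

train, test = split_by_entity(rows, "state")
train_states = {r["state"] for r in train}
test_states = {r["state"] for r in test}
```

Because the split is made over entity values rather than rows, `train_states` and `test_states` never overlap, which a plain random row split does not guarantee.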

choshin84 commented 5 years ago

If possible, use "explicit" cross-validation instead of random k-fold.
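
One possible reading of "explicit" cross-validation is that the folds are defined by the entity itself, so every entity lands in exactly one test fold. The sketch below assumes that reading; the helper `entity_folds` and the sample `years` list are hypothetical. scikit-learn's `GroupKFold` provides similar behavior if that library is available.

```python
def entity_folds(entities, n_folds):
    """Yield (train_idx, test_idx) pairs with each entity confined to one fold.

    Hypothetical helper: `entities` holds one entity label per sample.
    """
    unique = sorted(set(entities))
    if n_folds > len(unique):
        raise ValueError("n_folds cannot exceed the number of distinct entities")

    # Assign every distinct entity to a fold in round-robin order.
    fold_of = {entity: i % n_folds for i, entity in enumerate(unique)}

    for k in range(n_folds):
        test_idx = [i for i, e in enumerate(entities) if fold_of[e] == k]
        train_idx = [i for i, e in enumerate(entities) if fold_of[e] != k]
        yield train_idx, test_idx


# Made-up "year" labels for eight graduate test records.
years = [2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018]
folds = list(entity_folds(years, n_folds=4))
```

With random k-fold, records from the same year could end up on both sides of a fold; here each year is evaluated only by a model that never saw it.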