Results from Optimizing LightGBM Model Discussed:
To optimize, we must minimize test error; training error on its own is not the concern:
What are accepted values of MSE for a black-box model? This is difficult to judge; if this is the best that can be achieved across different combinations of hyperparameters, it is acceptable.
Can max_depth and num_leaves both be used? The first objective is the best (lowest) test MSE. Among the candidates with the lowest test MSE, choose the one with the largest train MSE, especially when test MSE changes little between them. Optimize max_depth first, then vary the number of estimators (hyperparameter tuning), and check the combined effect on train and test MSE.
The overfitted model looks good; num_leaves is probably more influential than max_bin, so max_bin can be left in the model as is.
Underfitted model: relax the number of estimators further, because the current MSE is not characteristic of an underfit model.
The partial dependence (PDE) is more volatile, so expect more noise for the overfit model; for the underfit model the relationship will not look meaningful.
Results from Optimizing EBM Model Discussed:
Train and test MSE were not sensitive, so learning rate and num_leaves were included to increase sensitivity; use a more reasonable number of bins instead, perhaps 20. For interaction terms, increase the parameter to 100-200 or more to add flexibility. The value 6 was tried; beyond that, any increase affected the train but not the test MSE. 6 is too small for interaction terms, and 10,000 is too large for max_bins.
For the overfitting and underfitting demonstrations too, use max_bins of 500-1000 and increase interaction terms to around 1,000.
Train MSE should reach about 0.01.
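The EBM settings above can be collected as keyword dictionaries for interpret's ExplainableBoostingRegressor (the parameter names max_bins and interactions come from that package's API; treat this as a sketch of the values discussed, not a validated configuration):

```python
# EBM hyperparameter settings from the notes, as keyword dicts for
# interpret's ExplainableBoostingRegressor. Only values stated in the
# notes are filled in; everything else stays at its default.
ebm_baseline = {
    "max_bins": 20,       # "a more reasonable number of bins, perhaps 20"
    "interactions": 200,  # increase interaction terms to 100-200 or more
}
# For the deliberate overfitting/underfitting demonstrations:
ebm_stress = {
    "max_bins": 1000,      # use max_bins of 500-1000
    "interactions": 1000,  # increase interaction terms to around 1,000
}
target_train_mse = 0.01  # train MSE should reach about 0.01
```

These dicts would be passed as ExplainableBoostingRegressor(**ebm_baseline), assuming the interpret package is installed.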
Some Notes:
Find the best hyperparameters
Conduct more hyperparameter tuning
Send the code over to Nengfeng and Rosh, as they would like to test it out and conduct hyperparameter tuning.