A couple more notes from the Wells team meeting:
- How to optimize, and what does it mean to under-fit and over-fit?
- Hyper-parameter magnitudes will be different for each model
Agenda for tomorrow morning's meeting:
Boosting Models notes:
Tasks Completed:
3/21 Meeting with Mike
Model background
Changing the weighting term w_i in the equation:
With a small w_i, the corresponding function effectively drops out of the equation; increasing w_i increases its weight in the sum.
INC lambda → DEC w_i → DEC complexity
- lambda1 (L1 penalty): likely to drive w_i exactly to zero
- lambda2 (L2 penalty): likely to shrink w_i, but not to zero
- Search range: 10^(-6) to 10^(6)
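For reference, a plausible written-out form of the objective these bullets describe; the loss L, base functions f_i, and the exact penalty shapes are assumptions, since the notes only record the qualitative relations:

```latex
% Hypothetical penalized additive-boosting objective (a sketch, not from the notes).
% Increasing \lambda_1 tends to drive w_i exactly to zero (L1 penalty);
% increasing \lambda_2 shrinks w_i toward zero without zeroing it (L2 penalty).
\min_{w}\; \sum_{n} L\Big(y_n,\ \sum_{i} w_i f_i(x_n)\Big)
  \;+\; \lambda_1 \sum_{i} \lvert w_i \rvert
  \;+\; \lambda_2 \sum_{i} w_i^{2}
```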
Can set lambda to zero in LightGBM so it is comparable to EBM, then compare that baseline to the optimized LightGBM (sketched below).
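A minimal sketch of that comparison, assuming the sklearn-style LightGBM API and the interpret package's EBM; the toy dataset and the tuned penalty values are placeholders, not from the notes:

```python
import lightgbm as lgb
from interpret.glassbox import ExplainableBoostingRegressor
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Toy data stands in for the project dataset
X, y = make_regression(n_samples=2000, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    # L1/L2 penalties zeroed out, for an apples-to-apples comparison with EBM
    "lgbm_no_reg": lgb.LGBMRegressor(reg_alpha=0.0, reg_lambda=0.0),
    # Hypothetical tuned penalties -- the "optimized" LightGBM
    "lgbm_tuned": lgb.LGBMRegressor(reg_alpha=1e-2, reg_lambda=1e-1),
    "ebm": ExplainableBoostingRegressor(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, mean_squared_error(y_test, model.predict(X_test)))
```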
Max depth (max leaves): maximum depth of the decision tree in each function
- INC max depth → INC complexity
- Adjust linearly (1 to 10)
Number of base models (interactions): number of decision trees involved in the sum function
- INC base models → INC complexity
- Adjust on a logarithmic scale (multiply values by ten for each step); see the grid sketch below
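The two stepping rules above could be written down as parameter grids like this; the grid for the number of trees is an assumption, since the notes only give the lambda range:

```python
import numpy as np

# Linear scale for max depth, per the note above (1 to 10)
max_depth_grid = np.arange(1, 11)

# Log scale for lambda: multiply by ten each step, 10^-6 up to 10^6
lambda_grid = np.logspace(-6, 6, num=13)

# Log scale for the number of base models (trees); range assumed for illustration
n_trees_grid = np.array([10, 100, 1000])
```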
Next Steps
- Intro, EBM section, black box section, overfitting, underfitting
3/20 Meeting with Wells team
Optimizing lambda1, lambda2, drop rate, max_depth: PiML cannot adjust all the parameters at the same time, so adjust them manually, one by one (see the sketch after the overfitting note below).
Overfitting: we want to look at whether the testing-data MSE is much higher than the training-data MSE, i.e., the difference between the training and testing MSE.
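A minimal sketch combining the two notes above: tune one parameter at a time, holding the rest fixed (since PiML can't adjust them all at once), and use the train/test MSE gap as the overfitting signal. The dataset and the fixed values are placeholders:

```python
import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hold everything but reg_lambda fixed; repeat the loop for each parameter in turn
fixed = {"max_depth": 4, "reg_alpha": 0.0}
for reg_lambda in np.logspace(-6, 6, num=13):
    model = lgb.LGBMRegressor(reg_lambda=reg_lambda, **fixed)
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    # A large (test - train) gap suggests overfitting at this setting
    print(f"reg_lambda={reg_lambda:g}  train={train_mse:.1f}  "
          f"test={test_mse:.1f}  gap={test_mse - train_mse:.1f}")
```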
How to adjust the parameters: look at the LightGBM library and try to figure it out; Rosh can help.
Try looking into a low-code approach for the parameters.
To-do: put the results into the midterm presentation slides.
In the future: run robustness tests on the models. Look at the slope of the robustness-test graph, i.e., the trend of model performance as the perturbation increases (perturbation = noise added to the data). A sketch follows below.
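One way such a robustness test could look, as a sketch: add Gaussian noise of increasing scale to the test features and watch how MSE degrades. The noise levels and the per-feature scaling are assumptions, not from the notes:

```python
import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = lgb.LGBMRegressor().fit(X_train, y_train)

rng = np.random.default_rng(0)
scale = X_train.std(axis=0)  # perturb each feature relative to its own spread
for eps in [0.0, 0.05, 0.1, 0.2, 0.4]:
    X_noisy = X_test + rng.normal(0.0, eps * scale, size=X_test.shape)
    mse = mean_squared_error(y_test, model.predict(X_noisy))
    print(f"perturbation={eps:.2f}  test MSE={mse:.1f}")
# A steep MSE-vs-perturbation slope indicates a less robust model
```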
Add new models to make a better comparison (sketched below):
- An overfitted LightGBM model: adjust the parameters to make it overfit, e.g., set max depth to 16-20
- An underfitted LightGBM model: smaller max depth and heavier regularization (per the INC lambda → DEC complexity note above, a larger lambda is what simplifies the model), so it underfits
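A sketch of the two contrast models; only the 16-20 depth range comes from the notes, the other parameter values are illustrative assumptions:

```python
import lightgbm as lgb

# Deliberately overfit: very deep trees, many of them, and no regularization
# (max depth 16-20 per the notes; other values assumed)
overfit_lgbm = lgb.LGBMRegressor(
    max_depth=18, num_leaves=4096, n_estimators=1000,
    min_child_samples=1, reg_alpha=0.0, reg_lambda=0.0,
)

# Deliberately underfit: shallow trees, few of them, heavy lambda
# (larger lambda simplifies the model, per the INC lambda -> DEC complexity note)
underfit_lgbm = lgb.LGBMRegressor(
    max_depth=2, num_leaves=3, n_estimators=20, reg_lambda=100.0,
)
```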