Adding my comments on commit 5de4729!
Overall, the code looks solid and definitely matches the examples I've seen online. Just wanted to share a few comments below:
Any time we build a random forest or do cross val (or anything random), we should set a random seed so we can reproduce the numbers we got. See the grid search code for an example!
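Here's a minimal sketch of what I mean (I'm using placeholder data from make_classification and a placeholder seed of 42 just so the snippet stands on its own; swap in our actual data and settings):
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data just so the snippet runs on its own
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# random_state pins the forest's bootstrapping and feature sampling
rf = RandomForestClassifier(n_estimators=100, random_state=42)

# The default CV splitter here doesn't shuffle, so the folds are already deterministic;
# if you shuffle, pass KFold(n_splits=5, shuffle=True, random_state=42) as cv instead
scores = cross_val_score(rf, X, y, cv=5)
print("Mean CV accuracy:", scores.mean())
On the Optuna side, passing a seeded sampler (e.g. optuna.samplers.TPESampler(seed=42)) to create_study does the same job for the trial suggestions.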
Looks like you calculate execution time in both experiments but never print it. Could you print it at the end of each experiment code block with:
print("Optuna execution time:", end - start, "s")
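For reference, this is the pattern I'm picturing (assuming the block uses time.time() with variables named start and end, and that study/objective are the ones defined in your notebook; rename to match your code):
import time

start = time.time()
study.optimize(objective, n_trials=100)  # the experiment itself
end = time.time()

print("Optuna execution time:", end - start, "s")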
For the “best” trial I’m not 100% sure you are actually applying the right parameters. For example, on the classification problem, trial 84 is the best trial, and based on the Optuna output it has these parameters:
'n_estimators': 257, 'max_depth': 7.725820147961449, 'max_features': 3
But you are using the following in the model, which is not the same:
n_estimators=36, max_depth=7.9399781768472515, max_features=3
Recommend putting the following code in its own chunk so it is more obvious to both us and the audience which trial was best:
study.best_trial
And then use the n_estimators, max_depth, and max_features printed there for the best model.
Recommend calculating MSE, MAPE, and MAE using the same method we did in the grid search, instead of computing them from cross_val_score, to keep the code more consistent across our different experimental conditions.
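Rough sketch of what I mean for the regression experiment (this assumes study and the X_train/X_test/y_train/y_test split from your notebook, and that the grid search block used these sklearn.metrics functions; adjust names to match your code):
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import (mean_absolute_error,
                             mean_absolute_percentage_error,
                             mean_squared_error)

# Pull the winning hyperparameters straight from the study
best_params = dict(study.best_trial.params)
# If max_depth was suggested as a float, cast it -- sklearn expects an int (or None) here
best_params["max_depth"] = int(best_params["max_depth"])

best_rf = RandomForestRegressor(**best_params, random_state=42)
best_rf.fit(X_train, y_train)
preds = best_rf.predict(X_test)

print("MSE: ", mean_squared_error(y_test, preds))
print("MAPE:", mean_absolute_percentage_error(y_test, preds))
print("MAE: ", mean_absolute_error(y_test, preds))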
Would it make more sense to maximize accuracy for classification and minimize MSE for regression, since those are the metrics we care about? I tried this and it seems to produce better scores overall.
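To show what I mean on the classification side (the suggest ranges, cv=5, and n_trials are placeholders, and X/y are your classification data):
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 10, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 20),
        "max_features": trial.suggest_int("max_features", 1, 10),
    }
    clf = RandomForestClassifier(**params, random_state=42)
    # Mean cross-validated accuracy is what Optuna will maximize
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=42))  # seeded sampler for reproducibility
study.optimize(objective, n_trials=100)
For regression, you could return cross_val_score(reg, X, y, cv=5, scoring="neg_mean_squared_error").mean() and keep direction="maximize" (sklearn negates MSE), or negate it yourself and use direction="minimize".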
We can probably cut out the section at the end on grid search. I think it might complicate things to have some version of grid search blended with our Optuna code, and the code you already have in the other sections should definitely cover it!
I have a version of the code with these edits here if you want to take a look to see what I meant by some of these!