naskoap / rann

0 stars 0 forks source link

Hyperopt code #14

Closed Kray708 closed 1 year ago

ambertin commented 1 year ago

RE your recent commit @naskoap (6f8a3d1):

Overall looks really good and consistent with what Nelson is doing, which is great!

I do have a few of the same recommendations to keep things consistent though!

  1. Could you use the same section headers if possible, just to keep the same format? (Hospital Readmissions (Classification), Car Emissions (Regression)). I think it's fine to load the data in before, like you're doing, but just want these to match so people can easily swap between the files and jump to the relevant code/recognize the headers right away.

  2. For regression, instead of cross val score, use mean squared error as the objective function.

score=mean_squared_error(y_test,pred)
    return score
  1. For metrics on regression, use the following ones instead of the different metrics you are currently pulling out of cross val score:
print('Mean Absolute Error (MAE):', mean_absolute_error(y_test, y_pred))
print('Mean Absolute Percentage Error (MAPE):', mean_absolute_percentage_error(y_test, y_pred))
print('Mean Squared Error (MSE):', mean_squared_error(y_test, y_pred))

For points 2 and 3, recommending this because that's how I did it in grid search and how Nelson will be modifying the Optuna code, so this way we'll all be calculating identical things!

As a side note, I like your function and use of min/sec combo for time, and definitely think we should use that format across all our code! I can implement this change to the optuna/grid search code later, as part of software engineering improvements that we can make.

ambertin commented 1 year ago

This code is now updated to reflect suggestions, so closing for now!