Closed rachelywong closed 3 years ago
From discussion in class today:
Machine Learning Plan:
- Split data into training and testing
- Label our features (categorical, numerical, binary)
- Create transformers for our features
- Create `models = {}` to compare a baseline dummy classifier against 2 candidate models (RBF SVM and LR) - 571 LAB 4 3.2
- Continue with our best model (using whichever `best...` attribute applies). Open question: choose it by F1 score or by mean CV score?
- Hyperparameter Optimization with randomized search with best model - 571 LAB 4 3.3
- Hyperparameter Optimization results - confusion matrix, precision-recall curve, AUC? - 573 LAB 1 2.7
- use best model and best hyperparameters on test set
- Use the model coefficients to find the top indicators (largest coefficients) - 571 LAB 4 4.1
- Extra: find the test examples with the most confident readmission vs. no-readmission predictions - 571 LAB 4 5.2
Also, for any functions written we need documentation and sensible tests.
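
As an illustration of the level of documentation and testing meant here, the sketch below shows a small helper with a docstring and a matching test. The function `label_feature_types` and its rule (two or fewer distinct values means binary) are hypothetical, chosen only to show the pattern.

```python
def label_feature_types(columns):
    """Group column names by feature type.

    Parameters
    ----------
    columns : dict[str, list]
        Maps each column name to the list of values in that column.

    Returns
    -------
    dict[str, list[str]]
        Column names grouped under "binary", "numerical", and "categorical".

    Raises
    ------
    ValueError
        If a column contains no values.
    """
    labels = {"binary": [], "numerical": [], "categorical": []}
    for name, values in columns.items():
        if len(values) == 0:
            raise ValueError(f"column {name!r} has no values")
        if len(set(values)) <= 2:
            # Hypothetical rule: at most two distinct values counts as binary.
            labels["binary"].append(name)
        elif all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
            labels["numerical"].append(name)
        else:
            labels["categorical"].append(name)
    return labels


def test_label_feature_types():
    columns = {
        "age": [54, 61, 47, 70],
        "admission_type": ["emergency", "urgent", "elective", "urgent"],
        "on_insulin": ["yes", "no", "no", "yes"],
    }
    result = label_feature_types(columns)
    assert result == {
        "binary": ["on_insulin"],
        "numerical": ["age"],
        "categorical": ["admission_type"],
    }
    # An empty column should be rejected, not silently labelled.
    try:
        label_feature_types({"empty": []})
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an empty column")


test_label_feature_types()
```

A test like this can be collected by pytest as-is, since the function name starts with `test_`.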
Thanks for this, Rachel! I checked in with Varada regarding this work, specifically about correlated features. If we decide to go ahead with logistic regression, correlated features will make the weight of one of them larger than the other, which will make the coefficients in step 9 (top coefficients) difficult to interpret. Prediction will still be okay, though. I can't remember whether we can do this with RBF SVM either; I will double-check.
As we discussed in lab:
Scripts and other docs:
Analysis plan:
Submission: @wiwang
@wiwang Please close this issue once we have all confirmed via Slack that we are good to go, then create version 0.1.0 and submit both links to Canvas (repo link and version link). Thank you!
Milestone #2 Tasks: