Hi Caroline,
Great job formatting the data. One small thing about using pop: it does two operations, a lookup that returns the item and then a delete, so it does more work than you need when you only want to remove something. If you don't use the returned value, del is the more direct choice.
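To make the pop point concrete, here is a minimal sketch. It assumes a plain dict (the `record` dict and its keys are made up for illustration, not taken from your notebook); the same distinction holds for `DataFrame.pop` versus `del df[col]` in pandas.

```python
# Hypothetical example data, not from the original notebook.
record = {"id": 1, "label": "a", "notes": "unused"}

# pop removes the key AND hands back its value: useful when you need the value.
label = record.pop("label")

# del only removes the key, which states the intent when the value is unwanted.
del record["notes"]
```

So pop is the right tool when you want to keep what you removed (for example, pulling out a target column), and del reads more clearly when you are just discarding it.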
The bigger issue is a bug in your cross-validation function. It uses X_train, y_train, X_test, and y_test, but those variables are never defined inside the function itself, so Python falls back to the ones you created earlier with train_test_split. This means every fold trains and scores on that same split, and you never actually calculate a new cross-validation score. I noticed this because getting a 1.0 for cross-validation seemed very suspect.
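Here is a rough sketch of what I mean by defining the splits inside the function. It is not your code and not the scikit-learn API: `cross_validate`, `fit_majority`, `accuracy`, and the toy data are all hypothetical names I made up, and it uses plain Python lists so it runs on its own. The part to notice is that X_train, y_train, X_test, and y_test are rebuilt on every fold.

```python
import random


def cross_validate(X, y, fit, score, n_folds=5, seed=0):
    """Return one score per fold, rebuilding the split inside the loop."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds roughly equal folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    scores = []
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]

        # These names are local and reassigned on every fold, so nothing
        # leaks in from an earlier train_test_split call.
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        X_test = [X[i] for i in test_idx]
        y_test = [y[i] for i in test_idx]

        model = fit(X_train, y_train)
        scores.append(score(model, X_test, y_test))
    return scores


# Toy stand-ins for a real estimator (assumptions for illustration only).
def fit_majority(X_train, y_train):
    # The "model" is simply the most common training label.
    return max(set(y_train), key=y_train.count)


def accuracy(model, X_test, y_test):
    return sum(1 for label in y_test if label == model) / len(y_test)


X = [[i] for i in range(20)]
y = [0] * 12 + [1] * 8
scores = cross_validate(X, y, fit_majority, accuracy, n_folds=5)
```

With the splits built inside the loop, each fold is scored on rows the model did not see during fitting, so a suspicious 1.0 on every fold becomes much less likely unless the problem really is that easy.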
I hope this helps and see you in class!
Best, Justin
@ghego @craigsakuma @kebaler