Proof of Concept and Cross-validation, Training/Test Sets

Thank you for the intriguing paper and research. I have a question about the proof-of-concept exercise, in which some known cities are removed to approximate performance on truly lost cities. I have some concerns, at least if this test is viewed through the machine learning paradigm of cross-validation and hold-out sets. If the models were trained on all the known cities without cross-validation, then removing some of those cities after the fact to create a test set would seem to violate the assumptions usually made for training and test sets in prediction problems, since the held-out cities would already have informed the fitted model. Beyond this caveat, there may be systematic differences between cities that become lost and cities that stand the test of time. If so, the models might be overfit to finding known cities rather than lost ones.
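To make the first concern concrete, here is a minimal sketch on synthetic data. It is not the paper's model: the one-dimensional "cities", the 1-nearest-neighbor predictor, and every function name are hypothetical, chosen only because a memorizing model shows the leakage clearly. It compares the error on cities that were also used for fitting with the error from k-fold cross-validation, where each city is predicted by a model that never saw it.

```python
import random

def predict_1nn(train, x):
    """Return the y of the training point whose x is closest to the query."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mean_abs_error(train, test):
    """Average absolute prediction error on `test` for a model built from `train`."""
    return sum(abs(predict_1nn(train, x) - y) for x, y in test) / len(test)

def k_fold_error(data, k=5):
    """Average error over k folds; each fold is scored by a model fit on the rest."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(mean_abs_error(train, held_out))
    return sum(errors) / k

# Synthetic stand-in for known cities: (feature, location) pairs with noise.
# These values are invented for illustration only.
rng = random.Random(0)
cities = []
for _ in range(50):
    x = rng.uniform(0, 10)
    cities.append((x, x + rng.gauss(0, 1)))

# "Test set" carved out after fitting on everything: the model has already
# seen these cities, so a memorizing model scores perfectly.
leaky_error = mean_abs_error(cities, cities[:10])

# Proper hold-out: each city is predicted without being in the training data.
honest_error = k_fold_error(cities, k=5)

print("error when test cities were in the training data:", leaky_error)
print("5-fold cross-validated error:", honest_error)
```

In this toy setting the first error is zero by construction, while the cross-validated error is not. Note that cross-validation addresses only the leakage issue; it would not detect the second concern, since held-out known cities may still differ systematically from truly lost ones.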

uchicago-computation-workshop / ali_hortacsu

Proof of Concept and Cross-validation, Training/Test Sets #36