Closed: jamesward closed this issue 7 years ago
This NYC Cab data set looks promising: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
Cool. And I realized we don't really need the driver data for predicting demand, just the "request" or pickup data.
I'm making a sample of the 2015 NYC Cab data set, as the original data set is way too large for development. We can use the full data set to train the model once the whole pipeline is complete.
Fixed by #4
https://s3-us-west-2.amazonaws.com/4740/yellow_tripdata_2015_further_sample.csv.zip
A sample of 10,000 lines of 2015 taxi data.
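For anyone who wants to regenerate a sample like this, here is a minimal sketch of one way to draw a fixed-size random sample from a CSV too large to load into memory. It uses reservoir sampling (Algorithm R), which is an assumption on my part: the thread does not say how the linked sample was actually produced. The function names `reservoir_sample` and `sample_csv`, the default seed, and the file paths are all hypothetical.

```python
import csv
import random
from typing import Iterable, List


def reservoir_sample(rows: Iterable[List[str]], k: int, seed: int = 0) -> List[List[str]]:
    """Pick up to k rows uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample: List[List[str]] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


def sample_csv(src_path: str, dst_path: str, k: int = 10000, seed: int = 0) -> int:
    """Write the header plus up to k sampled rows; return the number of rows sampled."""
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # assumes the first line is a header row
        sample = reservoir_sample(reader, k, seed)
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(sample)
    return len(sample)
```

A call would look something like `sample_csv("full_2015.csv", "sample_2015.csv", k=10000)`, where both file names are placeholders. The file is read in a single pass and only `k` rows are held in memory at a time, which is the reason to prefer this over loading the full data set and shuffling it.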
A ride consists of:
In order to do #1 we need a demo data set we can feed into PredictionIO. The data set can't be totally random, otherwise our predictions might appear random. So maybe there is public taxi data we can use.