Chicago is home to nearly 3 million people and is currently the third most populous city in the US. Cook County, which contains it, is the second most populous county in the country. Owing to this large population, the city offers a range of transport options. One of these is the Divvy bike-sharing system, with hundreds of stations and thousands of bikes and scooters. It is currently operated by the ride-sharing company Lyft and has been in existence for 9 years. With so many trips taking place every day over such a long period, Divvy's historical trip data is an attractive source of time-series data (at least for me :D), especially because it is updated monthly.
The goal of this project is to build a complete end-to-end machine learning system, culminating in a simple frontend that serves the desired predictions interactively.
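To make the prediction target more concrete, here is a minimal sketch of how raw trips could be turned into an hourly time series. It assumes the `started_at` and `start_station_name` columns found in Divvy's public trip files; the function name, the station names, and the sample rows are made up for illustration, and this is not necessarily how the repository's own pipeline does it.

```python
from collections import Counter
from datetime import datetime


def hourly_departures(trips):
    """Count departures per (station, hour) from an iterable of trip dicts."""
    counts = Counter()
    for trip in trips:
        started = datetime.fromisoformat(trip["started_at"])
        # Truncate the timestamp to the start of its hour.
        hour = started.replace(minute=0, second=0, microsecond=0)
        counts[(trip["start_station_name"], hour)] += 1
    return counts


# Made-up sample rows, only to show the shape of the input.
sample_trips = [
    {"started_at": "2024-05-01 08:05:12", "start_station_name": "Station A"},
    {"started_at": "2024-05-01 08:47:30", "start_station_name": "Station A"},
    {"started_at": "2024-05-01 09:10:00", "start_station_name": "Station A"},
    {"started_at": "2024-05-01 08:15:00", "start_station_name": "Station B"},
]

counts = hourly_departures(sample_trips)
for (station, hour), n in sorted(counts.items()):
    print(station, hour.isoformat(), n)
```

Each (station, hour) count is one point in the time series that a model would then be trained to forecast.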
A containerised version of the app is available here.
Clone the repository:
$ git clone https://github.com/maadabrandon/Hourly-Divvy-Trip-Predictor
Install Poetry:
$ curl -sSL https://install.python-poetry.org | python3 -
Enter the project directory and run:
$ poetry install
Register free accounts on Hopsworks and CometML. Then copy your project names (for both platforms), API keys (also for both platforms), Comet workspace name, and email address into a .env file.
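Before running the pipelines, it can help to check that the .env file is complete. The sketch below parses simple `KEY=value` lines and reports anything missing. The variable names in `REQUIRED_KEYS` are hypothetical placeholders; check the repository's configuration code for the names it actually reads.

```python
from pathlib import Path

# Hypothetical variable names; the project may expect different ones.
REQUIRED_KEYS = [
    "HOPSWORKS_PROJECT_NAME",
    "HOPSWORKS_API_KEY",
    "COMET_PROJECT_NAME",
    "COMET_API_KEY",
    "COMET_WORKSPACE",
    "EMAIL",
]


def parse_env_text(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


env_path = Path(".env")
if env_path.exists():
    print("Missing keys:", missing_keys(parse_env_text(env_path.read_text())))
```

An empty list means every placeholder key has a non-empty value.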
Backfill the Hopsworks feature groups with historical data:
$ make backfill-features
Run the training pipeline:
$ make train-all
Backfill the Hopsworks feature groups with predictions:
$ make backfill-predictions
View the frontend:
$ make frontend
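The four make targets above are meant to be run in the order listed. As a convenience, they could be chained from a small script that stops at the first failure. This is only a sketch: `build_commands` and `run_pipeline` are illustrative helpers, not part of the repository, and the script assumes it is run from the project directory with `make` available.

```python
import subprocess

# Make targets from the steps above, in the order they are listed.
PIPELINE_TARGETS = [
    "backfill-features",
    "train-all",
    "backfill-predictions",
    "frontend",
]


def build_commands(targets=PIPELINE_TARGETS):
    """Turn target names into argument lists for subprocess."""
    return [["make", target] for target in targets]


def run_pipeline(commands, runner=subprocess.run):
    """Run each command in order; check=True aborts on the first failure."""
    for command in commands:
        runner(command, check=True)


# To run every step from the project directory, uncomment:
# run_pipeline(build_commands())
```

Passing the runner as a parameter keeps the helper easy to dry-run, for example with a function that only prints each command.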