hackforla / 311-data

Empowering Neighborhood Associations to improve the analysis of their initiatives using 311 data
https://hackforla.github.io/311-data/
GNU General Public License v3.0

Develop a 311 request forecasting model #1289

Closed nichhk closed 2 years ago

nichhk commented 2 years ago

Overview

It would be interesting to see whether we can accurately forecast the number of created 311 requests for a future date.

Action Items

nichhk commented 2 years ago

@susanklm: Could you write a comment here describing what you've tried, what the results were, and next steps, if any?

susanklm commented 2 years ago

For the 311-data time series analysis, I used Auto ARIMA to make monthly forecasts of the number of 311 requests and Facebook Prophet to make daily forecasts. Both Auto ARIMA and Prophet gave decent forecasts.
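Both forecasts start from a count of requests per period. As a minimal sketch of that aggregation step (not the notebook's actual code; the timestamps and the `count_by` helper below are hypothetical), the daily and monthly series could be built like this:

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of request creation timestamps (not real 311 data).
created = [
    "2021-01-30 08:15:00",
    "2021-01-30 17:40:00",
    "2021-01-31 09:05:00",
    "2021-02-01 11:20:00",
    "2021-02-01 12:00:00",
    "2021-02-01 23:59:00",
]


def count_by(timestamps, fmt):
    """Count timestamps per period; fmt is the strftime pattern used as the period key."""
    keys = (
        datetime.strptime(t, "%Y-%m-%d %H:%M:%S").strftime(fmt)
        for t in timestamps
    )
    return dict(sorted(Counter(keys).items()))


daily_counts = count_by(created, "%Y-%m-%d")   # input for a daily model
monthly_counts = count_by(created, "%Y-%m")    # input for a monthly model

print(daily_counts)
print(monthly_counts)
```

The same raw data yields a long, noisy daily series and a much shorter monthly one, which is worth keeping in mind when comparing the two models.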

The Root Mean Square Error (RMSE) of the Auto ARIMA model is 14,808.68. This means that, on the test set, the monthly forecasts typically missed the actual monthly total by about 14,808.68 requests. Our monthly totals range from around 552 to over 23,266. So it seems the model is not that bad, but not good either.

Similarly, the RMSE of the Prophet model is 788.91. This means that, on the test set, the daily forecasts typically missed the actual daily total by about 788.91 requests. Our daily totals range from around 2 to over 5,073. So it seems the model is pretty decent.
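To make the two RMSE figures easier to compare, here is a small sketch that computes RMSE and divides it by the range of the series. Range normalization is my own assumption here, not something done in the notebooks, and the range endpoints are the approximate values quoted above.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(squared) / len(actual))


def range_normalized(rmse_value, low, high):
    """RMSE as a fraction of the observed range (an illustrative scaling)."""
    return rmse_value / (high - low)


# Toy check of rmse() on made-up numbers (not 311 data).
toy = rmse([100, 120, 90], [110, 115, 95])

# Figures quoted in this thread; the ranges are approximate.
monthly_nrmse = range_normalized(14808.68, 552, 23266)
daily_nrmse = range_normalized(788.91, 2, 5073)

print(f"monthly: {monthly_nrmse:.2f} of range")
print(f"daily:   {daily_nrmse:.2f} of range")
```

On this scale the monthly error is roughly 0.65 of the range and the daily error roughly 0.16, which matches the reading that the daily Prophet model fits better than the monthly Auto ARIMA model.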

To make better predictions of the number of requests, we could try a different time series forecasting library instead of Auto ARIMA and Prophet. We could also add features so that the model adapts to the decreasing trend that we have seen since early 2021 (see the seasonal_decompose output in Monthly Forecast).
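For anyone picking this up, a trend like the one mentioned above can be checked with a centered moving average, which is similar in spirit to the trend component of a classical seasonal decomposition. This is only a sketch on a made-up series: the data, the period of 4, and the `centered_moving_average` helper are hypothetical and are not taken from the notebooks.

```python
def centered_moving_average(values, period):
    """Estimate the trend with a centered moving average.

    Positions where the window does not fit are returned as None.
    For an even period the two end points of the window get half
    weight, so the window stays centered on the current point.
    """
    if period < 2 or period > len(values):
        raise ValueError("period must be between 2 and len(values)")
    half = period // 2
    trend = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        if period % 2:
            total = sum(window)
        else:
            total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    return trend


# Made-up series: a linear decline plus a repeating pattern of length 4.
pattern = [50, -50, 30, -30]
series = [1000 - 20 * i + pattern[i % 4] for i in range(12)]

trend = centered_moving_average(series, period=4)
valid = [t for t in trend if t is not None]
is_declining = all(later < earlier for earlier, later in zip(valid, valid[1:]))
print(valid)
print("declining trend:", is_declining)
```

Averaging over exactly one seasonal period cancels the repeating pattern, so what remains is the underlying decline that a forecasting model would need to follow.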

Here are the links to the Jupyter Notebook:

Here are the links to the exploratory analysis on 2020 dataset and 2016 - 2021 dataset using Tableau:

susanklm commented 2 years ago

For the time being, I am not sure how to improve the model, so I won't continue working on this project.