To add a model to our pipeline, we first need to know which type of model performs best and which features are most useful for that model. There's not really a way to do that without some good ol' fashioned exploratory data analysis.
This issue is to explore the OPA data and other data sets to determine what features are most useful for predicting property values.
Acceptance Criteria:
[x] A markdown document explaining the useful features and an outline of the feature engineering necessary for model training and prediction
[x] Any artifacts (notebooks, R Markdown, etc.) should be committed to the eda/ folder. Data sources for the exploration should be documented, but the data itself should not be committed to the repository.
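
As a possible starting point for the exploration, here is a minimal sketch of one way to get a first-pass feature ranking: sort numeric columns by the absolute value of their Pearson correlation with the target. Everything specific in it is an assumption for illustration: the column names (`livable_area`, `year_built`, `market_value`), the helper names (`pearson`, `rank_features`), and the toy rows, which are invented and are not OPA data. Correlation only captures linear, one-feature-at-a-time relationships, so treat the ranking as a screening step rather than a conclusion.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear signal; report 0 instead of dividing by zero.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, target):
    """Return (feature, correlation) pairs sorted by |correlation| with the target, strongest first."""
    features = [name for name in rows[0] if name != target]
    ys = [row[target] for row in rows]
    scores = {name: pearson([row[name] for row in rows], ys) for name in features}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy rows invented for illustration only; column names are hypothetical, not the OPA schema.
rows = [
    {"livable_area": 900, "year_built": 1925, "market_value": 150000},
    {"livable_area": 1200, "year_built": 1950, "market_value": 210000},
    {"livable_area": 1500, "year_built": 1940, "market_value": 260000},
    {"livable_area": 2000, "year_built": 2005, "market_value": 380000},
]

for name, score in rank_features(rows, "market_value"):
    print(f"{name}: {score:+.3f}")
```

In a real notebook under eda/, the same idea would more likely be run on a data frame loaded from the documented source, with the loading code pointing at a local path that stays out of the repository.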