sfbrigade / datasci-firerisk

This project attempts to acquire and model data from SF OpenData and other sources to predict the relative risk of fire in San Francisco’s buildings and public spaces.
http://codeforsanfrancisco.org/projects/SF-Fire-Risk-Project

Create feature data set from 'matched_Fire_Inspections.csv' #10

Open stahlerk opened 6 years ago

stahlerk commented 6 years ago

1) Subset data to potentially useful features
2) Detect and remove outliers
3) Consider organizing "Inspection Type Description" column into more generalized groups (if appropriate)
4) Collapse data at EAS level
5) Create any potentially relevant features (for example, total number of fire inspections b/w 2005-2016 associated with EAS, etc.)
6) Any other data cleaning and standardization operations
7) Output as .csv (indexed at EAS)
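
A rough pandas sketch of how these steps might fit together, offered as a starting point rather than the agreed approach. It runs on a tiny made-up frame instead of the real 'matched_Fire_Inspections.csv'. Apart from "Inspection Type Description", every column name (`EAS`, `Inspection Start Date`, `Inspector Notes`), the `group_map` categories, the date-window outlier rule, and the `build_features` helper are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for matched_Fire_Inspections.csv. Only the
# "Inspection Type Description" column name comes from the issue text;
# all other column names and all values are made up.
raw = pd.DataFrame({
    "EAS": [101, 101, 102, 102, 103, 103],
    "Inspection Start Date": ["2006-03-01", "2010-07-15", "2004-01-10",
                              "2012-05-20", "2015-09-09", "2031-01-01"],
    "Inspection Type Description": [" Complaint Inspection", "routine inspection",
                                    "routine inspection", "DBI inspection",
                                    "complaint inspection", "routine inspection"],
    "Inspector Notes": ["a", "b", "c", "d", "e", "f"],
})


def build_features(df, start="2005-01-01", end="2016-12-31"):
    # 1) Subset to the columns assumed to be useful.
    cols = ["EAS", "Inspection Start Date", "Inspection Type Description"]
    out = df[cols].copy()

    # 6) Standardize: parse dates, normalize the free-text type column.
    out["Inspection Start Date"] = pd.to_datetime(
        out["Inspection Start Date"], errors="coerce")
    out["Inspection Type Description"] = (
        out["Inspection Type Description"].str.strip().str.lower())

    # 2) One simple outlier rule (an assumption): drop rows whose date is
    #    missing or falls outside the 2005-2016 window.
    in_window = out["Inspection Start Date"].between(
        pd.Timestamp(start), pd.Timestamp(end))
    out = out[in_window].copy()

    # 3) Map detailed inspection types to broader, hypothetical groups.
    group_map = {"complaint inspection": "complaint",
                 "routine inspection": "routine"}
    out["type_group"] = (
        out["Inspection Type Description"].map(group_map).fillna("other"))

    # 4) + 5) Collapse to one row per EAS with count features.
    feats = pd.crosstab(out["EAS"], out["type_group"]).add_prefix("n_")
    feats["n_inspections"] = feats.sum(axis=1)
    feats["first_inspection"] = out.groupby("EAS")["Inspection Start Date"].min()
    return feats


features = build_features(raw)

# 7) Serialize as CSV indexed at EAS; pass a file path to write to disk.
csv_text = features.to_csv()
```

On the real data, `raw` would come from `pd.read_csv("matched_Fire_Inspections.csv")`, and the outlier rule and type groups would need to be chosen after looking at the actual distributions.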