gw-sd-2016 / crimedata

Ben Carleton

Week 11 Update #1

Closed carletonb closed 8 years ago

carletonb commented 8 years ago

This week I worked on the Google geocode integration and on the data import module. Since we do not have the GW data, I used a subset (222k rows) of the Chicago dataset that @twood02 pointed out to me earlier this week. The import module is basically functional at this point, as is the geocode integration.

Going forward, I need to add more thorough error handling to the importer. Right now it nulls or zeroes bad fields, which is not suitable for a production workload.
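One possible shape for the stricter error handling, as a rough sketch rather than the actual importer: reject a bad row with a recorded reason instead of silently nulling its fields. The column names (`date`, `address`, `type`), the date format, and the names `ImportResult`, `parse_row`, and `import_reports` are all hypothetical and not taken from the Chicago dataset or from our code.

```python
import csv
import io
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ImportResult:
    rows: list = field(default_factory=list)     # cleanly parsed records
    rejects: list = field(default_factory=list)  # (line_number, reason) pairs


def parse_row(raw):
    """Validate one raw CSV row; raise ValueError instead of nulling bad fields.

    The column names and date format here are assumptions for illustration.
    """
    address = (raw.get("address") or "").strip()
    if not address:
        raise ValueError("missing address")
    raw_date = raw.get("date") or ""
    try:
        occurred = datetime.strptime(raw_date, "%m/%d/%Y %H:%M")
    except ValueError:
        raise ValueError(f"bad date: {raw_date!r}") from None
    crime_type = (raw.get("type") or "").strip()
    return {"address": address, "occurred": occurred, "type": crime_type}


def import_reports(stream):
    """Parse every row, keeping good rows and recording why bad rows failed."""
    result = ImportResult()
    reader = csv.DictReader(stream)
    for raw in reader:
        try:
            result.rows.append(parse_row(raw))
        except ValueError as exc:
            # reader.line_num is the source line of the row just read
            result.rejects.append((reader.line_num, str(exc)))
    return result


# Small in-memory example: one good row, one bad date, one missing address.
sample = (
    "date,address,type\n"
    "01/02/2016 13:45,100 Main St,THEFT\n"
    "not-a-date,200 Main St,ASSAULT\n"
    "01/03/2016 09:00,,THEFT\n"
)
result = import_reports(io.StringIO(sample))
for line_number, reason in result.rejects:
    print(f"line {line_number}: {reason}")
```

Keeping the rejects in a list means a run can finish and report everything that failed at once, and we can decide later whether any reject should abort the import.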

I also need to write a caching layer for the geocode integration, because Google's free-tier API limits us to 2,500 queries a day. Since I'm fairly certain that there are fewer than 2,500 addresses on the FB campus but more than 2,500 crime reports, caching the geocode responses should let us avoid duplicate lookups and stay under the limit.
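A minimal sketch of what that caching layer could look like, assuming a SQLite-backed cache keyed on a normalized address plus a simple per-day call counter. `CachedGeocoder`, `QuotaExceeded`, and the `lookup` callable are hypothetical names; `lookup` stands in for whatever function actually calls Google, so nothing here reflects the real API client. Counting by local calendar day is also an assumption and may not match when Google resets its quota.

```python
import sqlite3
from datetime import date


class QuotaExceeded(RuntimeError):
    """Raised when an uncached lookup would exceed the daily limit."""


class CachedGeocoder:
    def __init__(self, lookup, db_path=":memory:", daily_limit=2500):
        # lookup: hypothetical callable taking an address, returning (lat, lng)
        self._lookup = lookup
        self._daily_limit = daily_limit
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS geocode_cache "
            "(address TEXT PRIMARY KEY, lat REAL, lng REAL)"
        )
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS api_usage "
            "(day TEXT PRIMARY KEY, calls INTEGER NOT NULL)"
        )
        self._db.commit()

    @staticmethod
    def _normalize(address):
        # Collapse case and whitespace so trivially different spellings share a key
        return " ".join(address.upper().split())

    def _calls_today(self, today):
        row = self._db.execute(
            "SELECT calls FROM api_usage WHERE day = ?", (today,)
        ).fetchone()
        return row[0] if row else 0

    def geocode(self, address):
        key = self._normalize(address)
        row = self._db.execute(
            "SELECT lat, lng FROM geocode_cache WHERE address = ?", (key,)
        ).fetchone()
        if row is not None:
            return row  # cache hit: no API call is spent

        today = date.today().isoformat()
        if self._calls_today(today) >= self._daily_limit:
            raise QuotaExceeded(f"daily limit of {self._daily_limit} reached")

        # Count the call before making it, since a failed request may still be billed
        self._db.execute(
            "INSERT OR IGNORE INTO api_usage (day, calls) VALUES (?, 0)", (today,)
        )
        self._db.execute(
            "UPDATE api_usage SET calls = calls + 1 WHERE day = ?", (today,)
        )
        self._db.commit()

        lat, lng = self._lookup(key)
        self._db.execute(
            "INSERT OR REPLACE INTO geocode_cache (address, lat, lng) VALUES (?, ?, ?)",
            (key, lat, lng),
        )
        self._db.commit()
        return (lat, lng)
```

Passing a file path as `db_path` would let the cache survive between import runs, which matters here because the same addresses will show up across many crime reports.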

@twood02 @cctoombs