smartchicago / chicago-atlas

View citywide information about health trends and take action near you to improve your own health.
http://www.chicagohealthatlas.org/

import zip code geographies #9

Closed derekeder closed 11 years ago

derekeder commented 11 years ago

Some data is provided as aggregated by zip code. The City provides a dataset with these boundary definitions which will need to be imported.

A caveat with these zip codes is that some of the CDPH data is aggregated across multiple zip codes, like so:

Question: should we treat these groupings as their own geography? If so, can we expect them to be consistent across datasets and time?

danxoneil commented 11 years ago

Interested in hearing Eric and Jamyia's take on this. My thought is that we should have a flexible system to display data in whatever geographies they are provided to us.

Not sure why some CDPH data is conflated into two or more zips. Can this be fixed? We may decide that it is a requirement that data be delivered in a single geography. I don't know how we can deal with data delivered across multiple geographies.

Eric? Jamyia?

JamyiaClark commented 11 years ago

Good morning. I agree with the notion of having a flexible system to process various data types. You'll notice that these zip codes are mostly business areas (downtown), but they are also residential. They have been combined to increase population sizes and avoid suppressing the data. Eric mentioned that there is a potential solution for this problem. He may want to elaborate, but the fix would remove the need to aggregate the data.

danxoneil commented 11 years ago

I think this is fixed, in that we have conflated zips on our site. We do need to have explanations of why this is done. Will this be handled by importing metadata from CDPH? @RoderickJones and @JamyiaClark, do you have an explanation of this practice somewhere on the data portal or anything?

RoderickJones commented 11 years ago

From one of our dataset descriptions:

U.S. Postal Service ZIP Codes are designed to meet the day-to-day operational needs of the U.S. Postal Service and tend to change more frequently than every ten years. To account for this instability, as well as the emergence of new ZIP codes over the course of the decade and low population estimates in certain ZIP codes (i.e., less than 20,000 residents), the following steps were taken:

- The total number of hospitalizations and total population of ZIP codes 60707, 60638, and 60827 were included, regardless of whether the hospitalized individual resided within the Chicago city limits.
- 60610 includes 60654
- 60707 includes 60635
- 60622 includes 60642
- 60606, 60607, and 60661 are combined
- 60601, 60602, 60603, 60604, 60605, and 60611 are combined
- 60827 and 60633 are combined

https://data.cityofchicago.org/api/assets/6897A02E-BBE7-469A-8AC2-3BB5D7A4F336
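For anyone wiring these groupings into an import, here is a minimal Python sketch of one way to treat each combined set as its own geography. The groupings are transcribed from the dataset description quoted above. The names `COMBINED_ZIP_GROUPS`, `build_zip_lookup`, and `geography_for`, and the comma-joined label format, are hypothetical and are not taken from the Chicago Health Atlas codebase.

```python
# Groupings transcribed from the CDPH dataset description quoted in this thread.
# Each tuple lists the ZIP codes that are reported together as one geography.
COMBINED_ZIP_GROUPS = [
    ("60610", "60654"),
    ("60707", "60635"),
    ("60622", "60642"),
    ("60606", "60607", "60661"),
    ("60601", "60602", "60603", "60604", "60605", "60611"),
    ("60827", "60633"),
]


def build_zip_lookup(groups):
    """Map every member ZIP code to a label for its combined geography.

    The label format (members joined by ", ") is an assumption made for
    illustration; any stable identifier would work.
    """
    lookup = {}
    for group in groups:
        label = ", ".join(group)
        for zip_code in group:
            if zip_code in lookup:
                # A ZIP code in two groups would make the mapping ambiguous.
                raise ValueError(f"ZIP code {zip_code} appears in more than one group")
            lookup[zip_code] = label
    return lookup


ZIP_TO_GEOGRAPHY = build_zip_lookup(COMBINED_ZIP_GROUPS)


def geography_for(zip_code):
    """Return the combined-geography label, or the ZIP code itself if ungrouped."""
    return ZIP_TO_GEOGRAPHY.get(zip_code, zip_code)
```

Keeping the groups in one table means a dataset that uses different groupings only needs a different list, though whether the groupings stay consistent across datasets and time is still the open question raised at the top of this issue.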

derekeder commented 11 years ago

This description should be added somewhere, but otherwise I believe this task is complete. Shall we close?