Open derekeder opened 10 years ago
@evz let's pick the one of these that is most different from the CensusReporter API and work on incorporating it into the Geomancer data sources.
PANDA would provide the greatest flexibility for the end user, but it may not be the best use case for developing the extensible API wrapper.
It does introduce issues the other APIs don't, though. For example, how do we identify data sets in PANDA that can be merged via Geomancer? (How can the PANDA user identify new data sets that should be made available to Geomancer without requiring a change to the API wrapper?)
@tthibo PANDA may be the next best data source to integrate. However, we don't have a PANDA install available and it would take time to set one up. Does AP have one we could test with?
If not, the next best candidate would be USASpending, as the data comes in XML format, which we don't handle yet.
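For a rough sense of what handling XML would involve, here is a minimal sketch using Python's standard library. The payload, the element name `record`, and the attribute names `state` and `total` are all invented for illustration; they are not the actual USASpending response format, which would need to be checked against their docs.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: element names, attribute names, and values are
# placeholders, NOT the real USASpending schema or real figures.
SAMPLE_XML = """
<results>
  <record state="IL" total="1500.50"/>
  <record state="WI" total="320.00"/>
  <record state="IL" total="100.00"/>
</results>
"""


def totals_by_geography(xml_text, geo_attr="state", value_attr="total"):
    """Sum a numeric attribute per geography across all <record> elements."""
    root = ET.fromstring(xml_text.strip())
    totals = {}
    for record in root.iter("record"):
        geo = record.get(geo_attr)
        value = record.get(value_attr)
        # Skip records that are missing either the geography or the value.
        if geo is None or value is None:
            continue
        totals[geo] = totals.get(geo, 0.0) + float(value)
    return totals


if __name__ == "__main__":
    print(totals_by_geography(SAMPLE_XML))
```

The point is only that an XML source reduces to the same "geography -> values" shape we already get from JSON sources, so the wrapper should not need to care about the wire format.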
The AP's PANDA install is behind the firewall. We can set one up on a world-facing server, but that would take some time, as you mention. In the meantime we could consider using their public demo: http://demo.pandaproject.net/#login
Or if it makes more sense to give USASpending a shot, I'm fine with that. I'll be honest, that was one recommended by a reporter, but I'm not quite clear on the use case for it. Does that API provide data aggregated at the geographical level, or does it only provide data at the contract level, available by state, for example?
@tthibo Looks like you can get the contracts summarized by vendor location or by performance location. Locations can be looked up by Congressional District, State, Zip Code, or City. There are varying degrees of detail that you can get back, the most general being totals for whatever your search criteria are.
The same is true of the Federal Assistance and Federal Sub-awards endpoints.
That aside, I am also looking at the Bureau of Labor Statistics stuff. That might be another good case for integrating, mainly because it's a multistep process to get to the numbers.
@tthibo do you have a sense of which data sources would be the most valuable to add next?
Census: http://www.census.gov/developers/
BJS: http://www.bjs.gov/developer/ncvs/index.cfm
Dept. of Labor (especially BLS): http://developer.dol.gov/
EPA: http://developer.epa.gov/
We have a decent start on BLS and could wrap that one up pretty quick. Any others you want us to investigate?
Let's do BLS next, then. After that, I'd do BJS. I think decennial Census will be great to have, but since we already have ACS, it's a little less pressing. I do have other ideas, but those listed here would trump any additional sources.
The only API that BJS has listed in its data tools is the National Crime Victimization Survey (NCVS).
In the NCVS field descriptions (personal and household), the only geography in the data is region (i.e. Northeast, Midwest, South, West).
Suggestion from the NICAR session - add a country geotype and international data from the World Bank: http://data.worldbank.org/
Another suggestion from NICAR, OpenElections data: https://github.com/openelections
@zstumgoren would know something about this :smile_cat:
We also had another vote for decennial Census at the NICAR session.
We should pick one of these to work on next, keeping in mind we want to build an extensible API wrapper that plugs into Geomancer. This is a similar approach to what Open Civic Data does: http://opencivicdata.readthedocs.org/en/latest/scrape/index.html
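To make the "extensible wrapper" idea concrete, here is one possible shape for it, sketched under assumptions. `DataSource`, `register`, `REGISTRY`, `sources_for`, and `StaticDemoSource` are all hypothetical names, not existing Geomancer code, and the demo values are placeholders rather than real data.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a source name to its wrapper class.
REGISTRY = {}


def register(cls):
    """Class decorator that makes a data source discoverable by name."""
    REGISTRY[cls.name] = cls
    return cls


class DataSource(ABC):
    """Assumed interface that every pluggable data source would implement."""

    name = ""
    geo_types = ()  # e.g. ("state", "zip_code"); names are illustrative

    @abstractmethod
    def list_columns(self):
        """Return the column names this source can append."""

    @abstractmethod
    def lookup(self, geo_type, geo_ids, columns):
        """Return {geo_id: {column: value}} for the requested geographies."""


@register
class StaticDemoSource(DataSource):
    """Toy source backed by an in-memory dict with placeholder values."""

    name = "static_demo"
    geo_types = ("state",)
    _rows = {"IL": {"example_column": 1}, "WI": {"example_column": 2}}

    def list_columns(self):
        return ["example_column"]

    def lookup(self, geo_type, geo_ids, columns):
        if geo_type not in self.geo_types:
            raise ValueError("unsupported geo type: %s" % geo_type)
        return {
            geo_id: {col: self._rows[geo_id][col] for col in columns}
            for geo_id in geo_ids
            if geo_id in self._rows
        }


def sources_for(geo_type):
    """List the registered source names that support a given geo type."""
    return sorted(n for n, cls in REGISTRY.items() if geo_type in cls.geo_types)
```

With something like this, adding BLS or USASpending would mean writing one new subclass, and a new geotype such as country would just be another entry in a source's `geo_types`. Whether the real wrapper should discover sources through a decorator or a config list is an open design choice.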