censusreporter / census-api

The home for the API that powers the Census Reporter project.
MIT License
166 stars · 50 forks

getting all data for an id? #64

Closed · yocontra closed this issue 7 years ago

yocontra commented 7 years ago

The only thing I found was https://embed.censusreporter.org/test/04000US02.json, which only works for some IDs.

This could be a new endpoint, or table_ids could be made optional on the existing one: https://api.censusreporter.org/1.0/data/show/latest?geo_ids=04000US01

Right now, pulling all of this requires making thousands of requests per ID.

Basically all I'm trying to do is take a FIPS code from TIGER data and look up the data for it, but the way the API is structured makes this extremely difficult.
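To make the request volume concrete, here is a minimal sketch of what one call to the data/show endpoint looks like today. The `geo_ids`/`table_ids` parameters match the URL above; the specific table IDs (B01001, B19013) are just examples, and covering all of ACS for one geography means repeating this for every table.

```python
# Sketch: one data/show request covers one geography and a batch of tables.
# Table IDs here (B01001, B19013) are examples, not a complete list.
from urllib.parse import urlencode

BASE = "https://api.censusreporter.org/1.0/data/show/latest"

def show_url(geo_id, table_ids):
    """Build a data/show request URL for one geo ID and a batch of table IDs."""
    return BASE + "?" + urlencode({
        "geo_ids": geo_id,
        "table_ids": ",".join(table_ids),
    })

# Getting *all* data for one FIPS-derived geo ID means looping this
# over every ACS table — hence "thousands of requests per ID".
print(show_url("04000US01", ["B01001", "B19013"]))
```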

iandees commented 7 years ago

We don't support that because it returns more data than we're comfortable transmitting via API. If you want to use bulk data, please use one of the PostgreSQL dumps.

As you can see from the first URL, it's a test. Where did you find that, by the way?

yocontra commented 7 years ago

@iandees Not sure; I was using GitHub search to see how people use the API.

Would you be interested in providing these as static files on S3, one per ID? It seems like the mechanism is already in place for this at https://embed.censusreporter.org/test/&lt;id&gt;.json. I understand you don't want to serve this from a live query, but I don't see why it couldn't be generated once and then served from cache.

iandees commented 7 years ago

If you don't mind, what's your use-case for pulling this much data?

yocontra commented 7 years ago

@iandees Creating a static module of boundaries https://github.com/contra/boundaries

Anyone can access the GeoJSON files using a flat URL, or use the module to access them. With the module, no network connection is required; it's completely standalone, since the data is just JSON files in a git repo.

I'm running the process to pull the shapes and data maybe once a year. Augmenting each boundary with more census data has been a feature request for a while. This project seems to have the most sane ACS data, which is why I'm here instead of pulling 20 GB tarballs off FTP.

Really just trying to make this data easier to work with for the public and for some government people.

iandees commented 7 years ago

That's interesting! Can I suggest that you only pull in certain values rather than the whole "width" of ACS for each polygon?
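The suggestion above could look something like this: copy only a few named columns into each polygon's properties instead of the full ACS width. The response shape (`data` → geoid → table → `estimate`) and the column-to-property mapping are assumptions for illustration, not a confirmed API contract.

```python
# Hypothetical sketch: augment a GeoJSON feature with selected ACS values
# only. WANTED maps assumed ACS column IDs to property names — adjust to
# whichever columns you actually need.
WANTED = {"B01001001": "total_pop", "B19013001": "median_income"}

def augment(feature, api_response, geoid):
    """Attach only the selected ACS estimates to a feature's properties."""
    tables = api_response["data"][geoid]
    for table in tables.values():
        for column_id, value in table["estimate"].items():
            if column_id in WANTED:
                feature["properties"][WANTED[column_id]] = value
    return feature

# Tiny worked example with a mocked response (values are made up).
feature = {"type": "Feature", "properties": {}, "geometry": None}
resp = {"data": {"04000US01": {"B01001": {"estimate": {"B01001001": 4863300}}}}}
print(augment(feature, resp, "04000US01")["properties"])  # {'total_pop': 4863300}
```

Keeping the mapping explicit means each boundary file stays small, which also sidesteps the transfer-size concern raised earlier in the thread.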

For some background, the original intention of the /test URL you found was to be a way for us to operate the API without a database running 24/7. I intended to write out GeoJSON files for every geometry with every ACS table/column included. The Census Reporter API would use those GeoJSON files to perform any query requested by the existing API.

I ran out of time, and the number of files we would have to generate was not practical (mostly because of S3 PUT costs). I probably won't have time (and we don't have the budget) to pull this off in the near term.

yocontra commented 7 years ago

@iandees That sounds exactly like what I'm doing now, except I'm storing my stuff on GitHub, so there isn't any S3 cost.

Do you have the scripts for generating the files somewhere? I can run it and put the flat files on a repo.

yocontra commented 7 years ago

You might find this interesting: https://github.com/CartoDB/bigmetadata