code4sac / wicit

A simple node/express app for finding locations that accept WIC in California, using data from the new California Department of Public Health open data portal.
http://findwic.com/
MIT License

CHHS data endpoint has changed #50

Closed KalebClark closed 6 years ago

KalebClark commented 7 years ago

It looks like CHHS has changed the data endpoint for the data we are pulling. This needs to be updated.

New endpoint is on CKAN here:

https://data.chhs.ca.gov/dataset/women-infants-and-children-wic-authorized-vendors/resource/ee10b67b-2b93-47e7-aa41-cecfbbd32e17

kelfink commented 6 years ago

I've opened a pull request that fixes the endpoint problem and successfully parses locations for mapping. Unfortunately, the new endpoint is not yet enabled for geo queries, so the map shows arbitrary points. Limiting the query to the keyword SACRAMENTO improves the results, but it is still not ideal. CHHS support staff said they would look into fixing the geo-query index.

kelfink commented 6 years ago

I recently worked out with a fellow code4sac'r that queries on the new CHHS CKAN system can be bounded by a box, using something like the following:

According to the CKAN instructions, the query can be written in SQL. An encoded query looks like this (the column name is double-quoted because it is mixed-case): SELECT%20*%20FROM%20%22ee10b67b-2b93-47e7-aa41-cecfbbd32e17%22%20WHERE%20%22Longitude%22%20%3C%20-121.0

An example curl command (the URL is quoted so the shell does not interpret the ? and = characters): curl 'https://data.chhs.ca.gov/api/3/action/datastore_search_sql?sql=SELECT%20%2A%20FROM%20%22ee10b67b-2b93-47e7-aa41-cecfbbd32e17%22%20WHERE%20%22Longitude%22%20between%20-121.494050%20%20and%20-121.480531%20%20AND%20%22Latitude%22%20between%2038.562691%20and%2038.578763%20'

hands back these results:

{"help": "https://data.chhs.ca.gov/api/3/action/help_show?name=datastore_search_sql", "success": true, "result": {"records": [{"City": "SACRAMENTO", "Second Address": " ", "Vendor": "SAFEWAY #2684", "Zip Code": " 95814", "_full_text": "'-121.48590':4 '1814':5 '19th':6 '2684':10 '38.5681380':3 '95814':8 'sacramento':1,2 'safeway':9 'st':7", "Longitude": "-121.48590", "County": " SACRAMENTO", "Address": " 1814 19TH ST", "Latitude": "38.5681380", "_id": 2203}], "fields": [{"type": "int4", "id": "_id"}, {"type": "tsvector", "id": "_full_text"}, {"type": "text", "id": "Vendor"}, {"type": "text", "id": "Address"}, {"type": "text", "id": "Second Address"}, {"type": "text", "id": "City"}, {"type": "text", "id": "Zip Code"}, {"type": "text", "id": "County"}, {"type": "numeric", "id": "Latitude"}, {"type": "numeric", "id": "Longitude"}], "sql": "SELECT * FROM \"ee10b67b-2b93-47e7-aa41-cecfbbd32e17\" WHERE \"Longitude\" between -121.494050 and -121.480531 AND \"Latitude\" between 38.562691 and 38.578763 "}}
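The encoded URL in the curl example can be built in code rather than written by hand. Below is a minimal sketch in JavaScript; the function name buildBoxQueryUrl and the shape of the box argument (west, east, south, north) are hypothetical and not taken from the wicit code base. Only the endpoint URL, the resource id, and the column names come from this thread.

```javascript
// Endpoint and resource id as they appear in the curl example above.
const CKAN_SQL_URL = 'https://data.chhs.ca.gov/api/3/action/datastore_search_sql';
const WIC_RESOURCE_ID = 'ee10b67b-2b93-47e7-aa41-cecfbbd32e17';

// Hypothetical helper: builds a bounding-box query URL for the WIC vendor table.
// `box` is assumed to look like { west, east, south, north } in decimal degrees.
function buildBoxQueryUrl(box) {
  // Coerce to numbers and reject anything non-finite, so that only numeric
  // literals are ever interpolated into the SQL string.
  const values = [box.west, box.east, box.south, box.north].map(Number);
  if (!values.every(Number.isFinite)) {
    throw new TypeError('bounding box values must be finite numbers');
  }
  const [west, east, south, north] = values;

  // Mixed-case column names are double-quoted, as in the curl example.
  const sql =
    `SELECT * FROM "${WIC_RESOURCE_ID}" ` +
    `WHERE "Longitude" BETWEEN ${west} AND ${east} ` +
    `AND "Latitude" BETWEEN ${south} AND ${north}`;

  return `${CKAN_SQL_URL}?sql=${encodeURIComponent(sql)}`;
}
```

Calling buildBoxQueryUrl({ west: -121.494050, east: -121.480531, south: 38.562691, north: 38.578763 }) should produce a URL equivalent to the one in the curl command, apart from whitespace and trailing zeros.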

We'll have to adjust the newly pushed master branch to perform a similar query based on the user's location. It should be roughly equivalent to the query the production site used before. Note that the response JSON is not in the same format as the old endpoint's, so we need to parse it differently.
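As a rough illustration of both steps, here is a sketch in JavaScript. The helper names boxAround and parseVendors, the radius parameter, and the output field names are all hypothetical; the actual wicit code may do this differently. The record keys ("Vendor", "Address", "City", "Zip Code", "Latitude", "Longitude") are taken from the sample response above. The box uses the common approximation of about 111.32 km per degree of latitude.

```javascript
// Approximate length of one degree of latitude, in kilometres.
const KM_PER_DEGREE_LAT = 111.32;

// Hypothetical helper: a lat/lng box extending roughly `radiusKm` from a point.
function boxAround(lat, lng, radiusKm) {
  const dLat = radiusKm / KM_PER_DEGREE_LAT;
  // A degree of longitude shrinks with the cosine of the latitude.
  const dLng = radiusKm / (KM_PER_DEGREE_LAT * Math.cos((lat * Math.PI) / 180));
  return { west: lng - dLng, east: lng + dLng, south: lat - dLat, north: lat + dLat };
}

// Hypothetical helper: turns a parsed datastore_search_sql response body
// into plain objects that a map layer could consume.
function parseVendors(body) {
  if (!body || body.success !== true || !body.result || !Array.isArray(body.result.records)) {
    throw new Error('unexpected CKAN response');
  }
  // Values in the sample response carry stray leading spaces, so trim them.
  const clean = (value) => String(value == null ? '' : value).trim();

  return body.result.records
    .map((record) => ({
      name: clean(record['Vendor']),
      address: clean(record['Address']),
      city: clean(record['City']),
      zip: clean(record['Zip Code']),
      // Coordinates arrive as strings in the sample response.
      lat: parseFloat(record['Latitude']),
      lng: parseFloat(record['Longitude']),
    }))
    // Drop records whose coordinates could not be parsed.
    .filter((vendor) => Number.isFinite(vendor.lat) && Number.isFinite(vendor.lng));
}
```

The box returned by boxAround can then be turned into a BETWEEN query like the one in the curl example, and parseVendors applied to the JSON that comes back.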

kelfink commented 6 years ago

We are live on the new endpoint, using new code to interpret the new responses.