hcarter333 / datasette-enrichments-gmap-geocode


Thanks for the plugin! #9

Open mdav43 opened 8 months ago

mdav43 commented 8 months ago

Hey - just a note to say thanks for the plugin.

I'm in the grey zone where I'm not a programmer but love to play around with tools like Datasette.

Any chance we could use a cache for repeated addresses? If I had the skills, it seems like an opportunity to save a "query_address" / "results" cache table of some kind.

hcarter333 commented 8 months ago

Apologies for not seeing this sooner and thank you!

I need to double-check, but I think the nature of enrichments is that the new columns (lat and lng in this case) are stored in the SQLite database. If that's true, a cache is already implemented. Then, as far as adding new rows and finding their locations, I believe you can just apply the enrichment to the rows returned by a query that filters down to rows that don't already have values for lat and lng (e.g. where lat is null).

mdav43 commented 8 months ago

I guess I was thinking along the lines of: if you had an enrichment that ran over 5 rows, and all 5 rows had an identical address, then when you run the enrichment, e.g. "{{ my_address }}, CA, United States", it would only hit the Google API once for that enrichment.

You can just filter for distinct addresses, run the enrichment, and then update the remaining rows, but I thought it would be easier to implement a cache to speed up the process. (I had a table of 5,000 rows that I was running this over, so I will give it a try.)
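Roughly what I mean, as an untested sketch (httpx is what the plugin already uses; my_address, geocode_distinct, and the row shape are placeholders I made up): geocode each distinct address once, then fan the result back out to every row that shares it.

import httpx

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

async def geocode_distinct(rows, api_key):
    # One Google API call per unique address, shared across duplicate rows.
    results = {}
    async with httpx.AsyncClient() as client:
        for address in {row["my_address"] for row in rows}:
            response = await client.get(
                GEOCODE_URL, params={"address": address, "key": api_key}
            )
            results[address] = response.json()
    # Pair every original row with the (possibly shared) geocoding result.
    return [(row, results[row["my_address"]]) for row in rows]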

From my limited understanding, you could also just cheat a little and cache the geocoding call, along the lines of the below (relevant code block here). The functools @cache decorator itself won't work on an async function, since it would cache the single-use coroutine and can't hash the params dict, so the sketch below keys a plain dict instead.

....
async with httpx.AsyncClient() as client:
    response = await fetch_geocoding_data(client, url, params)
....

# Module-level dict standing in for @cache: functools' @cache would store
# the coroutine object (which can only be awaited once) and fails on the
# unhashable params dict, so we build a hashable key by hand instead.
_geocode_cache = {}

async def fetch_geocoding_data(client, url, params):
    key = (url, tuple(sorted(params.items())))
    if key not in _geocode_cache:
        _geocode_cache[key] = await client.get(url, params=params)
    return _geocode_cache[key]
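One caveat with an in-memory cache like the above: it only lives as long as the process, so restarting Datasette starts from zero. Persisting it in SQLite, like the "query_address" / "results" table idea from my first comment, would survive restarts. A rough sketch (untested; geocode_cache and the helper names are made up):

import json
import sqlite3

def _ensure_table(conn):
    conn.execute(
        "create table if not exists geocode_cache"
        " (query_address text primary key, results text)"
    )

def cache_lookup(db_path, address):
    # Return a previously stored geocoding result, or None on a miss.
    conn = sqlite3.connect(db_path)
    _ensure_table(conn)
    row = conn.execute(
        "select results from geocode_cache where query_address = ?",
        (address,),
    ).fetchone()
    conn.close()
    return json.loads(row[0]) if row else None

def cache_store(db_path, address, results):
    # insert or replace, so re-running an enrichment refreshes stale entries.
    conn = sqlite3.connect(db_path)
    _ensure_table(conn)
    conn.execute(
        "insert or replace into geocode_cache (query_address, results)"
        " values (?, ?)",
        (address, json.dumps(results)),
    )
    conn.commit()
    conn.close()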