Closed katrinleinweber closed 3 weeks ago
What's the best R package for geocoding?
I would say https://cran.r-project.org/web/packages/tidygeocoder/index.html although I haven't used it extensively
EDIT
I am not sure if it returns the bounding box, which is required in the coords object and probably in the Twitter API
EDIT2
It can, with geo(full_results = TRUE), so it seems to be a good option. Imports tibble, dplyr, httr, jsonlite
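A minimal sketch of that call, assuming {tidygeocoder} is installed and the machine has network access. The wrapper name lookup_full is hypothetical, and the extra columns returned with full_results = TRUE depend on the geocoding service, so it is worth inspecting them before relying on any bounding-box field.

```r
# Hypothetical wrapper around tidygeocoder::geo(); not part of {rtweet}.
# full_results = TRUE asks for every field the service returns,
# not only the latitude and longitude columns.
lookup_full <- function(address, method = "osm") {
  tidygeocoder::geo(
    address = address,
    method = method,
    full_results = TRUE
  )
}

# Not run here (needs network access):
# res <- lookup_full("Times Square, New York")
# names(res)  # check which columns this service provides
```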
This looks like a good candidate
https://docs.ropensci.org/opencage/
EDIT: No Google Maps support, dedicated to a single provider
OK, so I have been doing some research and have found some interesting things that may impact some of the issues related to lookup_coords:
bounds
This is described in the API docs: https://developers.google.com/maps/documentation/geocoding/overview#results
bounds (optionally returned) stores the bounding box which can fully contain the returned result.
This is the variable used in lookup_coords.
Potential alternative: using the viewport variable. From the API docs:
viewport contains the recommended viewport for displaying the returned result, specified as two latitude,longitude values defining the southwest and northeast corner of the viewport bounding box. Generally the viewport is used to frame a result when displaying it to a user.
See an example that returns both bounds and viewport:
https://developers-dot-devsite-v2-prod.appspot.com/maps/documentation/utils/geocoder#q%3DUnited%2520States%2520of%2520America
Note the difference between the viewport (mainland USA) and bounds (which also includes Hawaii, Alaska, etc.). That is exactly what these lines try to do:
https://github.com/ropensci/rtweet/blob/f45b9b3e20275aef6171f6f109ab6e2dba89aa7c/R/coords.R#L61-L72
Here is an example of a query that does not return bounds. This seems to be the case for narrower searches (zoom out to see the viewport, blue line):
https://developers-dot-devsite-v2-prod.appspot.com/maps/documentation/utils/geocoder#q%3DTimes%2520Square%2520NY
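Since bounds is optional and viewport seems to be present in these examples, one option would be to fall back from one to the other. The following is only a sketch: pick_bbox is a hypothetical helper, not what lookup_coords does today, and it assumes one result of the Google response has been parsed into a nested list (for instance with jsonlite::fromJSON(..., simplifyVector = FALSE)). The output ordering (SW lng, SW lat, NE lng, NE lat) is also an assumption.

```r
# Hypothetical helper: choose a bounding box from one parsed Google
# Geocoding result, preferring `bounds` and falling back to `viewport`.
pick_bbox <- function(result) {
  geometry <- result$geometry
  box <- geometry$bounds
  used <- "bounds"
  if (is.null(box)) {
    # `bounds` is optional in the API response, so try `viewport`
    box <- geometry$viewport
    used <- "viewport"
  }
  if (is.null(box)) {
    stop("result has neither `bounds` nor `viewport`")
  }
  list(
    used = used,
    box = c(
      sw_lng = box$southwest$lng,
      sw_lat = box$southwest$lat,
      ne_lng = box$northeast$lng,
      ne_lat = box$northeast$lat
    )
  )
}

# Illustrative input only: a narrow result without `bounds`.
# The coordinates are made up for the example, not copied from the API.
narrow <- list(geometry = list(
  location = list(lat = 40.7580, lng = -73.9855),
  viewport = list(
    northeast = list(lat = 40.7593, lng = -73.9842),
    southwest = list(lat = 40.7566, lng = -73.9869)
  )
))

bbox <- pick_bbox(narrow)
print(bbox$used)
print(bbox$box)
```

Preferring bounds keeps the current behaviour for queries that return it, and viewport only kicks in for the narrower ones.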
In #391 I added an alternative for geocoding using Nominatim, which does not require an API key and seems to be reliable enough.
Following @hadley's suggestion, I did some research (and put out a call to the rspatial community on Twitter, https://twitter.com/dhernangomez/status/1365676793299148803?s=20) and so far it seems to me that https://github.com/jessecambon/tidygeocoder could be the best alternative for the {rtweet} package if this is the preferred way forward.
The function geo allows the user to use several geocoders (including Google and Nominatim), and would be easy to implement. Some adjustments to the environment variables of both packages would be necessary.
Update: {tidygeocoder} v1.0.3 now supports 12 geocoding services, including all the major ones: see https://jessecambon.github.io/tidygeocoder/articles/geocoder_services.html. At least OSM and ArcGIS have global coverage without the need for an API key. Ping @jessecambon
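As a rough idea of how the two keyless services could be combined, here is a sketch that tries them in order. geocode_with_fallback and its geocoder argument are hypothetical names; the default assumes {tidygeocoder} is installed and that geo() returns its default lat and long columns, with NA when nothing is found.

```r
# Hypothetical helper: try each geocoding service in turn and return the
# first result with non-missing coordinates.
# `geocoder` defaults to tidygeocoder::geo but can be swapped out,
# e.g. for offline testing.
geocode_with_fallback <- function(address,
                                  methods = c("osm", "arcgis"),
                                  geocoder = tidygeocoder::geo) {
  for (m in methods) {
    res <- geocoder(address = address, method = m)
    found <- nrow(res) > 0 && !is.na(res$lat[1]) && !is.na(res$long[1])
    if (found) {
      res$method <- m  # record which service answered
      return(res)
    }
  }
  stop("no service returned coordinates for: ", address)
}

# Not run here (needs network access):
# geocode_with_fallback("Times Square, New York")
```

Keeping the geocoder as an argument would also make it easier to switch to another package later if that turns out to be preferable.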
I think there are ways to improve this function (using viewport, moving to other free geocoders, fallbacks, using other packages...) but I am not sure if this is a priority right now for {rtweet}.
I would be happy to help if needed, but it seems to me that it would require some work, so for now I would leave it as is. If you want me to help, just ping me!
Yeah, it is not a priority, so I will leave it as is for a while. I lean towards Nominatim, the one from OpenStreetMap. I am not sure which package would be better, but when we settle on this we'll discuss it.
This package has been archived; you can request for it to be unarchived if you opt to resume maintenance, for that please contact rOpenSci.
Problem
lookup_coords currently relies on Google Maps and a few hard-coded coordinates (#261).
Expected behavior
Since there are several other options, would it be feasible to implement at least one of them, so that people can choose not to depend on Google?