ropensci / opencage

:globe_with_meridians: R package for the OpenCage API -- both forward and reverse geocoding :globe_with_meridians:
https://docs.ropensci.org/opencage

oc_forward and oc_reverse output to data frame #40

Closed jessesadler closed 6 years ago

jessesadler commented 6 years ago

oc_forward and oc_reverse return their output as a list of lists. For an oc_forward_df function, there needs to be a way to go from the list to a data frame of results. This would build on the opencage_format, but possibly find a cleaner way to do it. 🤞 I will try to work on this and report back.

maelle commented 6 years ago

Ok so you like the current output format of opencage_forward for instance?

jessesadler commented 6 years ago

It turns out there is a simple way to get from the current output of oc_forward and oc_reverse to a data frame of results through jsonlite: change simplifyVector = FALSE to flatten = TRUE. With this change the output is still a list of lists, but the result list contains a data frame, so there is no need to mess around with either the apply or map family.
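A minimal sketch of the difference between the two jsonlite settings. The JSON below is a hypothetical, heavily trimmed response made up for illustration, not real OpenCage output, and the object names are assumptions.

```r
library(jsonlite)

# Hypothetical, trimmed response; field values are invented for illustration.
json <- '{
  "results": [
    {"formatted": "Place A",
     "geometry": {"lat": 1.5, "lng": 2.5},
     "bounds": {"northeast": {"lat": 2, "lng": 3},
                "southwest": {"lat": 1, "lng": 2}}},
    {"formatted": "Place B",
     "geometry": {"lat": 10.5, "lng": 20.5},
     "bounds": {"northeast": {"lat": 11, "lng": 21},
                "southwest": {"lat": 10, "lng": 20}}}
  ],
  "total_results": 2
}'

# simplifyVector = FALSE keeps everything as nested lists.
nested <- fromJSON(json, simplifyVector = FALSE)

# flatten = TRUE (with the default simplification) turns the array of
# records into a data frame and flattens nested records into columns.
flat <- fromJSON(json, flatten = TRUE)

class(nested$results)
class(flat$results)
names(flat$results)
```

In both cases the top-level object is still a list; only the `results` element changes shape.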

dpprdan commented 6 years ago

Some devil in the back of my mind tells me that there is an issue with using flatten = TRUE, but I cannot remember what it was (or whether there really is a potential problem with it generally and whether that applies to our scenario). Maybe it is just the first paragraph here? I'd say we just go with it now, if no one objects.

maelle commented 6 years ago

Sounds good. If we see an issue, we'll repair it and add a unit test.

jessesadler commented 6 years ago

I tried to do the unnesting of the lists with purrr, but I could not get as nice an output as with flatten = TRUE. One issue is that purrr functions drop the list names as you flatten levels. This results in multiple columns called “lat” and “lng” due to the bounds list, which is suboptimal. flatten = TRUE solves the column naming problem by stringing the names of the lists together with “.”, like in the original opencage_ format, and this can be cleaned up with some regex.
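The naming problem can be reproduced in a small sketch. This uses base R as a stand-in and is not the purrr code that was tried; the `bounds` list is a hypothetical example.

```r
# Hypothetical bounds element from one result.
bounds <- list(
  northeast = list(lat = 2, lng = 3),
  southwest = list(lat = 1, lng = 2)
)

# Flattening one level after dropping the outer names leaves only the
# inner names, so "lat" and "lng" each appear twice.
dropped <- unlist(unname(bounds), recursive = FALSE)
names(dropped)

# Keeping the outer names joins them to the inner names with ".",
# which is the same naming scheme flatten = TRUE produces.
joined <- unlist(bounds, recursive = FALSE)
names(joined)
```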

maelle commented 6 years ago

Regarding name cleaning, you could have a look at the code of janitor::clean_names (and if we end up using the code rather than taking on a dependency, we could do as written here).

jessesadler commented 6 years ago

I got the column names to clean up with some pretty simple regex.
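As an illustration only, a cleanup along these lines might look like the sketch below. The actual regex used in the pull request is not shown in this thread, so the `clean_cols` helper and the sample column names are assumptions.

```r
# Column names of the kind flatten = TRUE can produce (hypothetical sample).
cols <- c("formatted", "geometry.lat", "geometry.lng",
          "bounds.northeast.lat", "components._type")

# Hypothetical helper: replace dots with underscores, then collapse
# any run of underscores into a single one.
clean_cols <- function(x) {
  x <- gsub("\\.", "_", x)
  x <- gsub("_+", "_", x)
  x
}

cleaned <- clean_cols(cols)
cleaned
```

Because the full path is kept in each name, the cleaned names stay unique, unlike names reduced to their last component.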

jessesadler commented 6 years ago

Closed with pull request #58