Closed mpadge closed 7 years ago
Forcing the user to specify what they want at the outset will encourage them not to misunderstand OSM. I think the modularity of `get_points()`, `get_lines()` and `get_polygons()` is good. They can always be combined in a generic function that uses `if` statements to decide which output is best suited to the query.
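A minimal sketch of such a generic wrapper, assuming the three extractors share one input. The stub bodies, the `get_osm()` name and its `type` argument are all hypothetical and only illustrate the `if`-based dispatch; they are not the package's API.

```r
# Hypothetical stand-ins for the three extractors; the real functions would
# parse an overpass response rather than index a plain list.
get_points   <- function (doc) doc$points
get_lines    <- function (doc) doc$lines
get_polygons <- function (doc) doc$polygons

# Hypothetical generic wrapper: `type` names the desired kind of output.
get_osm <- function (doc, type = c ("points", "lines", "polygons"))
{
    type <- match.arg (type) # defaults to "points" when not given
    if (type == "points") {
        get_points (doc)
    } else if (type == "lines") {
        get_lines (doc)
    } else {
        get_polygons (doc)
    }
}

# Toy input standing in for a parsed response
doc <- list (points = c ("n1", "n2"), lines = "w1", polygons = NULL)
res_lines <- get_osm (doc, "lines")
res_default <- get_osm (doc)
```

The user still states what they want, but through a single entry point.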
Operational first draft of merge now done, with a working resolution of this question currently in `process_doc` (lines 19-26), yielding the ultimate form of returned data as
list (
osm_nodes=rcpp_get_points (doc),
osm_ways=rcpp_get_lines (doc),
osm_polygons=rcpp_get_polygons (doc)
)
Question nevertheless remains open and definitely worth considering further ...
And another question for @hrbrmstr: You've got an example in your README of returning an `[out:csv` query as a `data.frame`. The current merged version does not allow this (by forcing `[out:xml` only), but as I see it, all such data will always exist in the `@data` slot of the `sp` objects anyway. Let me know if I have misunderstood and you in fact see a use for being able to directly return a `data.frame` without the intervening spatial guff. Thanks!
`sf` has been released on CRAN, so an option can now be incorporated to enable the return of either `sp` or `sf` objects.
Sweet!
Update: I've set `overpass_query` to only populate `obj` items that contain data.
My latest thinking on this: just as `st_read` in sf now returns the first layer by default if nothing is set, I think the default should be to return the object class with the most elements, but emit a message telling the user what data is being lost and how to retrieve it (e.g. with an argument like `return_list = FALSE` by default, which can be set to `TRUE`).
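A rough sketch of that default, under stated assumptions: `pick_layer()` is a hypothetical helper name, and plain data frames stand in for the `sp`/`sf` layers. Only the `return_list` argument comes from the suggestion above.

```r
# Hypothetical helper: `x` is a named list of extracted layers, each either
# NULL or a data.frame standing in for a spatial object.
pick_layer <- function (x, return_list = FALSE)
{
    if (return_list) return (x)
    # Number of elements per layer, counting NULL as zero
    n <- vapply (x, function (i) if (is.null (i)) 0L else nrow (i),
                 integer (1))
    biggest <- names (n) [which.max (n)]
    dropped <- setdiff (names (n) [n > 0], biggest)
    if (length (dropped) > 0)
        message ("Returning '", biggest, "'; also available: ",
                 paste (dropped, collapse = ", "),
                 ". Use return_list = TRUE to get everything.")
    x [[biggest]]
}

layers <- list (osm_points = data.frame (id = 1:27),
                osm_lines = data.frame (id = 1:6),
                osm_polygons = NULL)
res <- pick_layer (layers)
all_layers <- pick_layer (layers, return_list = TRUE)
```

With these toy counts the points layer is returned, since it has the most elements.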
Slightly tweaked your fix to make sure objects (`points`/`lines`/`polygons`) are returned even if NULL. (And note that an `st_read()`-type approach will not work here because points will always win.) Current `osmdata` class (demonstrated in the README):
q0 <- opq (bbox=c(-0.12,51.51,-0.11,51.52)) # Central London, U.K.
q1 <- add_feature (q0, key='highway', value='secondary')
bu <- overpass_query (q1)
bu
#> Object of class 'osmdata' with:
#> $bbox : 51.51,-0.12,51.52,-0.11
#> $overpass_call : The call submitted to the overpass API
#> $timestamp : [ Thu Dec 1 11:52:33 2016 ]
#> $osm_points : 'sp' SpatialPointsDataFrame with 27 points
#> $osm_lines : 'sp' SpatialLinesDataFrame with 6 lines
#> $osm_polygons : NULL
Notwithstanding potential `sf` migration (see #17), I think this is a pretty reasonable class structure. It'll definitely be type-stable.
Problems?
`plot` method, although this would be possible by, for example, simply plotting everything.
See this comment for some very important thoughts on the appropriateness of `sf` for OSM data.
Closing this now, because I've significantly revised `get-osmdata.cpp` so it processes all data at once (compared to the previous approach of separately processing points, lines, and polygons). The three spatial forms are thus now inseparable, and so `osmdata` has to return a list with all of them, in the class structure shown above.
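To illustrate what a type-stable list of this shape might look like, here is a sketch with a hypothetical constructor. The field names follow the printed object above; `new_osmdata()` is an assumption, data frames stand in for the `sp` objects, and the real package may build the object quite differently.

```r
# Hypothetical constructor: every spatial slot is always present, and is
# NULL when the query returned nothing of that kind.
new_osmdata <- function (bbox, points = NULL, lines = NULL, polygons = NULL)
{
    obj <- list (bbox = bbox,
                 osm_points = points,
                 osm_lines = lines,
                 osm_polygons = polygons)
    class (obj) <- "osmdata"
    obj
}

x <- new_osmdata (bbox = "51.51,-0.12,51.52,-0.11",
                  points = data.frame (id = 1:27),
                  lines = data.frame (id = 1:6))
slots <- names (x)
no_polys <- is.null (x$osm_polygons)
```

Because `list()` keeps elements whose value is NULL, `names (x)` is the same whatever the query returns, which is the sense of "type-stable" used here.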
Fantastic work, this looks like a great solution.
A question for @Robinlovelace and @hrbrmstr: How do we want the user to control what is returned from overpass? The current `overpass_query` relies, through `process_doc`, on the query being precise enough that it will only return the lowest desired OSM objects (`node->way->rel`), while `osmdatar` currently has the three functions `get-points`, `get-lines`, and `get-polygons`. The two approaches now need to be reconciled within `osmdata`, so I'll begin with an argument for my current approach:
The `overpass` approach relies on the queries being very precise, which they may not always be.
`way` members rather than the multipolygon relation, yet the extension of `process_doc` to include processing of polygons would return just the latter and not the former.
`osmdatar` approach, yet not the `overpass` approach.
`bus_station` will return only those mapped as polygons, yet will miss all purely nodal stations.
`overpass` approach (implicitly extended to polygon extraction) will not allow a map of `bus_stations` as simple points, because a query for `bus_stations` will return the highest hierarchical objects, which will be (polygonal) ways (or maybe even multipolygons).
Having stated the essence of my thoughts on the matter, I of course acknowledge that Bob's `overpass` approach is superior in one important way: it obviates any need for users to specify a desired kind of output. This would be fantastic in a perfectly tidy OSM world, but I fear that the inherent messiness of everything actually requires the user to exert some degree of control, the simplest form of which is surely specifying points, lines, or polygons. Thoughts?
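The bus station case can be made concrete with a toy example. The data below are invented purely for illustration, and `get_by_type()` is a hypothetical helper, not a function from either package.

```r
# Invented toy data: three bus stations, one mapped as a node and two as
# closed ways (polygons).
stations <- data.frame (
    id = c ("n1", "w1", "w2"),
    osm_type = c ("node", "way", "way"),
    amenity = "bus_station",
    stringsAsFactors = FALSE)

# Hypothetical helper: keep only the elements of one OSM type
get_by_type <- function (x, osm_type)
    x [x$osm_type == osm_type, , drop = FALSE]

pts <- get_by_type (stations, "node")
polys <- get_by_type (stations, "way")
```

Returning only `polys` would miss station `n1`, while asking explicitly for points or polygons keeps the choice with the user.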