ropensci / osmdata

R package for downloading OpenStreetMap data
https://docs.ropensci.org/osmdata

Accept vectors for values inside add_osm_feature #139

Closed JimShady closed 5 years ago

JimShady commented 6 years ago

Feature request: have the function add_osm_feature accept vectors for value variables. For example:

my_bbox <- c(18.28387, 43.79773, 18.42578, 43.90008)

# OSM link-road values are written with underscores (e.g. 'primary_link')
roads_to_import <- c('primary', 'secondary', 'motorway', 'trunk', 'tertiary', 'residential',
                     'primary_link', 'trunk_link', 'motorway_link', 'secondary_link')

roads <- opq(bbox = my_bbox) %>%
    add_osm_feature(key = 'highway', value = roads_to_import) %>%
    osmdata_sf()
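
For illustration, one way a vector of values could be folded into a single query filter is to join the values into one anchored regular expression. This is only a sketch in base R: `values_to_filter` is a hypothetical helper, not part of osmdata, and the string it builds follows the `["key"~"regex"]` filter form of Overpass QL. How osmdata actually handles vectors internally may differ.

```r
# Hypothetical helper (not an osmdata function): collapse a vector of values
# into one anchored alternation and wrap it in an Overpass-style filter.
values_to_filter <- function(key, values) {
    pattern <- paste0("^(", paste(values, collapse = "|"), ")$")
    paste0("[\"", key, "\"~\"", pattern, "\"]")
}

cat(values_to_filter("highway", c("primary", "secondary", "motorway_link")), "\n")
# ["highway"~"^(primary|secondary|motorway_link)$"]
```
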
JimShady commented 6 years ago

Nice one @mpadge . Though I now need to re-write some of my code dammit ;-)

loreabad6 commented 5 years ago

Hi @mpadge! Thank you for adding this handy functionality! Just one extra request on my part: can the vector also include negated values? When I include them in the vector now, they are just ignored; if I add an extra add_osm_feature line with the negated values, I get an error. So what I need to do is add the negated values one by one.

I mainly want to do this because some of the returned values are not exactly what I want, and so they just clutter my data with unwanted things.

Let me illustrate it with an example:

amenity_values <- c('college', 'community_centre', 'dentist', 'clinic', 'doctors', 'hospitals',
                    'hospital', 'park', 'pharmacy', 'school', 'social_facility', 'bus station',
                    'university')

amenities <- opq(bbox = osm_bb) %>%
    add_osm_feature(key = 'amenity', value = amenity_values, value_exact = TRUE) %>%
    osmdata_sf() %>%
    unique_osmdata()

amenities$osm_points$amenity %>% unique()

Using the points data as an example, gives me:

 [1] parking          bicycle_parking  school           pharmacy         driving_school
 [6] doctors          dentist          community_centre college          social_facility
[11] hospital         parking_entrance

I want to get rid of the parking values, because they are not really what I want, so I try:

amenity_values2 <- c('college', 'community_centre', 'dentist', 'clinic', 'doctors', 'hospitals',
                     'hospital', 'park', 'pharmacy', 'school', 'social_facility', 'bus station',
                     'university', '!parking', '!bicycle_parking', '!parking_entrance')

amenities <- opq(bbox = osm_bb) %>%
    add_osm_feature(key = 'amenity', value = amenity_values2, value_exact = TRUE) %>%
    osmdata_sf() %>%
    unique_osmdata()

amenities$osm_points$amenity %>% unique()

Which gives me the exact same as above.

Then I try:

neg_amenity_values <- c('!parking', '!bicycle_parking', '!parking_entrance')

amenities <- opq(bbox = osm_bb) %>%
    add_osm_feature(key = 'amenity', value = amenity_values, value_exact = TRUE) %>%
    add_osm_feature(key = 'amenity', value = neg_amenity_values, value_exact = TRUE) %>%
    osmdata_sf() %>%
    unique_osmdata()

Which gives me this error:

Error in slot(dat$osm_points, "data") : cannot get a slot ("data") from an object of type "NULL"


So I have to add a separate add_osm_feature line for each negated value!

I hope this can be added! For now I will just carry on with my workaround.

mpadge commented 5 years ago

Hi @loreabad6, thanks for taking an interest and jumping in here. What you actually found was a bug which the above commit should have fixed. If you run your query again, you should only see the value entries you specify, and all else should be filtered out. I don't know what bbox you were using, but for example:

> osm_bb <- "city of london"
> amenity_values <- c('college','community_centre','dentist',
                     'clinic','doctors','hospitals','hospital','park','pharmacy',
                     'school','social_facility','bus station','university')
> amenities <- opq(bbox = osm_bb) %>%
    add_osm_feature(key ='amenity', value = amenity_values, value_exact = T) %>%
    osmdata_sf(quiet = FALSE) %>%
    unique_osmdata()
> amenities$osm_points$amenity %>% unique()
[1] NA                 "college"          "university"       "school"           "pharmacy"         "hospital"        
[7] "doctors"          "dentist"          "community_centre" "clinic"           "social_facility" 

Note also that these expressions are regex'ed, so you can just, for example, use value = "hospital*" to get "hospital", "hospitals", and anything else that starts with "hospital".
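
As a quick local check of what such a pattern matches, the sketch below uses base R's `grepl`. It assumes the Overpass regex match behaves like an unanchored extended regular expression, which has not been verified here against the server; the value "animal_hospital" is made up for the example. Read as a regex, "hospital*" is slightly broader than a prefix match, while "^hospital" restricts the match to values that start with "hospital".

```r
# Example values; "hospita" and "animal_hospital" are invented test cases
values <- c("hospital", "hospitals", "hospita", "animal_hospital", "school")

# "hospital*" means "hospita" followed by zero or more "l", matched anywhere
loose <- grepl("hospital*", values)
print(loose)
# TRUE TRUE TRUE TRUE FALSE

# "^hospital" only matches values that begin with "hospital"
prefix <- grepl("^hospital", values)
print(prefix)
# TRUE TRUE FALSE FALSE FALSE
```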

With this bug fix, I don't think there's any need to combine positive and negative values, because you can get only the positive matches as I show above.

Note on vectorizing negative matches

This still doesn't make it straightforward to vectorize negative matches. If you wanted all amenity matches except those listed above, you'd have to do something like this:

amenity_values <- paste0 ("!^(", paste0 (amenity_values, collapse = "|"), ")$")

and then proceed as shown. I don't think this is a particularly good idea, however, because the overpass API interprets a negated value as a query for everything other than the specifically negated key-value pair, so it will also return all objects which do not have that key at all. (That's why the NA appears in the amenity values shown above.) This will then effectively be a request for almost everything. In short: add_osm_feature is written to accept single negative values by simply having value = "!this_value", but I don't currently see any need to extend this to vectors of negated values.
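
To make the shape of that negated pattern concrete, here is a small base R sketch with a shortened value list. The `tags` vector is invented sample data, and the `!grepl()` step is only a local stand-in for the server-side negated match, under the assumption that an object without the key (NA here) passes a negated filter as described above.

```r
amenity_values <- c("college", "school", "pharmacy")

# Anchored alternation, then the leading "!" that marks negation
inner <- paste0("^(", paste0(amenity_values, collapse = "|"), ")$")
negated <- paste0("!", inner)
cat(negated, "\n")
# !^(college|school|pharmacy)$

# Local stand-in for the negated match; NA marks an object with no amenity tag
tags <- c("school", "parking", NA, "pharmacy", "cafe")
keep <- !grepl(inner, tags)
print(tags[keep])
# "parking" NA "cafe"
```

Everything except the three listed values survives, including the untagged entry, which is why such a query can grow so large.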

loreabad6 commented 5 years ago

Thank you @mpadge! Of course that makes sense; I was thinking afterwards that the value_exact argument should be helping me with that. I ran the code again and it works great! Thank you! And thanks also for the tips, the regex expressions work really well. Sorry I forgot to include the boundary, I just copied part of my function (in case you are curious, it was Gouda, Netherlands).

On the note about vectorizing negative matches, I really appreciate your explanation; it makes it clear to me how the overpass API works. I understand that this will generate an awfully large amount of data, so maybe I can filter afterwards, since my next step is to get all the amenities except for the ones listed in the amenity_values vector. But I will keep developing that idea and see how it goes.
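
Filtering afterwards could look roughly like the following base R sketch. The `points` data frame is a toy stand-in for something like `amenities$osm_points`, and the assumption that the real object can be subset by rows in the same way has not been checked here.

```r
# Toy stand-in for the downloaded points (assumed to behave like a data frame)
points <- data.frame(
    osm_id = c("1", "2", "3", "4", "5"),
    amenity = c("school", "parking", NA, "cafe", "pharmacy"),
    stringsAsFactors = FALSE
)
amenity_values <- c("college", "school", "pharmacy")

# Keep rows that have an amenity tag which is not one of the listed values
other <- points[!is.na(points$amenity) & !(points$amenity %in% amenity_values), ]
print(other$amenity)
# "parking" "cafe"
```

Dropping the NA rows explicitly avoids keeping objects that carry no amenity tag at all.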

Thanks again for the package, I am using it a bit more now, getting more familiar with it and I find it very useful! I bumped into a couple of things here and there that might be an issue, so I'll keep exploring them and let you know! 😉