ropensci / osmdata

R package for downloading OpenStreetMap data
https://docs.ropensci.org/osmdata
316 stars 46 forks source link

Server-side filtering by area #286

Closed jmaspons closed 1 year ago

jmaspons commented 1 year ago

Addresses #256. It allows passing areas to bbox parameter in opq described by osm type and id (eg. relation(id:1, 2, 3); way(id:2)). New functions getarea and area_to_string help to generate the areas, and they follow the patterns of getbb and bbox_to_sting.

Despite that the implementation works, there are some naming inconsistencies which require to break the API to fix them, hopefully in a backward compatible way. Bbox are areas, but areas are not always bbox. To fix the inconsistencies, I would:

If agreed about the API break, I can make another PR afterwards.

jmaspons commented 1 year ago

https://github.com/ropensci/osmdata/pull/286/commits/c876e9064c267249a40073cfb8a34e640532f637 should be cherry-picked to main if the PR is not merged

mpadge commented 1 year ago

Thanks a heap @jmaspons for all the work you've put into this. It definitely expands functionality in a very useful way, and i would like to work towards merging the PR. One initial comment is, however, that I don't think renaming "bbox" to "area" is desirable or practical at all. "bbox" is used throughout the general spatial ecosystem of R, and it is also exactly what the overpass API itself refers to:

The 'bounding box' defines the map area that the query will include.

The area of a query refers to the set of all data returned by the query within the bounding box, while the bounding box is the external parameter defining the query itself. That latter terminology must also be retained in opq()/getbb(). In other words, a "bbox" is a parameter of a query, whereas an "area" refers to the contents returned by a query. That distinction must be maintained here, such that a "bbox" should never be considered equivalent to an "area". Does that make sense?

In the context of this package, and of your proposed changes here, the function getbb() is used to pass some stuff to Nominatim, and to return a defining "bbox" parameter to pass to an overpass query. So that function must be considered to return a "bbox", and not an "area". Renaming it would only cause confusion, especially whenever anybody tried to understand what this "area" parameter was in the context of the overpass API , where they would just encounter references to "bounding boxes". So: "bbox" must remain, I'm afraid, and the renaming you've proposed to "area" must be reverted.


Other aspects can all stay for the moment, such as get_nominatim_query instead of get_bb_query, which is indeed a good suggestion.


Thanks :smile:

jmaspons commented 1 year ago

Thanks for the feedback! I don't have strong opinions about the nomenclature, but somehow we have to distinguish the two current forms to crop the data: the bounding box as defined by four coordinates, and the area to crop using an arbitrary area using the overpass function map_to_area. Some grep could be enough while keeping the different data formats to opq$bbox.

The less invasive way could be to add a new format_out mode to getbb (eg. osm_type_id), remove the getarea and area_to_string functions and adapt bbox_to_string to the new format. Is that fine for you?

mpadge commented 1 year ago

The less invasive way could be to add a new format_out mode to getbb (eg. osm_type_id), remove the getarea and area_to_string functions and adapt bbox_to_string to the new format. Is that fine for you?

That sounds like a fine solution :+1:

jmaspons commented 1 year ago

The less invasive way could be to add a new format_out mode to getbb (eg. osm_type_id), remove the getarea and area_to_string functions and adapt bbox_to_string to the new format. Is that fine for you?

That sounds like a fine solution +1

Changes done and rebased

jmaspons commented 1 year ago

Can you please just change opq.RL#857 from if (!quiet) to if (!quiet && !map_to_area), to suppress that message for map_to_area queries?

Sure, but the bbox can be as big and the query as heavy as with a squared bounding box. Still?

mpadge commented 1 year ago

Can you please just change opq.RL#857 from if (!quiet) to if (!quiet && !map_to_area), to suppress that message for map_to_area queries?

Sure, but the bbox can be as big and the query as heavy as with a squared bounding box. Still?

Yeah, good point. I guess then leave as is, if you think that's advisable.

mpadge commented 1 year ago

Then i think just one more requst @jmaspons: Could you please add yourself as "ctb" in the 'DESCRIPTION' file? Thanks

jmaspons commented 1 year ago

Done! Thanks :)

mpadge commented 1 year ago

Then one more important question before i merge: Are you willing to help maintain this code for the foreseeable future? I hope so! I'm of course happy to contribute, but can't always guarantee that i'll have the time, so it's always good for large external contributions like this to have the original authors "on board" at all times.

jmaspons commented 1 year ago

I hope so. I will use the package, so I'm interested in it working well

mpadge commented 1 year ago

Thank you for the great contribution!