ropensci / osmextract

Download and import OpenStreetMap data from Geofabrik and other providers
https://docs.ropensci.org/osmextract
GNU General Public License v3.0

Error with query #290

Closed StephanLo closed 7 months ago

StephanLo commented 7 months ago

This feels more like a user error than a bug, but the error message asked me to open an issue. I'm sorry if this is not the correct place to post! I can ask in the discussion forum instead if necessary.

Describe issue

I want to download only specific features (specifically, features with the tag 'landuse' equal to 'landfill'). In osmdata, we specify the tag and value. Based on the osmextract vignette, it seems this should be possible with the type of query I specified below, but it returns the error shown. I think this is because I'm doing something wrong, but the error message suggested opening an issue. Any advice on this? Thank you!

test_query <- oe_get(place = "ITS Leeds",
                     query = "WHERE landuse IN ('landfill')")
The input place was matched with: ITS Leeds
##Error: There is an error in the query or in oe_read. Please open a new issue at https://github.com/ropensci/osmextract/issues

Additional context

devtools::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.0 (2023-04-21 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United States.utf8
 ctype    English_United States.utf8
 tz       Asia/Tokyo
 date     2024-04-25
 rstudio  2023.06.2+561 Mountain Hydrangea (desktop)
 pandoc   NA

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.0)
 class         7.3-21  2023-01-23 [2] CRAN (R 4.3.0)
 classInt      0.4-10  2023-09-05 [1] CRAN (R 4.3.3)
 cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.3)
 curl          5.2.1   2024-03-01 [1] CRAN (R 4.3.3)
 DBI           1.2.2   2024-02-16 [1] CRAN (R 4.3.3)
 devtools      2.4.5   2022-10-11 [1] CRAN (R 4.3.1)
 digest        0.6.35  2024-03-11 [1] CRAN (R 4.3.3)
 e1071         1.7-14  2023-12-06 [1] CRAN (R 4.3.3)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
 fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
 fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.3)
 glue          1.7.0   2024-01-09 [1] CRAN (R 4.3.3)
 htmltools     0.5.8   2024-03-25 [1] CRAN (R 4.3.3)
 htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.3.3)
 httpuv        1.6.15  2024-03-26 [1] CRAN (R 4.3.3)
 httr          1.4.7   2023-08-15 [1] CRAN (R 4.3.0)
 jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.2)
 KernSmooth    2.23-20 2021-05-03 [2] CRAN (R 4.3.0)
 later         1.3.2   2023-12-06 [1] CRAN (R 4.3.3)
 lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.3)
 magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
 memoise       2.0.1   2021-11-26 [1] CRAN (R 4.3.0)
 mime          0.12    2021-09-28 [1] CRAN (R 4.3.0)
 miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.3.1)
 osmextract  * 0.5.0   2023-08-10 [1] CRAN (R 4.3.3)
 pkgbuild      1.4.4   2024-03-17 [1] CRAN (R 4.3.3)
 pkgload       1.3.4   2024-01-16 [1] CRAN (R 4.3.3)
 profvis       0.3.8   2023-05-02 [1] CRAN (R 4.3.1)
 promises      1.2.1   2023-08-10 [1] CRAN (R 4.3.3)
 proxy         0.4-27  2022-06-09 [1] CRAN (R 4.3.0)
 purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.3)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
 Rcpp          1.0.12  2024-01-09 [1] CRAN (R 4.3.3)
 remotes       2.5.0   2024-03-17 [1] CRAN (R 4.3.3)
 rlang         1.1.3   2024-01-10 [1] CRAN (R 4.3.3)
 rstudioapi    0.16.0  2024-03-24 [1] CRAN (R 4.3.3)
 s2            1.1.6   2023-12-19 [1] CRAN (R 4.3.3)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
 sf          * 1.0-16  2024-03-24 [1] CRAN (R 4.3.3)
 shiny         1.8.1   2024-03-26 [1] CRAN (R 4.3.3)
 stringi       1.8.3   2023-12-11 [1] CRAN (R 4.3.2)
 stringr       1.5.1   2023-11-14 [1] CRAN (R 4.3.3)
 units         0.8-5   2023-11-28 [1] CRAN (R 4.3.3)
 urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.3.1)
 usethis       2.2.3   2024-02-19 [1] CRAN (R 4.3.3)
 vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.3)
 wk            0.9.1   2023-11-29 [1] CRAN (R 4.3.3)
 xtable        1.8-4   2019-04-21 [1] CRAN (R 4.3.1)

agila5 commented 7 months ago

Hi @StephanLo and thank you very much for raising this issue!

The correct syntax for what you need is something like the following (i.e. you also need to specify the SELECT ... and FROM ... parts of the query):

library(osmextract)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright.
#> Check the package website, https://docs.ropensci.org/osmextract/, for more details.
test_query <- oe_get(
  place = "ITS Leeds",
  query = "SELECT * FROM multipolygons WHERE landuse IN ('landfill')"
)
#> The input place was matched with: ITS Leeds
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading query `SELECT * FROM multipolygons WHERE landuse IN ('landfill')'
#> from data source `C:\Users\user\AppData\Roaming\R\data\R\osmextract\test_its-example.gpkg' 
#>   using driver `GPKG'
#> Simple feature collection with 0 features and 25 fields
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> Geodetic CRS:  WGS 84

Created on 2024-04-25 with reprex v2.0.2

Anyway, I agree that the error message is not that informative and that the logic behind oe_read could be improved, so I will adjust it as soon as possible. Meanwhile, let me know if that fixes your error.
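
For readers who want to check which layer names can follow FROM before writing the query, here is a minimal sketch. It assumes that osmextract and sf are installed, and it uses the small ITS Leeds extract shipped with osmextract, copied into tempdir() so that no download is needed. The `WHERE landuse IS NOT NULL` condition is only an illustration and is not taken from the thread.

```r
library(osmextract)
library(sf)

# Copy the example extract bundled with osmextract into tempdir(), so that
# oe_get() finds it there and skips the download step.
its_pbf <- file.path(tempdir(), "test_its-example.osm.pbf")
file.copy(
  from = system.file("its-example.osm.pbf", package = "osmextract"),
  to = its_pbf,
  overwrite = TRUE
)

# List the layers exposed by the .pbf file. The name used after FROM in the
# query has to be one of these.
layers <- st_layers(its_pbf)
print(layers$name)

# A complete query: SELECT ..., FROM <layer> and, optionally, WHERE ...
landuse_polys <- oe_get(
  place = "ITS Leeds",
  query = "SELECT * FROM multipolygons WHERE landuse IS NOT NULL",
  download_directory = tempdir(),
  quiet = TRUE
)
```

A query that starts directly with WHERE carries no layer name, which is probably why the original call could not be processed.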

StephanLo commented 7 months ago

Thank you for the quick reply @agila5. It works! I just added the following first: extra_tags = "landuse"

An additional question: I noticed that including the query or not doesn't change the size of the downloaded file. So, I guess it downloads the entire OSM dataset for the 'place' and then reads in the features that match the query? From the documentation, it also seems that "query" and "wkt_filter" are applied to the already downloaded .pbf file.

I'm wondering: Is it possible to "filter / query" on the server side, and then only download the matching dataset?

If preferred, I can open another issue for this question.

agila5 commented 7 months ago

> Thank you for the quick reply @agila5. It works!

Great!

> So, I guess it downloads the entire OSM dataset for the 'place', and then reads in the features that match the query?

Yes, exactly. The query and wkt_filter parameters are directly passed to st_read() and they do not interact with the download process.
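
A minimal sketch of what this means in practice, assuming osmextract and sf are installed and again using the bundled ITS Leeds extract copied into tempdir(). The layer is converted to .gpkg once with oe_vectortranslate(), and the filters are then applied by sf::st_read() on the local file. The bounding box coordinates are an assumption chosen to roughly cover the example area; they do not come from this thread.

```r
library(osmextract)
library(sf)

# Local copy of the example extract shipped with osmextract (no download).
its_pbf <- file.path(tempdir(), "test_its-example.osm.pbf")
file.copy(
  from = system.file("its-example.osm.pbf", package = "osmextract"),
  to = its_pbf,
  overwrite = TRUE
)

# Convert the multipolygons layer to .gpkg once. The function returns the
# path of the resulting file.
its_gpkg <- oe_vectortranslate(its_pbf, layer = "multipolygons", quiet = TRUE)

# query: an SQL filter applied while reading the local .gpkg file.
by_query <- st_read(
  its_gpkg,
  query = "SELECT * FROM multipolygons WHERE landuse IS NOT NULL",
  quiet = TRUE
)

# wkt_filter: a spatial filter, also applied at read time.
# The coordinates below are illustrative assumptions.
box <- st_as_sfc(st_bbox(
  c(xmin = -1.57, ymin = 53.80, xmax = -1.55, ymax = 53.81),
  crs = st_crs(4326)
))
by_area <- st_read(
  its_gpkg,
  layer = "multipolygons",
  wkt_filter = st_as_text(box),
  quiet = TRUE
)
```

Both calls read from the same local file, so changing the filter never triggers a new download.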

> I'm wondering: Is it possible to "filter / query" on the server side, and then only download the matching dataset?

I'd be happy to be proven wrong but, AFAICT, that's not possible. Moreover, I would say that the core of osmextract lies in a set of functions for interacting with fixed and complete OSM data extracts (obtained from Geofabrik or several other providers). You need to download the data only once, and then you can extract all the information you want. On the other hand, if you need to "filter" the OSM data before the download step, then I would suggest using osmdata. I hope that answers your question :)

StephanLo commented 7 months ago

Dear @agila5
Understood! Thank you for your kind reply.

I guess the way it functions has to do with the way those providers are set up, so I suppose there is not much scope for controlling the queries from osmextract's side? (I'm wondering if it's worth making a feature request, or is it not feasible anyway?)

In my case the scope is global, but just for one feature type. I don't want to download the whole OSM dataset, so I will use osmdata, generate a grid, and run the query tile by tile to avoid query timeouts. Thanks for the kind assistance, and for the package!

Stephan
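
The tile-by-tile plan described above could look roughly like the following sketch. It assumes the sf package for the grid and the osmdata package (plus network access) for the actual requests. The helper names `make_tiles` and `fetch_landfill`, the example bounding box, and the timeout value are all hypothetical and chosen only for illustration.

```r
library(sf)

# Split a bounding box (named xmin, ymin, xmax, ymax) into an n x n grid and
# return one numeric bbox (xmin, ymin, xmax, ymax) per tile.
make_tiles <- function(bbox, n = 2) {
  area <- st_as_sfc(st_bbox(bbox, crs = st_crs(4326)))
  grid <- st_make_grid(area, n = c(n, n))
  lapply(grid, function(tile) as.numeric(st_bbox(tile)))
}

# Query a single tile for landuse = landfill through the Overpass API.
# Requires the osmdata package and network access.
fetch_landfill <- function(tile_bbox, timeout = 60) {
  q <- osmdata::opq(bbox = tile_bbox, timeout = timeout)
  q <- osmdata::add_osm_feature(q, key = "landuse", value = "landfill")
  osmdata::osmdata_sf(q)
}

# Illustrative bounding box, not taken from the thread.
tiles <- make_tiles(c(xmin = -2, ymin = 53, xmax = -1, ymax = 54), n = 2)

# Not run here, since it sends one request per tile:
# results <- lapply(tiles, fetch_landfill)
```

Smaller tiles keep each request short, at the cost of sending more requests overall.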

Robinlovelace commented 7 months ago

Sounds like the problem is solved :tada: Just to add, though: it should be possible (I'm not sure whether now or in the future) to put together a query from {osmextract} that requests and downloads a .pbf file with the info you need and then reads it in. As far as I recall, the Overpass API does NOT return .pbf files of the type imported by {osmextract}, so we'd have to use something different.

A recent development in this space is Overture Maps, which allows you to download all instances of anything with a single query, as outlined here: https://docs.overturemaps.org/getting-data/more-queries/

agila5 commented 7 months ago

> I guess the way it functions has to do with the way those providers are set up, so I suppose there is not much scope for controlling the queries from osmextract's side? (I'm wondering if it's worth making a feature request, or is it not feasible anyway?)

Please create a new feature request, and we'll work on it if and when it becomes possible (see also @Robinlovelace's comment).

btw: I slightly modified the message for the type of error that you reported, adding a more informative explanation. Please double-check that it's clear.

StephanLo commented 7 months ago

Thank you very much! I will make a feature request.