r-spatial / qgisprocess

R package to use QGIS processing algorithms
https://r-spatial.github.io/qgisprocess/
GNU General Public License v3.0

st_as_sf() method for qgis_result may miss the OUTPUT element #76

Closed by florisvdh 2 years ago

florisvdh commented 2 years ago

Interestingly, st_as_sf() can be used to process a qgis_result object in order to extract the resulting spatial geometry file or layer. Some algorithms provide more than one output. At least in the case below, the first output is not the primary one, which makes st_as_sf() return a different layer than the desired one:

library(qgisprocess)
#> Using 'qgis_process' at '/usr/bin/qgis_process'.
#> QGIS version: 3.22.2-Białowieża
#> Configuration loaded from '~/.cache/R-qgisprocess/cache-0.0.0.9000.rds'
#> Run `qgis_configure()` for details.
library(sf)
#> Linking to GEOS 3.9.1, GDAL 3.3.2, PROJ 7.2.1

curl::curl_download(
  "https://github.com/qgis/QGIS/raw/master/resources/data/world_map.gpkg",
  "world_map.gpkg"
)

result <- 
  qgis_run_algorithm(
    "native:extractbyattribute",
    FIELD = "NAME",
    INPUT = "world_map.gpkg|layername=countries",
    OPERATOR = 0,
    VALUE = "Norway"
  )
#> Using `OUTPUT = qgis_tmp_vector()`
#> Using `FAIL_OUTPUT = qgis_tmp_vector()`
#> Running /usr/bin/qgis_process run 'native:extractbyattribute' \
#>   '--INPUT=world_map.gpkg|layername=countries' '--FIELD=NAME' '--OPERATOR=0' \
#>   '--VALUE=Norway' \
#>   '--OUTPUT=/tmp/RtmpsW8OE7/filee48a47b052c/filee48a630063bb.gpkg' \
#>   '--FAIL_OUTPUT=/tmp/RtmpsW8OE7/filee48a47b052c/filee48a76f8817e.gpkg'
#> qt5ct: using qt5ct plugin
#> 
#> ----------------
#> Inputs
#> ----------------
#> 
#> FAIL_OUTPUT: /tmp/RtmpsW8OE7/filee48a47b052c/filee48a76f8817e.gpkg
#> FIELD:   NAME
#> INPUT:   world_map.gpkg|layername=countries
#> OPERATOR:    0
#> OUTPUT:  /tmp/RtmpsW8OE7/filee48a47b052c/filee48a630063bb.gpkg
#> VALUE:   Norway
#> 
#> 
#> 0...10...20...30...40...50...60...70...80...90...
#> ----------------
#> Results
#> ----------------
#> 
#> FAIL_OUTPUT: /tmp/RtmpsW8OE7/filee48a47b052c/filee48a76f8817e.gpkg
#> OUTPUT:  /tmp/RtmpsW8OE7/filee48a47b052c/filee48a630063bb.gpkg

str(result, 1)
#> List of 6
#>  $ FAIL_OUTPUT     : 'qgis_outputVector' chr "/tmp/RtmpsW8OE7/filee48a47b052c/filee48a76f8817e.gpkg"
#>  $ OUTPUT          : 'qgis_outputVector' chr "/tmp/RtmpsW8OE7/filee48a47b052c/filee48a630063bb.gpkg"
#>  $ .algorithm      : chr "native:extractbyattribute"
#>  $ .args           :List of 6
#>  $ .raw_json_input : NULL
#>  $ .processx_result:List of 4
#>  - attr(*, "class")= chr "qgis_result"

st_as_sf(result)
#> Simple feature collection with 239 features and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -179.9 ymin: -89.9 xmax: 179.9 ymax: 83.6341
#> Geodetic CRS:  WGS 84
#> # A tibble: 239 × 7
#>    iso_a2 NAME     FIPS_10_ ISO_A3 WB_A2 WB_A3                              geom
#>    <chr>  <chr>    <chr>    <chr>  <chr> <chr>                <MULTIPOLYGON [°]>
#>  1 SE     Sweden   SW       SWE    SE    SWE   (((15.70533 56.1164, 15.7269 56.…
#>  2 DE     Germany  GM       DEU    DE    DEU   (((6.798106 53.60444, 6.722423 5…
#>  3 NL     Netherl… NL       NLD    NL    NLD   (((-68.21154 12.22809, -68.19001…
#>  4 RU     Russia   RS       RUS    RU    RUS   (((47.99708 45.46784, 47.99684 4…
#>  5 KH     Cambodia CB       KHM    KH    KHM   (((103.3342 10.58393, 103.3359 1…
#>  6 HR     Croatia  HR       HRV    HR    HRV   (((17.65895 42.74299, 17.74887 4…
#>  7 MM     Myanmar  BM       MMR    MM    MMR   (((98.05747 9.796861, 98.01832 9…
#>  8 VN     Vietnam  VM       VNM    VN    VNM   (((106.6626 8.739814, 106.6392 8…
#>  9 CA     Canada   CA       CAN    CA    CAN   (((-54.65689 49.46572, -54.68545…
#> 10 US     United … US       USA    US    USA   (((-155.9121 19.09619, -155.9199…
#> # … with 229 more rows
(read_sf(result$OUTPUT))
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -9.117421 ymin: -54.45566 xmax: 33.64039 ymax: 80.76948
#> Geodetic CRS:  WGS 84
#> # A tibble: 1 × 7
#>   iso_a2 NAME   FIPS_10_ ISO_A3 WB_A2 WB_A3                                 geom
#>   <chr>  <chr>  <chr>    <chr>  <chr> <chr>                   <MULTIPOLYGON [°]>
#> 1 NO     Norway NO       NOR    NO    NOR   (((3.457286 -54.39007, 3.486664 -54…

Created on 2022-01-14 by the reprex package (v2.0.1)

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Linux Mint 20
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language nl_BE:nl
#>  collate  nl_BE.UTF-8
#>  ctype    nl_BE.UTF-8
#>  tz       Europe/Brussels
#>  date     2022-01-14
#>  pandoc   2.14.0.3 @ /usr/lib/rstudio/bin/pandoc/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version     date (UTC) lib source
#>  assertthat    0.2.1       2019-03-21 [1] CRAN (R 4.1.0)
#>  class         7.3-19      2021-05-03 [4] CRAN (R 4.1.0)
#>  classInt      0.4-3       2020-04-07 [1] CRAN (R 4.1.0)
#>  cli           3.1.0       2021-10-27 [1] CRAN (R 4.1.2)
#>  crayon        1.4.2       2021-10-29 [1] CRAN (R 4.1.2)
#>  curl          4.3.2       2021-06-23 [1] CRAN (R 4.1.1)
#>  DBI           1.1.1       2021-01-15 [1] CRAN (R 4.1.0)
#>  digest        0.6.29      2021-12-01 [1] CRAN (R 4.1.2)
#>  dplyr         1.0.7       2021-06-18 [1] CRAN (R 4.1.1)
#>  e1071         1.7-9       2021-09-16 [1] CRAN (R 4.1.1)
#>  ellipsis      0.3.2       2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate      0.14        2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi         0.5.0       2021-05-25 [1] CRAN (R 4.1.0)
#>  fastmap       1.1.0       2021-01-25 [1] CRAN (R 4.1.0)
#>  fs            1.5.2       2021-12-08 [1] CRAN (R 4.1.2)
#>  generics      0.1.1       2021-10-25 [1] CRAN (R 4.1.2)
#>  glue          1.5.1       2021-11-30 [1] CRAN (R 4.1.2)
#>  highr         0.9         2021-04-16 [1] CRAN (R 4.1.0)
#>  htmltools     0.5.2       2021-08-25 [1] CRAN (R 4.1.1)
#>  KernSmooth    2.23-20     2021-05-03 [4] CRAN (R 4.1.0)
#>  knitr         1.36        2021-09-29 [1] CRAN (R 4.1.1)
#>  lifecycle     1.0.1       2021-09-24 [1] CRAN (R 4.1.1)
#>  magrittr      2.0.1       2020-11-17 [1] CRAN (R 4.1.0)
#>  pillar        1.6.4       2021-10-18 [1] CRAN (R 4.1.2)
#>  pkgconfig     2.0.3       2019-09-22 [1] CRAN (R 4.1.0)
#>  processx      3.5.2       2021-04-30 [1] CRAN (R 4.1.0)
#>  proxy         0.4-26      2021-06-07 [1] CRAN (R 4.1.0)
#>  ps            1.6.0       2021-02-28 [1] CRAN (R 4.1.0)
#>  purrr         0.3.4       2020-04-17 [1] CRAN (R 4.1.0)
#>  qgisprocess * 0.0.0.9000  2022-01-14 [1] Github (paleolimbot/qgisprocess@8f421b4)
#>  R6            2.5.1       2021-08-19 [1] CRAN (R 4.1.1)
#>  rappdirs      0.3.3       2021-01-31 [1] CRAN (R 4.1.0)
#>  Rcpp          1.0.7       2021-07-07 [1] CRAN (R 4.1.1)
#>  reprex        2.0.1       2021-08-05 [1] CRAN (R 4.1.1)
#>  rlang         0.99.0.9009 2021-11-18 [1] local
#>  rmarkdown     2.11        2021-09-14 [1] CRAN (R 4.1.1)
#>  rstudioapi    0.13        2020-11-12 [1] CRAN (R 4.1.0)
#>  sessioninfo   1.2.2       2021-12-06 [1] CRAN (R 4.1.2)
#>  sf          * 1.0-4       2021-11-14 [1] CRAN (R 4.1.2)
#>  stringi       1.7.6       2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr       1.4.0       2019-02-10 [1] CRAN (R 4.1.0)
#>  tibble        3.1.6       2021-11-07 [1] CRAN (R 4.1.2)
#>  tidyselect    1.1.1       2021-04-30 [1] CRAN (R 4.1.0)
#>  units         0.7-2       2021-06-08 [1] CRAN (R 4.1.1)
#>  utf8          1.2.2       2021-07-24 [1] CRAN (R 4.1.1)
#>  vctrs         0.3.8       2021-04-29 [1] CRAN (R 4.1.0)
#>  withr         2.4.3       2021-11-30 [1] CRAN (R 4.1.2)
#>  xfun          0.29        2021-12-14 [1] CRAN (R 4.1.2)
#>  yaml          2.2.1       2020-02-01 [1] CRAN (R 4.1.0)
#> 
#>  [1] /home/floris/lib/R/library
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
```

Currently, st_as_sf() takes the first usable output (see the code linked below). Although an element named OUTPUT or output is not present for every algorithm, it seems sensible to prefer such an element, when present, over the first usable one.

https://github.com/paleolimbot/qgisprocess/blob/75d90b008a2f4d3aff6fd09e22a29f8996b381ce/R/compat-sf.R#L21-L31
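The preference suggested above could be sketched roughly as follows. This is only an illustration under assumptions, not the package's actual code: `pick_primary_output()` is a hypothetical helper, the result is mocked as a plain classed list so the snippet runs without QGIS, and only the `qgis_outputVector` class seen in the `str()` output above is treated as usable.

```r
# Hypothetical helper (not part of qgisprocess): choose the element of a
# result list to convert, preferring one named "OUTPUT" or "output".
pick_primary_output <- function(result, usable_classes = "qgis_outputVector") {
  result <- unclass(result)
  # Elements whose names start with "." hold metadata (.algorithm, .args, ...)
  candidates <- result[!startsWith(names(result), ".")]
  # Keep only elements that look like usable spatial outputs
  usable <- Filter(function(x) inherits(x, usable_classes), candidates)
  if (length(usable) == 0) {
    stop("No usable output found in result.")
  }
  # Prefer an element named OUTPUT or output, if there is one
  preferred <- intersect(c("OUTPUT", "output"), names(usable))
  if (length(preferred) > 0) {
    return(usable[[preferred[1]]])
  }
  # Otherwise fall back to the first usable output
  usable[[1]]
}

# Mocked result mirroring the structure shown by str(result, 1) above;
# the file paths are placeholders.
mock_result <- structure(
  list(
    FAIL_OUTPUT = structure("/tmp/fail.gpkg", class = "qgis_outputVector"),
    OUTPUT = structure("/tmp/out.gpkg", class = "qgis_outputVector"),
    .algorithm = "native:extractbyattribute"
  ),
  class = "qgis_result"
)

primary <- pick_primary_output(mock_result)
```

Because the selection is by name, it does not depend on the position of OUTPUT in the list. Only when no such element exists does it fall back to taking the first usable output.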

I could try to add that with a PR if you like this suggestion.

florisvdh commented 2 years ago

The reprex above used 8f421b4 (PR #74), where you mentioned that the order of the outputs does change with JSON output. However, the result is the same with current master (paleolimbot/qgisprocess@75d90b0) in this case. No JSON output was used here.

paleolimbot commented 2 years ago

As you noted, this is a problem with or without JSON output, so it is better scoped as its own PR. With JSON output we don't have reliable access to the ordering, so preferring the OUTPUT or output element would be a better heuristic. I'd love it if you'd like to open a PR!