rsbivand / rgrass

Interpreted interface between the GRASS geographical information system and R
https://rsbivand.github.io/rgrass/
26 stars 9 forks source link

input argument of execGRASS("v.in.ogr") drops parts after '&' (in web service URL) #30

Closed florisvdh closed 3 years ago

florisvdh commented 3 years ago

It seems that input is not passed in full by execGRASS("v.in.ogr") when it contains '&' (typical for a web service query URL):

library(rgrass7)
#> Loading required package: XML
#> GRASS GIS interface loaded with GRASS version: (GRASS not running)
initGRASS(gisBase = "/usr/lib/grass78", 
          mapset = "PERMANENT", 
          override = TRUE)
#> gisdbase    /tmp/Rtmp9mecAu 
#> location    file26d05e323c67 
#> mapset      PERMANENT 
#> rows        1 
#> columns     1 
#> north       1 
#> south       0 
#> west        0 
#> east        1 
#> nsres       1 
#> ewres       1 
#> projection:
#>  XY location (unprojected)
stringexecGRASS("g.proj -c --quiet epsg=31370")

# works:
system("v.in.ogr input='https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000' output=habbox -o")
execGRASS("g.list", type = "vector")
#> habbox

# fails:
url <- 'https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000'
execGRASS("v.in.ogr", "o", input = url, output = "habbox2")

#> ### Copied from standard error: #########
#> ERROR 1: HTTP error code : 400
#> ERROR: Unable to open data source
#>        <https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS>

Created on 2021-06-22 by the reprex package (v2.0.0)

Standard output and standard error ``` sh x Install the styler package in order to use `style = TRUE`. No projection information available for layer Over-riding projection check Check if OGR layer contains polygons... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36% 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% Creating attribute table for layer ... Default driver / database set to: driver: sqlite database: $GISDBASE/$LOCATION_NAME/$MAPSET/sqlite/sqlite.db Importing 323 features (OGR layer )... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36% 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% ----------------------------------------------------- Registering primitives... ----------------------------------------------------- Cleaning polygons ----------------------------------------------------- Breaking polygons... Breaking polygons (pass 1: select break points)... 1% 3% 5% 7% 9% 11% 13% 15% 17% 19% 21% 23% 25% 27% 29% 31% 33% 35% 37% 39% 41% 43% 45% 47% 49% 51% 53% 55% 57% 59% 61% 63% 65% 67% 69% 71% 73% 75% 77% 79% 81% 83% 85% 87% 89% 91% 93% 95% 97% 99% 100% Breaking polygons (pass 2: break at selected points)... 1% 3% 5% 7% 9% 11% 13% 15% 17% 19% 21% 23% 25% 27% 29% 31% 33% 35% 37% 39% 41% 43% 45% 47% 49% 51% 53% 55% 57% 59% 61% 63% 65% 67% 69% 71% 73% 75% 77% 79% 81% 83% 85% 87% 89% 91% 93% 95% 97% 99% 100% ----------------------------------------------------- Removing duplicates... 1% 3% 5% 7% 9% 11% 13% 15% 17% 19% 21% 23% 25% 27% 29% 31% 33% 35% 37% 39% 41% 43% 45% 47% 49% 51% 53% 55% 57% 59% 61% 63% 65% 67% 69% 71% 73% 75% 77% 79% 81% 83% 85% 87% 89% 91% 93% 95% 97% 99% 100% ----------------------------------------------------- Breaking boundaries... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36% 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% ----------------------------------------------------- Removing duplicates... 1% 3% 5% 7% 9% 11% 13% 15% 17% 19% 21% 23% 25% 27% 29% 31% 33% 35% 37% 39% 41% 43% 45% 47% 49% 51% 53% 55% 57% 59% 61% 63% 65% 67% 69% 71% 73% 75% 77% 79% 81% 83% 85% 87% 89% 91% 93% 95% 97% 99% 100% ----------------------------------------------------- Cleaning boundaries at nodes... 1% 3% 5% 7% 9% 11% 13% 15% 17% 19% 21% 23% 25% 27% 29% 31% 33% 35% 37% 39% 41% 43% 45% 47% 49% 51% 53% 55% 57% 59% 61% 63% 65% 67% 69% 71% 73% 75% 77% 79% 81% 83% 85% 87% 89% 91% 93% 95% 97% 99% 100% ----------------------------------------------------- Merging boundaries... 2% 5% 8% 11% 14% 17% 20% 23% 26% 29% 32% 35% 38% 41% 44% 47% 50% 53% 56% 59% 62% 65% 68% 71% 74% 77% 80% 83% 86% 89% 92% 95% 98% 100% ----------------------------------------------------- Removing dangles... 1% 3% 5% 7% 9% 11% 13% 15% 17% 19% 21% 23% 25% 27% 29% 31% 33% 35% 37% 39% 41% 43% 45% 47% 49% 51% 53% 55% 57% 59% 61% 63% 65% 67% 69% 71% 73% 75% 77% 79% 81% 83% 85% 87% 89% 91% 93% 95% 97% 99% 100% ----------------------------------------------------- Building areas... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36%----------------------------------------------------- Removing bridges... 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% ----------------------------------------------------- Registering primitives... Building areas... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36% 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% Attaching islands... 0% 3% 6% 9% 12% 16% 19% 22% 25% 29% 32% 35% 38% 41% 45% 48% 51% 54% 58% 61% 64% 67% 70% 74% 77% 80% 83% 87% 90% 93% 96% 100% ----------------------------------------------------- Finding centroids for OGR layer ... 0% 3% 6% 9% 12% 15% 18% 21% 24% 27% 30% 33% 36% 39% 42% 45% 48% 51% 54% 57% 60% 63% 66% 69% 72% 75% 78% 81% 84% 87% 90% 93% 96% 99% 100% ----------------------------------------------------- Writing centroids... 2% 5% 8% 11% 14% 17% 20% 23% 26% 29% 32% 35% 38% 41% 44% 47% 50% 53% 56% 59% 62% 65% 68% 71% 74% 77% 80% 83% 86% 89% 92% 95% 98% 100% ----------------------------------------------------- 323 input polygons Total area: 1.40301E+07 (372 areas) Area without category: 997415 (49 areas) ----------------------------------------------------- Copying features... 2% 5% 8% 11% 14% 17% 20% 23% 26% 29% 32% 35% 38% 41% 44% 47% 50% 53% 56% 59% 62% 65% 68% 71% 74% 77% 80% 83% 86% 89% 92% 95% 98% 100% Building topology for vector map ... Registering primitives... Building areas... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36% 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% Attaching islands... 0% 3% 6% 9% 12% 16% 19% 22% 25% 29% 32% 35% 38% 41% 45% 48% 51% 54% 58% 61% 64% 67% 70% 74% 77% 80% 83% 87% 90% 93% 96% 100% Attaching centroids... 0% 2% 4% 6% 8% 10% 12% 14% 16% 18% 20% 22% 24% 26% 28% 30% 32% 34% 36% 38% 40% 42% 44% 46% 48% 50% 52% 54% 56% 58% 60% 62% 64% 66% 68% 70% 72% 74% 76% 78% 80% 82% 84% 86% 88% 90% 92% 94% 96% 98% 100% ERROR 1: HTTP error code : 400 ERROR: Unable to open data source ```
Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.1.0 (2021-05-18) #> os Linux Mint 20 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2021-06-22 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> ! package * version date lib source #> P cli 2.5.0 2021-04-26 [?] CRAN (R 4.1.0) #> P digest 0.6.27 2020-10-24 [?] CRAN (R 4.1.0) #> P evaluate 0.14 2019-05-28 [?] CRAN (R 4.1.0) #> P fs 1.5.0 2020-07-31 [?] CRAN (R 4.1.0) #> P glue 1.4.2 2020-08-27 [?] CRAN (R 4.1.0) #> P highr 0.9 2021-04-16 [?] CRAN (R 4.1.0) #> P htmltools 0.5.1.1 2021-01-22 [?] CRAN (R 4.1.0) #> P knitr 1.33 2021-04-24 [?] CRAN (R 4.1.0) #> P magrittr 2.0.1 2020-11-17 [?] CRAN (R 4.1.0) #> P reprex 2.0.0 2021-04-02 [?] CRAN (R 4.1.0) #> P rgrass7 * 0.2-5 2021-01-29 [?] CRAN (R 4.1.0) #> P rlang 0.4.11 2021-04-30 [?] CRAN (R 4.1.0) #> P rmarkdown 2.8 2021-05-07 [?] CRAN (R 4.1.0) #> P rstudioapi 0.13 2020-11-12 [?] CRAN (R 4.1.0) #> P sessioninfo 1.1.1 2018-11-05 [?] CRAN (R 4.1.0) #> P stringi 1.6.2 2021-05-17 [?] CRAN (R 4.1.0) #> P stringr 1.4.0 2019-02-10 [?] CRAN (R 4.1.0) #> P withr 2.4.2 2021-04-18 [?] CRAN (R 4.1.0) #> P xfun 0.23 2021-05-15 [?] CRAN (R 4.1.0) #> P XML * 3.99-0.6 2021-03-16 [?] CRAN (R 4.1.0) #> P yaml 2.2.1 2020-02-01 [?] CRAN (R 4.1.0) #> #> [1] /media/floris/DATA/PROJECTS/09685_NatuurlijkMilieu/160 Bewerkingen en resultaat/Repos_en_data/n2khab-mne-design_withref_groundwater/110_design_groundwater/020_design_elaborate/renv/library/R-4.1/x86_64-pc-linux-gnu #> [2] /tmp/RtmpEJ7xpr/renv-system-library #> [3] /usr/lib/R/library #> #> P ── Loaded and on-disk path mismatch. ```
rsbivand commented 3 years ago

Nothing works for me, neither system() nor execGRASS(), also when url uses double quotes and I use shQuote(url). I installed GRASS 7.8.6RC1, I now only install GRASS to answer rgrass7 questions.

veroandreo commented 3 years ago

It works when tested in NC location directly in grass 7.8.5 and also with the system() call. Using execGRASS() yields the same error reported.

florisvdh commented 3 years ago

I use GRASS 7.8.5 too (current stable release, from ubuntugis-unstable PPA), maybe that's the reason.

florisvdh commented 3 years ago

Since v.in.ogr relies on GDAL, also the GDAL version might have an effect (3.2.1 here).

rsbivand commented 3 years ago

With reference to https://gdal.org/drivers/vector/wfs.html#vector-wfs, I tried:

url <- 'WFS:"https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&amp;request=GetFeature&amp;typename=BWK%3ABwkhab&amp;bbox=136000%2C192000%2C138000%2C194000"'
execGRASS("v.in.ogr", "o", input = url, output = "habbox2")
execGRASS("g.list", type = "vector")
# habbox2

with GRASS 7.8.6RC1 and GDAL 3.3.0 (adding WFS; and converting & to &amp;). So I think your input was malformed - the driver web page is fairly clear, I think. Please say if this works for you - I don't think your version should have worked at all.

veroandreo commented 3 years ago

Indeed it works with Roger's modifications. I'm using GRASS 7.8.5 and GDAL 3.1.4 (Fedora 33 box). I'd suggest to close this issue.

florisvdh commented 3 years ago

Thanks for pointing me in the right direction Roger (and thanks @veroandreo for following up!). Your solution runs, but the results are different than before (here at least) since the filters for the GetFeature operation are not applied anymore.

In fact, it appears that the URL you suggest (indeed formed similar to the GDAL specification, except for the inner ""):

url <- 'WFS:"https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&amp;request=GetFeature&amp;typename=BWK%3ABwkhab&amp;bbox=136000%2C192000%2C138000%2C194000"'

will result in the same dataset as when using:

url <- 'WFS:"https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs"'

or

url <- 'WFS:"https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS"'

I.e. all 3 layers of the WFS are returned without filtering except for the WFS automatic limit of 10^4 features per layer.

Inspired by your approach, the working solution (at least here, with GDAL 3.2.1) seems to be not converting & (might be a documentation issue at GDAL). I added both the execGRASS() and the system() approach since there is a difference regarding the inner "":

library(rgrass7)
#> Loading required package: XML
#> GRASS GIS interface loaded with GRASS version: (GRASS not running)
initGRASS(gisBase = "/usr/lib/grass78", 
          mapset = "PERMANENT", 
          override = TRUE)
#> gisdbase    /tmp/RtmpBhtIYY 
#> location    file23885b122ef 
#> mapset      PERMANENT 
#> rows        1 
#> columns     1 
#> north       1 
#> south       0 
#> west        0 
#> east        1 
#> nsres       1 
#> ewres       1 
#> projection:
#>  XY location (unprojected)
stringexecGRASS("g.proj -c --quiet epsg=31370")

system("v.in.ogr input='WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000' output=habbox -o")

url <- 'WFS:"https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000"'
sink(tempfile())
execGRASS("v.in.ogr", "o", input = url, output = "habbox2")
sink()

execGRASS("g.list", type = "vector")
#> habbox
#> habbox2
execGRASS("v.category", input = "habbox", option = "report")
#> Layer/table: 1/habbox
#> type       count        min        max
#> point          0          0          0
#> line           0          0          0
#> boundary       0          0          0
#> centroid     323          1        323
#> area           0          0          0
#> face           0          0          0
#> kernel         0          0          0
#> all          323          1        323
execGRASS("v.category", input = "habbox2", option = "report")
#> Layer/table: 1/habbox2
#> type       count        min        max
#> point          0          0          0
#> line           0          0          0
#> boundary       0          0          0
#> centroid     323          1        323
#> area           0          0          0
#> face           0          0          0
#> kernel         0          0          0
#> all          323          1        323

Created on 2021-06-23 by the reprex package (v2.0.0)

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.1.0 (2021-05-18) #> os Linux Mint 20 #> system x86_64, linux-gnu #> ui X11 #> language nl_BE:nl #> collate nl_BE.UTF-8 #> ctype nl_BE.UTF-8 #> tz Europe/Brussels #> date 2021-06-23 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> cli 2.5.0 2021-04-26 [1] CRAN (R 4.1.0) #> digest 0.6.27 2020-10-24 [1] CRAN (R 4.1.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0) #> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) #> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0) #> knitr 1.33 2021-04-24 [1] CRAN (R 4.1.0) #> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) #> reprex 2.0.0 2021-04-02 [1] CRAN (R 4.1.0) #> rgrass7 * 0.2-5 2021-01-29 [1] CRAN (R 4.1.0) #> rlang 0.4.11 2021-04-30 [1] CRAN (R 4.1.0) #> rmarkdown 2.8 2021-05-07 [1] CRAN (R 4.1.0) #> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) #> stringi 1.6.2 2021-05-17 [1] CRAN (R 4.1.0) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) #> withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) #> xfun 0.23 2021-05-15 [1] CRAN (R 4.1.0) #> XML * 3.99-0.6 2021-03-16 [1] CRAN (R 4.1.0) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) #> #> [1] /home/floris/lib/R/library #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library ```

So essentially it would need the WFS: prefix only (compared to the initial URL) - can you verify that this also works in GRASS 7.8.6RC1 and GDAL 3.3.0?

Further, I think ideally execGRASS() would accept the same input string as the GRASS command; using inner "" does not work for the latter while it's needed for the former. If you consider this relevant enough to look at further, then I think the issue title is best changed if it's kept open (or make a new issue). Otherwise, can be closed, at least if the latest solution also works for you.

rsbivand commented 3 years ago

I tried without amp; which worked and some other variants:

> library(rgrass7)
Loading required package: XML
GRASS GIS interface loaded with GRASS version: (GRASS not running)
> initGRASS(gisBase = "/home/rsb/topics/grass/g785/grass78", tempdir(), mapset="PERMANENT", override=TRUE)
gisdbase    /tmp/RtmptxmVQN 
location    file3722d1402ef15 
mapset      PERMANENT 
rows        1 
columns     1 
north       1 
south       0 
west        0 
east        1 
nsres       1 
ewres       1 
projection:
 XY location (unprojected) 
> url <- 'WFS:"https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000"'
> execGRASS("v.in.ogr", "o", input = url, output = "habbox2")
Over-riding projection check
Check if OGR layer <BWK:Bwkhab> contains polygons...
 100%
Creating attribute table for layer <BWK:Bwkhab>...
Default driver / database set to:
driver: sqlite
database: $GISDBASE/$LOCATION_NAME/$MAPSET/sqlite/sqlite.db
Importing 323 features (OGR layer <BWK:Bwkhab>)...
 100%
-----------------------------------------------------
Registering primitives...

-----------------------------------------------------
Cleaning polygons
-----------------------------------------------------
Breaking polygons...
Breaking polygons (pass 1: select break points)...
 100%
Breaking polygons (pass 2: break at selected points)...
 100%
-----------------------------------------------------
Removing duplicates...
 100%
-----------------------------------------------------
Breaking boundaries...
 100%
-----------------------------------------------------
Removing duplicates...
 100%
-----------------------------------------------------
Cleaning boundaries at nodes...
 100%
-----------------------------------------------------
Merging boundaries...
 100%
-----------------------------------------------------
Removing dangles...
 100%
-----------------------------------------------------
Building areas...
-----------------------------------------------------
Removing bridges...
 100%
-----------------------------------------------------
Registering primitives...

Building areas...
 100%
Attaching islands...
 100%
-----------------------------------------------------
Finding centroids for OGR layer <BWK:Bwkhab>...
 100%
-----------------------------------------------------
Writing centroids...
 100%
-----------------------------------------------------
323 input polygons
Total area: 1.40301E+07 (372 areas)
Area without category: 997415 (49 areas)
-----------------------------------------------------
Copying features...
 100%
Building topology for vector map <habbox2@PERMANENT>...
Registering primitives...

Building areas...
 100%
Attaching islands...
 100%
Attaching centroids...
 100%
> execGRASS("v.in.ogr", "o", input = url, output = "habbox3")
> url <- 'WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000'
> execGRASS("v.in.ogr", "o", input = url, output = "habbox3")
> Over-riding projection check
WARNING: Illegal vector map name <BWK:Bwkhab>. Character ':' not allowed.
ERROR: Illegal output name <BWK:Bwkhab>
> url <- shQuote("WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000")
> execGRASS("v.in.ogr", "o", input = url, output = "habbox3")
Over-riding projection check
Check if OGR layer <BWK:Bwkhab> contains polygons...
 100%
Creating attribute table for layer <BWK:Bwkhab>...
Importing 323 features (OGR layer <BWK:Bwkhab>)...
 100%
-----------------------------------------------------
Registering primitives...

-----------------------------------------------------
Cleaning polygons
-----------------------------------------------------
Breaking polygons...
Breaking polygons (pass 1: select break points)...
 100%
Breaking polygons (pass 2: break at selected points)...
 100%
-----------------------------------------------------
Removing duplicates...
 100%
-----------------------------------------------------
Breaking boundaries...
 100%
-----------------------------------------------------
Removing duplicates...
 100%
-----------------------------------------------------
Cleaning boundaries at nodes...
 100%
-----------------------------------------------------
Merging boundaries...
 100%
-----------------------------------------------------
Removing dangles...
 100%
-----------------------------------------------------
Building areas...
-----------------------------------------------------
Removing bridges...
 100%
-----------------------------------------------------
Registering primitives...

Building areas...
 100%
Attaching islands...
 100%
-----------------------------------------------------
Finding centroids for OGR layer <BWK:Bwkhab>...
 100%
-----------------------------------------------------
Writing centroids...
 100%
-----------------------------------------------------
323 input polygons
Total area: 1.40301E+07 (372 areas)
Area without category: 997415 (49 areas)
-----------------------------------------------------
Copying features...
 100%
Building topology for vector map <habbox3@PERMANENT>...
Registering primitives...

Building areas...
 100%
Attaching islands...
 100%
Attaching centroids...
 100%
> execGRASS("v.category", input = "habbox2", option = "report")
Layer/table: 1/habbox2
type       count        min        max
point          0          0          0
line           0          0          0
boundary       0          0          0
centroid     323          1        323
area           0          0          0
face           0          0          0
kernel         0          0          0
all          323          1        323
> execGRASS("v.category", input = "habbox3", option = "report")
Layer/table: 1/habbox3
type       count        min        max
point          0          0          0
line           0          0          0
boundary       0          0          0
centroid     323          1        323
area           0          0          0
face           0          0          0
kernel         0          0          0
all          323          1        323

Perhaps the shQuote() one is interesting?

rsbivand commented 3 years ago

The sf variant is:

> url <- 'WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000'
> habbox_sf <- st_read(url)
Reading layer `BWK:Bwkhab' from data source 
  `WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000' 
  using driver `WFS'
Simple feature collection with 323 features and 32 fields
Geometry type: CURVEPOLYGON
Dimension:     XY
Bounding box:  xmin: 132779.6 ymin: 190507.6 xmax: 139693.7 ymax: 197998
Projected CRS: Belge 1972 / Belgian Lambert 72

but CURVEPOLYGON is worrying.

rsbivand commented 3 years ago

The sf variant is:

> url <- 'WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000'
> habbox_sf <- st_read(url)
Reading layer `BWK:Bwkhab' from data source 
  `WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000' 
  using driver `WFS'
Simple feature collection with 323 features and 32 fields
Geometry type: CURVEPOLYGON
Dimension:     XY
Bounding box:  xmin: 132779.6 ymin: 190507.6 xmax: 139693.7 ymax: 197998
Projected CRS: Belge 1972 / Belgian Lambert 72

but CURVEPOLYGON is worrying, work-around:

> st_write(habbox_sf, "habbox_sf.gpkg")
Writing layer `habbox_sf' to data source `habbox_sf.gpkg' using driver `GPKG'
Writing 323 features with 32 fields and geometry type Curve Polygon.
> ?gdal_utils
> gdal_utils('vectortranslate', "habbox_sf.gpkg", "habbox_sf1.gpkg", options=c("-nlt", "CONVERT_TO_LINEAR"))
> habbox_sf1 <- st_read("habbox_sf1.gpkg")
Reading layer `habbox_sf' from data source 
  `/home/rsb/tmp/bigshape/habbox_sf1.gpkg' using driver `GPKG'
Simple feature collection with 323 features and 32 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 132779.6 ymin: 190507.6 xmax: 139693.7 ymax: 197998
Projected CRS: Belge 1972 / Belgian Lambert 72
florisvdh commented 3 years ago

Thanks for investigating Roger.

Perhaps the shQuote() one is interesting?

Agreed! It is the below object. The string, now containing leading and trailing ' and (in R) surrounded by " is exactly what one can pass to the GRASS command, as shown before. Also tested switching the roles of " and ' (manually) - works as well.

> url <- shQuote("WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000")
> url
[1] "'WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000'"

Basically this means that an extra pair of quotes is always needed in execGRASS() regardless of quotes in the GRASS string, but that is consistent.

Issue solved, learned new ways - thanks a lot.

The sf variant is

Sure :+1: - it was just code for demonstration purposes in GRASS (and where I struggled with the string). IMO it's not typically something where one would use GRASS rather than R.

but CURVEPOLYGON is worrying.

We have more often seen these 'exotic' feature types in WFS datasets. See https://github.com/r-spatial/sf/issues/1573 for an alternative sf approach. But that is unsuccessful here since polygons with holes are split as separate polygons (hence 352 polygons instead of 323):

library(sf)
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
url <- "WFS:https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000"
habbox_sf <- read_sf(url)
st_cast(habbox_sf, "GEOMETRYCOLLECTION") %>%
    st_collection_extract("LINESTRING") %>%
    st_cast("POLYGON") 
#> Simple feature collection with 352 features and 32 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 132779.6 ymin: 190507.6 xmax: 139693.7 ymax: 197998
#> Projected CRS: Belge 1972 / Belgian Lambert 72
#> # A tibble: 352 x 33
#>    gml_id     UIDN   OIDN TAG    EVAL  EENH1 EENH2 EENH3 EENH4 EENH5 EENH6 EENH7
#>    <chr>     <dbl>  <dbl> <chr>  <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#>  1 Bwkhab.… 1.17e6 611814 09084… w     lhb   lhi   <NA>  <NA>  <NA>  <NA>  <NA> 
#>  2 Bwkhab.… 1.34e6 780822 18365… w     lh    <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  3 Bwkhab.… 1.34e6 780822 18365… w     lh    <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  4 Bwkhab.… 1.34e6 780822 18365… w     lh    <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  5 Bwkhab.… 6.88e5 665501 47965… z     aev   <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  6 Bwkhab.… 7.10e5 686987 50913… z     hfe   <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  7 Bwkhab.… 1.22e6 625442 11038… w     lhb   <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  8 Bwkhab.… 1.22e6 624858 11020… w     lhb   <NA>  <NA>  <NA>  <NA>  <NA>  <NA> 
#>  9 Bwkhab.… 1.35e6 792787 18241… z     ds    ds-   <NA>  <NA>  <NA>  <NA>  <NA> 
#> 10 Bwkhab.… 1.33e6 769964 18360… wz    sz    gml   mr    <NA>  <NA>  <NA>  <NA> 
#> # … with 342 more rows, and 21 more variables: EENH8 <chr>, V1 <chr>, V2 <chr>,
#> #   V3 <chr>, HERK <chr>, INFO <chr>, BWKLABEL <chr>, HAB1 <chr>, PHAB1 <int>,
#> #   HAB2 <chr>, PHAB2 <int>, HAB3 <chr>, PHAB3 <int>, HAB4 <chr>, PHAB4 <int>,
#> #   HAB5 <chr>, PHAB5 <int>, HERKHAB <chr>, HERKPHAB <chr>, HABLEGENDE <chr>,
#> #   SHAPE <POLYGON [m]>

Created on 2021-06-23 by the reprex package (v2.0.0)

So it seems your alternative with gdal_utils("vectortranslate") (ogr2ogr) would work in general. I'll add your suggestion to the sf issue.

florisvdh commented 3 years ago

Since GDAL also allows to not use the WFS: prefix:

The minimal syntax to open a WFS datasource is : WFS:http://path/to/WFS/service or http://path/to/WFS/service?SERVICE=WFS

... it seems that the plain 'https://...' does work as well - all in all the initial URL for execGRASS() just needed the extra pair of quotation marks around 'https://...':

> url <- "'https://geoservices.informatievlaanderen.be/overdrachtdiensten/BWK/wfs?service=WFS&request=GetFeature&typename=BWK%3ABwkhab&bbox=136000%2C192000%2C138000%2C194000'"
> sink(tempfile())
> execGRASS("v.in.ogr", "o", input = url, output = "habbox3")
> sink()
> execGRASS("g.list", type = "vector")
habbox3
> execGRASS("v.category", input = "habbox3", option = "report")
Layer/table: 1/habbox3
type       count        min        max
point          0          0          0
line           0          0          0
boundary       0          0          0
centroid     323          1        323
area           0          0          0
face           0          0          0
kernel         0          0          0
all          323          1        323

I wish to thank my colleague @hansvancalster for his remark that sf::read_sf() still works without WFS:, which has led to this.

I think this finding finally marks the initial problem.