paul-carteron / happign

Happign allows you to use the APIs provided by the IGN (France) to download their public data.
https://paul-carteron.github.io/happign/
GNU General Public License v3.0

Downloading WFS data for whole of France #9

Closed by lionel68 1 year ago

lionel68 commented 1 year ago

Hello,

I am starting to work with happign, and I am usually interested in getting spatial data across the whole French territory. I got the French boundaries from gadm.org, and when I run:

library(sf)
library(happign)

fr <- st_read("LIF/IFN_stuff/data/geodata/gadm41_FRA.gpkg")
get_wfs(shape = fr,
        apikey = "administratif",
        layer_name = "ADMINEXPRESS-COG-CARTO.LATEST:region")

Error in `resp_abort()`:
! HTTP 502 Bad Gateway.

Here is the bbox of the fr object (in crs 4326):

st_bbox(fr)
     xmin      ymin      xmax      ymax 
-5.143751 41.333752  9.560416 51.089397 

What is the issue?

Thanks in advance, Lionel

paul-carteron commented 1 year ago

Hi Lionel,

Your request is fine.

I get the same error, or other 5XX errors, which means the problem is on the server side, i.e. the IGN server. I can't do much for the moment, but I'll get back to you when the IGN servers are available again.
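Since 5XX errors come from the server and are often temporary, one way to cope is to retry the call after a short pause. The following is a minimal sketch in base R, not something happign provides: `with_retry()` and its arguments are hypothetical names introduced here, and the helper retries on any error, not only on HTTP 5XX responses.

```r
# Hypothetical helper (not part of happign): call `fn` up to `max_tries` times,
# pausing `wait_seconds` between attempts, and re-raise the last error if
# every attempt fails.
with_retry <- function(fn, max_tries = 3, wait_seconds = 5) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    # Capture the error as a value instead of letting it abort the loop
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  stop(last_error)
}
```

Assuming `fr` is the sf object from the question, the request could then be wrapped as `with_retry(function() get_wfs(shape = fr, apikey = "administratif", layer_name = "ADMINEXPRESS-COG-CARTO.LATEST:region"))`.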

Having never used get_wfs() on the whole of France, I am curious to see whether happign can handle it.

Have a nice day

paul-carteron commented 1 year ago

I just tried again with the code below and everything is working:

library(sf)
library(happign)

bbox <- st_bbox(c(xmin = -5.143751, xmax = 9.560416, ymax = 51.089397, ymin = 41.333752),
                crs = st_crs(4326)) |> 
  st_as_sfc() |> 
  st_sf()

res <- get_wfs(shape = bbox,
               apikey = "administratif",
               layer_name = "ADMINEXPRESS-COG.LATEST:region")

Can you confirm that everything is functional on your side?

I would also like to draw your attention to the multiplicity of data sources to obtain the region contours:

If you get the chance to compare these products, I would be interested in the results.

lionel68 commented 1 year ago

Hi @paul-carteron,

Thanks for your swift answer, and sorry for the delay; I was away for the last few days.

I can confirm that I can obtain the region contours from 'ADMINEXPRESS-COG-CARTO.LATEST:region' and 'ADMINEXPRESS-COG.LATEST:region', the second one being much faster to obtain than the first. When trying 'ADMINISTRATIVE_LIMITS_EXPRESS.LATEST:region' I get:

res3 <- get_wfs(shape = bbox,
                apikey = "administratif",
                layer_name = "ADMINISTRATIVE_LIMITS_EXPRESS.LATEST:region")

Error in `resp_abort()`:
! HTTP 403 Forbidden.

lionel68 commented 1 year ago

Might it be worth adding an example along these lines (using a bounding box) to the help page of the get_wfs function? Perhaps also a warning for users about potential limits when using a large bounding box on layers with many geometries (forests ...)?

paul-carteron commented 1 year ago

Hi @lionel68, thanks for testing it again.

About layer_name: for the third resource, the layer_name is now "LIMITES_ADMINISTRATIVES_EXPRESS.LATEST:region". It looks like the name has changed since my last message. This may explain why there were slowdowns. To be sure you have a valid layer_name, you can use:

layers_name <- get_layers_metadata(apikey, "wfs")

get_layers_metadata connects directly to the WFS or WMS service to retrieve the names, so the result is always up to date.
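Once the available names are known, a layer name can be checked before sending a request. The sketch below is base R only and rests on an assumption: that the names returned by get_layers_metadata have been extracted into a plain character vector (the structure of that result is not shown in this thread). `check_layer_name()` is a hypothetical helper, and the `available` vector simply lists the three names mentioned in this discussion.

```r
# Hypothetical helper: return `layer_name` if it is available, otherwise stop
# with a message listing the available layers that share the same suffix
# (the part after the colon, e.g. "region").
check_layer_name <- function(layer_name, available) {
  if (layer_name %in% available) {
    return(layer_name)
  }
  suffix <- sub("^.*:", "", layer_name)
  candidates <- available[sub("^.*:", "", available) == suffix]
  stop("Unknown layer '", layer_name, "'. Layers ending in ':", suffix, "': ",
       paste(candidates, collapse = ", "), call. = FALSE)
}

# Assumption: names taken from this thread, standing in for the real metadata
available <- c("ADMINEXPRESS-COG-CARTO.LATEST:region",
               "ADMINEXPRESS-COG.LATEST:region",
               "LIMITES_ADMINISTRATIVES_EXPRESS.LATEST:region")

check_layer_name("ADMINEXPRESS-COG.LATEST:region", available)
```

With the outdated name "ADMINISTRATIVE_LIMITS_EXPRESS.LATEST:region", the helper stops before any request is sent and lists the three ":region" layers as candidates.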

Concerning the help page: passing an object of class bbox to get_wfs is not a problem in practice, because the function returns an error explaining that shape must be of class sf or sfc. The real question is how to convert a bbox to sf or sfc, which is a task for the sf package, so in my opinion it should not be added to the help page. For information, the function returns the data that intersect the bbox of the shape.

I will add a warning about large shapes to the help page, as you suggest. I know that downloading can take a long time for big shapes, but up to 1 hour of download is allowed. Have you ever gotten stuck?
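For layers with many geometries, one possible workaround is to split a large bounding box into smaller tiles and request them one at a time. This is only a sketch under stated assumptions, not a happign feature: `split_bbox()` is a hypothetical base R helper, and the coordinates are the bbox of France given earlier in this thread.

```r
# Hypothetical helper: split a bounding box into an nx-by-ny grid of tiles,
# returned as a data frame with one row per tile.
split_bbox <- function(xmin, ymin, xmax, ymax, nx = 2, ny = 2) {
  x_breaks <- seq(xmin, xmax, length.out = nx + 1)
  y_breaks <- seq(ymin, ymax, length.out = ny + 1)
  # Every combination of column index i and row index j
  grid <- expand.grid(i = seq_len(nx), j = seq_len(ny))
  data.frame(
    xmin = x_breaks[grid$i],
    ymin = y_breaks[grid$j],
    xmax = x_breaks[grid$i + 1],
    ymax = y_breaks[grid$j + 1]
  )
}

tiles <- split_bbox(-5.143751, 41.333752, 9.560416, 51.089397, nx = 2, ny = 2)
print(tiles)
```

Each row can then be converted with the st_bbox() |> st_as_sfc() |> st_sf() pipeline shown earlier and passed to get_wfs. Because the function returns the data intersecting the bbox, a feature that straddles several tiles would probably come back more than once, so the combined result would need deduplicating.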

About the download time: I ran "ADMINEXPRESS-COG-CARTO.LATEST:region" (1) and "ADMINEXPRESS-COG.LATEST:region" (2) 10 times each; here are the results:

          min       lq     mean   median       uq      max neval
 (1) 11.30334 12.15797 15.59620 16.90956 17.69519 18.07010    10
 (2) 56.78452 70.63304 73.55518 75.34150 78.90925 82.29761    10

For me, the second resource takes much longer to download on average. In reality, the time depends heavily on the connection, so it is better to refer to the size of the downloaded shapefile. I found that the second one is 32467 KB against 5989 KB for the first, which is consistent with my benchmark.
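The call used to produce this benchmark is not shown in the thread. A comparable measurement could be sketched in base R as follows; `time_runs()` is a hypothetical helper, and the `Sys.sleep()` call is only a stand-in for a real download.

```r
# Hypothetical helper: run `fn` several times and return the elapsed
# wall-clock time of each run, in seconds.
time_runs <- function(fn, times = 10) {
  vapply(seq_len(times), function(i) {
    unname(system.time(fn())["elapsed"])
  }, numeric(1))
}

# Placeholder workload; replace with a function that calls get_wfs()
timings <- time_runs(function() Sys.sleep(0.01), times = 3)
summary(timings)
```

To time a real request, the placeholder could be replaced with `function() get_wfs(shape = bbox, apikey = "administratif", layer_name = "ADMINEXPRESS-COG.LATEST:region")`. Since the timings depend on the connection, comparing the size of the results remains the more stable indicator.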

lionel68 commented 1 year ago

Thanks, the issue is solved; I will close it.