kadyb / rgugik

Download datasets from Polish Head Office of Geodesy and Cartography
https://kadyb.github.io/rgugik/

New datasets #5

Open kadyb opened 4 years ago

kadyb commented 4 years ago

Another source: Warsaw (parcels and buildings geometry)

GreKro commented 4 years ago

Hi !

Do you plan to add the possibility of downloading LiDAR data ?

By the way, you have done nice work with this package!

Best regards

kadyb commented 4 years ago

Hi, thanks for your interest in this package! We would like to include all datasets provided by GUGiK. However, at this moment, not all datasets are covered by simple sharing services that we can use. I think it will be possible in the future, as with the orthophotomaps. Best regards!

kadyb commented 4 years ago

@GreKro I added the possibility of downloading all DEM products, including point clouds. Here is an example:

remotes::install_github("kadyb/rgugik")

library("sf")
library("lidR")
library("rgugik")

# load your polygon with area of interest
polygon_path = system.file("datasets/search_area.gpkg", package = "rgugik")
polygon = read_sf(polygon_path)

# return df with available datasets
req_df = DEM_request(polygon)

# select LAS files
req_df = req_df[req_df$format == "LAS", ]

# download only first to working directory
tile_download(req_df[1, ])

# load LAS file
las = readLAS("4675_320841_N-33-130-D-b-2-3-3-1.las")
las
#> class        : LAS (v1.2 format 3)
#> memory       : 550.6 Mb 
#> extent       : 359925.3, 360472.2, 512836.6, 513430.8 (xmin, xmax, ymin, ymax)
#> coord. ref.  : NA 
#> area         : 307470.9 units²
#> points       : 6.28 million points
#> density      : 20.41 points/units²

kadyb commented 4 years ago

We have now implemented all shared datasets except Ewidencja Gruntów i Budynków (EGiB, the Land and Building Register), for which there is no proper sharing infrastructure yet. Maybe this will change in the future.

Additionally, we could implement the geometry acquisition function (get from API) for selected voivodeships, counties and communes. This would be an alternative to downloading the entire Państwowy Rejestr Granic (375MB).

We could also implement reverse geocoding for coordinates. Enter the coordinates, and as a result you get the geometry of the administrative unit (voivodeships, counties, communes) and/or its name and TERYT.

@Nowosad what do you think?

Nowosad commented 4 years ago

> Additionally, we could implement the geometry acquisition function (get from API) for selected voivodeships, counties and communes. This would be an alternative to downloading the entire Państwowy Rejestr Granic (375MB).

That would be great! Reverse geocoding could also be nice, but not as useful as the first thing.

kadyb commented 4 years ago

Ewidencja Gruntów i Budynków is provided as a WFS service (not all counties are available). How can we deal with it in R:

kadyb commented 4 years ago

GUGiK added the ability to download the topographic databases for voivodeships and the entire country (not only for individual counties as now).

http://www.gugik.gov.pl/aktualnosci/24.11.2020-ulatwienie-w-pobieraniu-danych-bdot10k

kadyb commented 3 years ago

Some datasets can now be downloaded using WFS and WCS services (the spatial coverage can be specified).
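
As a rough illustration of what "specifying the spatial coverage" can look like for WFS, here is a minimal sketch that builds a standard WFS 2.0.0 GetFeature request limited to a bounding box. The helper name `wfs_bbox_url`, the base URL and the layer name are hypothetical, and the default CRS (EPSG:2180) is an assumption. The axis order and the CRS notation expected in `bbox` vary between servers, so check the service's capabilities first.

```r
# Hypothetical helper (not part of rgugik): build a WFS 2.0.0 GetFeature URL
# restricted to a bounding box. Assumes base_url has no query string yet.
wfs_bbox_url <- function(base_url, type_name, bbox, crs = "EPSG:2180") {
  stopifnot(is.numeric(bbox), length(bbox) == 4L)
  # format each number separately so large values are not written as 5e+05
  box <- vapply(bbox, format, character(1), scientific = FALSE, trim = TRUE)
  query <- c(
    service = "WFS",
    version = "2.0.0",
    request = "GetFeature",
    typeNames = type_name,
    bbox = paste(c(box, crs), collapse = ",")
  )
  paste0(base_url, "?", paste(names(query), query, sep = "=", collapse = "&"))
}

# placeholder service and layer, for illustration only
demo_url <- wfs_bbox_url("https://example.org/wfs", "demo:layer",
                         c(359925, 512836, 360473, 513431))
```

The resulting URL could then be passed to a downloader or read with sf, but whether a given server accepts it depends on the service.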

gsapijaszko commented 1 week ago

Old thread, but let me jump in with a proposal. I recently got interested in EGiB (to compare OSM data against it) and wrote a few functions to download EGiB data. My approach is to scrape the data register and grepl for EGiB. It's almost complete (except for Łomża and łomżyński); details below.

url <- "https://integracja.gugik.gov.pl/eziudp/index.php?showall"

t <- rvest::read_html(x = url) |>
  rvest::html_table() |>
  purrr::pluck(1)

tt <- t |>
  subset(grepl("Ewidencja gruntów", `Nazwa zbioru danych`), select = c(3, 4, 7))
names(tt) <- c("ewidencja", "TERYT", "url")

rgugik::county_names |>
  dplyr::left_join(tt, by = "TERYT") |>
  subset(is.na(url))
#>          NAME TERYT LOD2 ewidencja  url
#> 158     Łomża  2062 TRUE      <NA> <NA>
#> 159 łomżyński  2007 TRUE      <NA> <NA>

tt |>
  subset(nchar(TERYT) != 4)
#> # A tibble: 6 × 3
#>   ewidencja                           TERYT      url                            
#>   <chr>                               <chr>      <chr>                          
#> 1 Ewidencja gruntów i budynków (EGIB) 121304     https://kety.geoportal2.pl/map…
#> 2 Ewidencja gruntów i budynków (EGIB) 2007, 2062 https://mapy.geoportal.gov.pl/…
#> 3 Ewidencja gruntów i budynków (EGIB) 241602     https://wms.zawiercie.eu/zawie…
#> 4 Ewidencja gruntów i budynków (EGIB) 121207_3   https://wolbrom.webewid.pl:444…
#> 5 Ewidencja gruntów i budynków (EGIB) 240301_1   https://miastocieszyn.geoporta…
#> 6 Ewidencja gruntów i budynków (EGIB) 240204     https://czechowice.geoportal2.…
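
The rows above hold either commune-level codes (six or more characters) or several county codes in one field ("2007, 2062"), which is presumably why Łomża and łomżyński do not match in the join. A minimal sketch of one way to normalise such codes follows. It assumes that the first four characters of a TERYT code identify the county, and `tt_demo`, `expand_teryt` and the example.org URLs are placeholders, not the real register data.

```r
# Toy stand-in for `tt`; the URLs are placeholders, not the real services.
tt_demo <- data.frame(
  TERYT = c("121304", "2007, 2062", "121207_3"),
  url = c("https://example.org/a", "https://example.org/b", "https://example.org/c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per TERYT code, plus a 4-character county code.
expand_teryt <- function(df) {
  codes <- strsplit(df$TERYT, ",\\s*")   # "2007, 2062" becomes two codes
  out <- df[rep(seq_len(nrow(df)), lengths(codes)), , drop = FALSE]
  out$TERYT <- unlist(codes)
  out$county <- substr(out$TERYT, 1, 4)  # voivodeship (2 digits) + county (2 digits)
  rownames(out) <- NULL
  out
}

tt_expanded <- expand_teryt(tt_demo)
```

Joining on `county` instead of `TERYT` would then match both Łomża entries. Whether a commune-level service should stand in for its whole county is a separate decision.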

To download the data I first used the {ows4R} package, but ran into issues with some of the services. In the end I used sf::st_read(), like this:

sf::st_read("WFS:https://swidnicki-wms.webewid.pl/iip/ows", layer = "ms:dzialki", 
            options = "CONSIDER_EPSG_AS_URN=YES")

"CONSIDER_EPSG_AS_URN=YES" is required for those WFS services that return data in EPSG:217x, for which GDAL otherwise swaps the coordinates.

I will pack it into one tidy function and create a PR, if you don't mind.
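
A minimal sketch of how the call above could be wrapped follows. The helper names `wfs_dsn`, `list_wfs_layers` and `read_wfs_layer` are hypothetical (this is not the function from the eventual PR); only `sf::st_layers()` and `sf::st_read()` are existing sf functions. The last two helpers need network access when called.

```r
# Hypothetical helper: GDAL expects the "WFS:" prefix in front of the service URL.
wfs_dsn <- function(url) {
  stopifnot(is.character(url), length(url) == 1L, nzchar(url))
  paste0("WFS:", url)
}

# Hypothetical helper: list the layers a service offers, e.g. to find "ms:dzialki".
list_wfs_layers <- function(url) {
  sf::st_layers(wfs_dsn(url))
}

# Hypothetical wrapper around sf::st_read(); `urn_axis_fix` toggles the
# open option used above for services where the coordinates come back swapped.
read_wfs_layer <- function(url, layer, urn_axis_fix = TRUE) {
  opts <- if (urn_axis_fix) "CONSIDER_EPSG_AS_URN=YES" else NULL
  sf::st_read(wfs_dsn(url), layer = layer, options = opts, quiet = TRUE)
}

# Usage (not run here, requires network access):
# parcels <- read_wfs_layer("https://swidnicki-wms.webewid.pl/iip/ows", "ms:dzialki")
```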

Created on 2024-11-10 with reprex v2.1.1.9000

kadyb commented 1 week ago

@gsapijaszko, good idea, but won't it require adding rvest as a dependency? Ideally we could use regex instead, but that would probably be complicated.

gsapijaszko commented 1 week ago

> but it will require adding rvest as dependency? Ideally we could use regex, but it will probably be complicated?

The other option would be to prepare a dataset (TERYT + url) on the side and ship it with the package. Pro: fewer dependencies. Con: it has to be maintained if/when the links change.

kadyb commented 1 week ago

It might even be a better way. The links probably don't change too often.
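
To make the shipped-dataset idea concrete, here is a minimal sketch of a lookup against a TERYT + URL table. The table contents, the object name `egib_urls_demo` and the helper `egib_url_for` are all placeholders made up for illustration. The real dataset would hold one row per county and would be prepared outside the package.

```r
# Placeholder lookup table; the codes and URLs are invented for the example.
egib_urls_demo <- data.frame(
  TERYT = c("0001", "0002"),
  URL = c("https://example.org/wfs-a", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the service URL for a county code or fail loudly.
egib_url_for <- function(teryt, lookup = egib_urls_demo) {
  idx <- match(teryt, lookup$TERYT)
  if (is.na(idx)) stop("unknown TERYT code: ", teryt)
  url <- lookup$URL[idx]
  if (is.na(url)) stop("no EGiB service URL recorded for ", teryt)
  url
}

demo_service <- egib_url_for("0001")
```

With this design no scraping happens at run time, so rvest would only be needed in the script that regenerates the table.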

gsapijaszko commented 1 week ago

@kadyb It's ready for review and PR: https://github.com/gsapijaszko/rgugik/commit/7846506a7f2104216d44b26c1cd552655a3890b9

Currently there is no access to the servers/layers below (the server doesn't respond, the XML is missing, or the URL is not available):

rgugik::egib_layers |>
  subset(is.na(LAYERS))
#>                NAME TERYT
#> 97  łódzki wschodni  1006
#> 190      strzelecki  1611
#> 223       łomżyński  2007
#> 232           Łomża  2062
#> 235       chojnicki  2202
#> 250          Gdańsk  2261
#> 264      raciborski  2411
#> 265        rybnicki  2412
#> 288          Zabrze  2478
#> 375      świdwiński  3216
#>                                                            URL LAYERS
#> 97  https://lodzkiwschodni.geoportal2.pl/map/geoportal/wfs.php   <NA>
#> 190                       https://mapy.powiatstrzelecki.pl/ggp   <NA>
#> 223                                                       <NA>   <NA>
#> 232                                                       <NA>   <NA>
#> 235                   https://chojnicki-wms.webewid.pl/iip/ows   <NA>
#> 250                     https://ewid-wms.gdansk.gda.pl/iip/ows   <NA>
#> 264       https://raciborz.geoportal2.pl/map/geoportal/wfs.php   <NA>
#> 265         https://rybnik.geoportal2.pl/map/geoportal/wfs.php   <NA>
#> 288                        https://wms.miastozabrze.pl/iip/ows   <NA>
#> 375                  https://swidwinski-wms.webewid.pl/iip/ows   <NA>
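
For re-checking such services later, a minimal sketch of a reachability probe is shown below. The helper `wfs_responds` is hypothetical and is not how the table above was produced. It treats any error from the layer listing as "not available", and the `reader` argument exists only so the logic can be tried without network access.

```r
# Hypothetical probe: TRUE if `reader` can list layers for the WFS URL.
# `reader` defaults to sf::st_layers(), which needs network access.
wfs_responds <- function(url, reader = sf::st_layers) {
  if (is.na(url)) return(FALSE)  # no URL recorded at all
  tryCatch({
    reader(paste0("WFS:", url))
    TRUE
  }, error = function(e) FALSE)
}

# Offline illustration with stand-in readers instead of a real server:
ok <- wfs_responds("https://example.org/wfs", reader = function(dsn) list())
down <- wfs_responds("https://example.org/wfs", reader = function(dsn) stop("no response"))
```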

Created on 2024-11-13 with reprex v2.1.1

G.

kadyb commented 1 week ago

Could you open a pull request against this repository? Then I can comment on the code directly.