kadyb / rgugik

Download datasets from Polish Head Office of Geodesy and Cartography
https://kadyb.github.io/rgugik/

Outdir #18

Closed Nowosad closed 4 years ago

Nowosad commented 4 years ago

Adds outdir and unzip arguments (where valid)

Nowosad commented 4 years ago

@kadyb please try it and merge when everything is fine

kadyb commented 4 years ago

Thanks, I tested it with this script:

remotes::install_github("kadyb/rgugik@outdir")

library("rgugik")

# geodb_download
geodb_download(c("16", "18"), outdir = "test", unzip = FALSE)
geodb_download("16", outdir = "C:/Users/Krzysztof/Desktop/test2", unzip = TRUE)

# geonames_download
geonames_download(type = "object", format = c("GML", "XLSX"),
                  outdir = "test", unzip = FALSE)
geonames_download(type = "place", format = c("GML", "SHP"),
                  outdir = "C:/Users/Krzysztof/Desktop/test2", unzip = TRUE)

# models3D_download
models3D_download(TERYT = c("2462", "0401"), LOD = "LOD1", outdir = "test", unzip = FALSE)
models3D_download(TERYT = c("2462", "0401"), LOD = "LOD1",
                  outdir = "C:/Users/Krzysztof/Desktop/test2", unzip = TRUE)
# both are unzipped into one folder ("Modele_3D"), although the
# archives have different names ("0401_gml" and "2462_gml")

# pointDTM100_download
pointDTM100_download(c("02", "16"), outdir = "test", unzip = FALSE)
pointDTM100_download(c("opolskie", "wielkopolskie"),
                     outdir = "C:/Users/Krzysztof/Desktop/test2", unzip = TRUE)

# tile_download (orto_request)
library(sf)
polygon_path = system.file("datasets/search_area.gpkg", package = "rgugik")
polygon = read_sf(polygon_path)
req_df = orto_request(polygon)

tile_download(req_df[1:2, ], check_SHA = TRUE)
tile_download(req_df[1:2, ], outdir = "test", check_SHA = TRUE)
tile_download(req_df[1:2, ], outdir = "C:/Users/Krzysztof/Desktop/test2",
              check_SHA = TRUE)

and it looks like it works fine.
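The script above mixes a relative outdir ("test") with an absolute one. As a rough illustration of how both kinds of path can be handled the same way, here is a minimal sketch. The helper resolve_outdir is hypothetical and is not part of rgugik; it only assumes that the target directory should be created when it is missing.

```r
# Hypothetical helper (an assumption for illustration, not rgugik code):
# create the output directory if needed and return its absolute path.
resolve_outdir = function(outdir = ".") {
  if (!dir.exists(outdir)) {
    dir.create(outdir, recursive = TRUE)
  }
  normalizePath(outdir, winslash = "/", mustWork = TRUE)
}

# work inside a temporary directory so nothing is left in the project folder
old_wd = setwd(tempdir())

rel = resolve_outdir("test")                          # relative to getwd()
abs = resolve_outdir(file.path(tempdir(), "test2"))   # absolute path

setwd(old_wd)
```

Both calls return an absolute path, so the download code that follows does not need to care which form the user supplied.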

kadyb commented 4 years ago

I have a few suggestions:

req_df = DEM_request(polygon)
ext = substr(req_df$URL, nchar(req_df$URL) - 3, nchar(req_df$URL))
ext
#>  [1] ".zip" ".zip" ".asc" ".zip" ".asc" ".las" ".zip" ".ttn" ".zip"
#> [10] ".zip" ".zip" ".zip" ".zip" ".asc" ".asc" ".las" ".zip" ".asc"
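As a side note, the same information can be obtained with tools::file_ext from base R, which returns the extension without the leading dot and does not assume a three-letter suffix. A minimal sketch follows; the URLs are made up for illustration, and real ones would come from req_df$URL.

```r
# Made-up URLs for illustration only
urls = c("https://example.com/data/tile_1.zip",
         "https://example.com/data/tile_2.asc",
         "https://example.com/data/tile_3.las")

# file_ext() drops the leading dot: "zip", not ".zip"
ext = tools::file_ext(urls)

# only archives are candidates for unzipping
is_zip = tolower(ext) == "zip"
```

Note that a comparison such as ext == "zip" only works when the dot is not part of the extracted string.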

Nowosad commented 4 years ago

@kadyb, good points. I can fix them during the next weekend. Feel free to work on them if you have some time earlier.

kadyb commented 4 years ago

Before your update, tile_download had this conditional structure:

if (!check_SHA) {
  # loop
} else {
   # loop
}

After the update it is:

loop {
   if (!check_SHA)
}

The next update adds another condition, if (unzip && ext == "zip"). Won't we lose significant performance by checking 3 conditions in every loop iteration?

Nowosad commented 4 years ago

Good question. My intuition is that this if statement will not cause any significant loss of performance. Downloading one file can take seconds or more, whereas executing one if statement is measured in nanoseconds (10^-9 of a second; see e.g. https://stackoverflow.com/a/34005546/2602477).
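A rough way to check this intuition on a given machine is to time a loop that does nothing except evaluate the conditions. This is only a sketch: the variable values are placeholders, and the loop body stands in for the real download code.

```r
n = 1e6            # number of loop iterations to time
check_SHA = TRUE   # placeholder values for the three conditions
unzip = TRUE
ext = "zip"

count = 0L
timing = system.time({
  for (i in seq_len(n)) {
    if (!check_SHA) next                            # first condition
    if (unzip && ext == "zip") count = count + 1L   # second and third
  }
})

# average cost of one iteration in seconds (includes loop overhead)
per_iter = timing[["elapsed"]] / n
```

Comparing per_iter with the time needed to download a single file shows how much the checks matter in practice.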

kadyb commented 4 years ago

I tested models3D_download after the update and it works:

models3D_download(TERYT = c("2462", "0401"), LOD = "LOD1",
                  outdir = "C:/Users/Krzysztof/Desktop/test1", unzip = FALSE)

models3D_download(TERYT = c("2462", "0401"), LOD = "LOD1",
                  outdir = "C:/Users/Krzysztof/Desktop/test2", unzip = TRUE)

models3D_download(TERYT = c("2462", "0401"), LOD = "LOD1",
                  outdir = "test1", unzip = FALSE)

models3D_download(TERYT = c("2462", "0401"), LOD = "LOD1",
                  outdir = "test2", unzip = TRUE)

kadyb commented 4 years ago

I tested tile_download after the update and it works:

### orto
req_df = orto_request(polygon)
tile_download(req_df[1, ], check_SHA = TRUE)
req_df[1, "sha1"]
as.character(openssl::sha1(file("41_3756_N-33-130-D-b-2-3.tif")))

### DEM 1 (zip)
req_df = DEM_request(polygon)
req_df$sha1[1] = "XXX" # fake SHA
tile_download(req_df[1, ], outdir = "test", check_SHA = TRUE)
# should return warning

### DEM 2 (asc)
tile_download(req_df[3, ], outdir = "test", check_SHA = TRUE)
req_df[3, "sha1"]
as.character(openssl::sha1(file("test/2730_318577_N-33-130-D-b-2-3.asc")))

### DEM 3 (zip)
tile_download(req_df[7, ], outdir = "test", check_SHA = TRUE, unzip = FALSE)
req_df[7, "sha1"]
as.character(openssl::sha1(file("test/5132_384975_N-33-130-D-b-2_asc.zip")))
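The manual comparisons above can be wrapped in a small function. The sketch below is not the check that tile_download performs internally; verify_sha1 is a hypothetical helper, and it assumes the openssl package is installed.

```r
# Hypothetical helper (not rgugik code): compare a file's SHA-1 checksum
# with the expected value and warn on mismatch.
verify_sha1 = function(path, expected) {
  con = file(path, open = "rb")
  on.exit(close(con))
  actual = as.character(openssl::sha1(con))
  ok = identical(tolower(actual), tolower(expected))
  if (!ok) {
    warning("SHA-1 mismatch for ", basename(path), call. = FALSE)
  }
  ok
}

if (requireNamespace("openssl", quietly = TRUE)) {
  # a temporary file containing the three bytes "abc"
  tmp = tempfile()
  writeBin(charToRaw("abc"), tmp)

  # well-known SHA-1 digest of "abc"
  ok_good = verify_sha1(tmp, "a9993e364706816aba3e25717850c26c9cd0d89d")

  # a fake checksum, as in the "XXX" test above; the warning is suppressed
  ok_bad = suppressWarnings(verify_sha1(tmp, "XXX"))
}
```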

kadyb commented 4 years ago

@Nowosad please review the code after my updates and try it on Linux. If everything is OK, please merge it. Before merging, please regenerate the docs; I didn't do it because generating Polish characters doesn't work for me.

The question of whether the documentation should include examples with relative and absolute paths has not been resolved. I think such examples would be too long (not minimal).