hypertidy / filearchy

write imagery to tiles
https://hypertidy.github.io/filearchy/
Other
4 stars 0 forks source link

wrong zoom level calculation? #1

Closed wiesehahn closed 6 months ago

wiesehahn commented 6 months ago

Thanks for the package, I tried to create tiles with the package, but I am struggling to get it working correctly. Lets say I have a raster file in a different CRS than EPSG:3857, and it is just covering a small area. Then tiles in low zoom levels should be non-existent/transparent/white or something like this, because zoom level 0 is covering the globe, correct? But it seems that the extent of the source raster is used as reference for zoom levels. Or am I doing something wrong?

(In relation to https://github.com/USDAForestService/gdalraster/issues/315)

library(filearchy)
library(gdalraster)

dsn <- system.file("extdata/storml_elev.tif", package = "gdalraster", mustWork = TRUE)

translate(dsn, "scaled.vrt", cl_arg = c("-scale", "-ot", "Byte"))
ds <- new(GDALRaster, "scaled.vrt", read_only = FALSE)

coltab <- cbind(0:255, t(col2rgb(hcl.colors(256))))
ds$setColorTable(1, coltab, "RGB")
ds$flushCache()  ## this writes the colour table to the scaled.vrt
plot_raster(ds)

tiles <- gdal_tiles(ds$getFilename())

ds$close()
vsi_unlink(ds$getFilename())

This results the following representation of tile 0/0/0 although its not covering the globe but just a tiny area 0

mdsumner commented 6 months ago

ah I see, no it will tile relative to the data you give it - it's effectively ungeoreferenced atm because we aren't also writing out the html. Is Mercator profile the one that you want? If you have a real example that would help, but if this gdalraster file is enough I'll just go with that.

I'll explore a bit more and make sure there's a way for tile 0 to be the whole globe with the data embedded locally in it. I'm actually not sure what gdal2tiles would do in this situation, but I think I've effectively created the "raster" profile and not the "mercator" or "geodetic" ones.

wiesehahn commented 6 months ago

Oh sorry, just read that there are different profiles available. So its not wrong of course, but I was assuming the default setting of gdal2tiles which is Mercator profile. I could send a file, but as I am also just testing this any file would do for me.

I didnt test for zoom level 0 but at higher zooms where the data source is not covering the entire tile in the zoom level, I think gdal2tiles embeds the downsampled data in the tile and leaves the rest blank.

zoom:13 2726 zoom:12 1363

For my use case rendering tiles for all zoom levels does not really make sense, it would be perfect to be able to define min and max zoom level instead.

mdsumner commented 6 months ago

I don't think I understood what "raster" profile meant until now, I had only tested on global inputs so it all worked. I think the example file in gdalraster is too tiny for the usual 0:20 zoom levels, so that will have to go in as a bit of handling. But, this is an excellent catch to make sure it scale right, thanks! I have a bit to do

mdsumner commented 6 months ago

yes, that was indeed my mistake - I can't do this without implementing only actual zooms that are relevant, and that means properly masking out tiles to not create - will take a bit longer.

mdsumner commented 6 months ago

Can you give me an example of a what a normal input raster will be like for your use-case? Are you looking at a county or a state? Just a bbox, crs, and resolution (or dimension) so I can do a few more checks on something representative. The storms raster is tiny, but as you can see it crosses a few tiles at zooms 10:12, and default resolution logic wouldn't generate the z=13 tiles, because the whole dataset fits inside a single tile at the resolution of zoom 12 (but I'm thinking about that).

Each label here is the zoom at the centre of a tile:

image

Also do you default expect xyz (tile y = 0 is at top) vs tms (tile y = 0 is at bottom) orientation? I'll template in the html also but not until I'm sure the tile writing is working properly.

A nice feature of this being in R is that dry_run mode will just give the scheme, so we can filter and plot and summarize the tiles that would be written, and use them in more abstract ways (including a quick slurp from the tiles by using VRT on the paths and the scheme params).

wiesehahn commented 6 months ago

Can you give me an example of a what a normal input raster will be like for your use-case? Are you looking at a county or a state? Just a bbox, crs, and resolution (or dimension) so I can do a few more checks on something representative.

Its hard to say as I have several ideas in mind, but it will be roughly from small areas with a couple of km² up to county level with ~100.000km². Resolution would be mostly in the range of 0.2 to 1m and CRS is EPSG:25832. Special in relation to e.g. global datasets such as GEBCO is, that data is often only available for parts of the XYZ tiles (e.g. along state boundaries). Also I guess I am more often only interested in tiles of high zoom levels (e.g. 14-18), which then only show up on top of a basemap when zoomed in.

Not exactly what I am working with atm but to give you some data here is some orthoimagery for a small area.

vrt_file <- here::here(tempdir(), "temp.vrt")
in_files <- c("https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325565698/2022-03-10/dop20rgb_32_556_5698_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325585698/2022-03-10/dop20rgb_32_558_5698_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325565700/2022-03-10/dop20rgb_32_556_5700_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325585700/2022-03-10/dop20rgb_32_558_5700_2_ni_2022-03-10.tif")

gdalraster::buildVRT(vrt_filename = vrt_file,
                     input_rasters = in_files)

Also do you default expect xyz (tile y = 0 is at top) vs tms (tile y = 0 is at bottom) orientation?

I dont really mind the default, until now I tried with XYZ.

mdsumner commented 6 months ago

Excellent thanks, GEBCO is filled for entire world bathymetry and topography ~500m

It's SRTM that's land based, 30m and it's in these VRTs that aren't efficient for large areas, but I patch GEBCO or another layer in for general use (it's easy to do but hosting the overviews obviously takes thought, something I'm worrking on).

I'm doing a bit of a rewrite 🙏

mdsumner commented 6 months ago

That is a great example, I have flushed a whole lot of problems out. Note that your native data zoom is 20 so there's an immense number of tiles for that (~40000). The arguments have changed a bit, use 'dry_run', 'overwrite' and 'update' to do-nothing, clobber-delete, write-in-place respectively. You can let it figure out all zooms up to native by default, or enter whatever zoom values you like (there's integer overflow issues at 23 or above or so and I haven't dealt with that)

I first find the number of zooms and tiles that would be create with dry run, then pick zoom 18 and run that in parallel. It's pretty fast. I expect to find a few more problems, but I really appreciate the motivation I have learned a lot!

  in_files <- c("https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325565698/2022-03-10/dop20rgb_32_556_5698_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325585698/2022-03-10/dop20rgb_32_558_5698_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325565700/2022-03-10/dop20rgb_32_556_5700_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325585700/2022-03-10/dop20rgb_32_558_5700_2_ni_2022-03-10.tif")

gdalraster::buildVRT(vrt_filename = tf <- tempfile(fileext = ".vrt"), 
                     input_rasters = sprintf("/vsicurl/%s", in_files))
#> 0...10...20...30...40...50...60...70...80...90...100 - done.

library(filearchy)
tiles0 <- gdal_tiles(tf, dry_run = TRUE)
tiles0 |> dplyr::filter(zoom == max(zoom))
#> # A tibble: 29,241 × 11
#>       tile tile_col tile_row  zoom   xmin   xmax   ymin   ymax  ncol  nrow crs  
#>      <dbl>    <dbl>    <dbl> <int>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>
#>  1 3.66e11   552848   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  2 3.66e11   552849   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  3 3.66e11   552850   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  4 3.66e11   552851   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  5 3.66e11   552852   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  6 3.66e11   552853   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  7 3.66e11   552854   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  8 3.66e11   552855   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  9 3.66e11   552856   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#> 10 3.66e11   552857   699536    20 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#> # ℹ 29,231 more rows
tiles0 |> dplyr::filter(zoom == 18)
#> # A tibble: 1,892 × 11
#>       tile tile_col tile_row  zoom   xmin   xmax   ymin   ymax  ncol  nrow crs  
#>      <dbl>    <dbl>    <dbl> <int>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>
#>  1 2.29e10   138212   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  2 2.29e10   138213   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  3 2.29e10   138214   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  4 2.29e10   138215   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  5 2.29e10   138216   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  6 2.29e10   138217   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  7 2.29e10   138218   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  8 2.29e10   138219   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#>  9 2.29e10   138220   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#> 10 2.29e10   138221   174884    18 1.09e6 1.09e6 6.70e6 6.70e6   256   256 EPSG…
#> # ℹ 1,882 more rows

Created on 2024-04-25 with reprex v2.0.2

Now, commit to writing the actual tiles here only at zoom 18:

#options(parallelly.fork.enable = TRUE, future.rng.onMisuse = "ignore")
#plan(multicore)
tiles <- gdal_tiles(tf, dry_run = FALSE, output_dir = "z18_20", zoom = 18)
#plan(sequential)

We can only use furrr for parallelization, but you can get the tiles and run the tile write on its own a different way. Hoping to unpick all that and make it easy.

mdsumner commented 6 months ago

here it is sitting with a bit of the tiles at zooms 14, 16, 18

image

I want to make all this stuff easy to work with and illustrate, been chipping away for ages.

mdsumner commented 6 months ago

Here's the whole workflow, generating HTML - this is in TMS mode (zero is the bottom row). I'm just slowly incrementing in to find the right interface for pushing into the HTML, leaflet only for now. There's a few tricks to keep the string templating safe from all the weird html.


leaflet_template <- r"(<!DOCTYPE html>
        <html lang="en">
          <head>
            <meta charset="utf-8">
            <meta name='viewport' content='width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no' />
            <title>%(xml_escaped_title)s</title>

            <!-- Leaflet -->
            <link rel="stylesheet" href="https://unpkg.com/leaflet@0.7.5/dist/leaflet.css" />
            <script src="https://unpkg.com/leaflet@0.7.5/dist/leaflet.js"></script>

            <style>
                body { margin:0; padding:0; }
                body, table, tr, td, th, div, h1, h2, input { font-family: "Calibri", "Trebuchet MS", "Ubuntu", Serif; font-size: 11pt; }
                #map { position:absolute; top:0; bottom:0; width:100%; } /* full size */
                .ctl {
                    padding: 2px 10px 2px 10px;
                    background: white;
                    background: rgba(255,255,255,0.9);
                    box-shadow: 0 0 15px rgba(0,0,0,0.2);
                    border-radius: 5px;
                    text-align: right;
                }
                .title {
                    font-size: 18pt;
                    font-weight: bold;
                }
                .src {
                    font-size: 10pt;
                }

            </style>

        </head>
        <body>

        <div id="map"></div>

        <script>
        /* **** Leaflet **** */

        // Base layers
        //  .. OpenStreetMap
        var osm = L.tileLayer('http://{s}.tile.osm.org/{z}/{x}/{y}.png', {attribution: '&copy; <a href="http://osm.org/copyright">OpenStreetMap</a> contributors', minZoom: %(minzoom)s, maxZoom: %(maxzoom)s});

        //  .. CartoDB Positron
        var cartodb = L.tileLayer('http://{s}.basemaps.cartocdn.com/light_all/{z}/{x}/{y}.png', {attribution: '&copy; <a href="http://www.openstreetmap.org/copyright">OpenStreetMap</a> contributors, &copy; <a href="http://cartodb.com/attributions">CartoDB</a>', minZoom: %(minzoom)s, maxZoom: %(maxzoom)s});

        //  .. OSM Toner
        var toner = L.tileLayer('http://{s}.tile.stamen.com/toner/{z}/{x}/{y}.png', {attribution: 'Map tiles by <a href="http://stamen.com">Stamen Design</a>, under <a href="http://creativecommons.org/licenses/by/3.0">CC BY 3.0</a>. Data by <a href="http://openstreetmap.org">OpenStreetMap</a>, under <a href="http://www.openstreetmap.org/copyright">ODbL</a>.', minZoom: %(minzoom)s, maxZoom: %(maxzoom)s});

        //  .. White background
        var white = L.tileLayer("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAQAAAAEAAQMAAABmvDolAAAAA1BMVEX///+nxBvIAAAAH0lEQVQYGe3BAQ0AAADCIPunfg43YAAAAAAAAAAA5wIhAAAB9aK9BAAAAABJRU5ErkJggg==", {minZoom: %(minzoom)s, maxZoom: %(maxzoom)s});

        // Overlay layers (TMS)
        var lyr = L.tileLayer('./{z}/{x}/{y}.%(tileformat)s', {tms: %(tms)s, opacity: 0.7, attribution: "%(copyright)s", minZoom: %(minzoom)s, maxZoom: %(maxzoom)s});

        // Map
        var map = L.map('map', {
            center: [%(centerlon)s, %(centerlat)s],
            zoom: %(beginzoom)s,
            minZoom: %(minzoom)s,
            maxZoom: %(maxzoom)s,
            layers: [osm]
        });

        var basemaps = {"OpenStreetMap": osm, "CartoDB Positron": cartodb, "Stamen Toner": toner, "Without background": white}
        var overlaymaps = {"Layer": lyr}

        // Title
        var title = L.control();
        title.onAdd = function(map) {
            this._div = L.DomUtil.create('div', 'ctl title');
            this.update();
            return this._div;
        };
        title.update = function(props) {
            this._div.innerHTML = "%(double_quote_escaped_title)s";
        };
        title.addTo(map);

        // Note
        var src = 'Generated by <a href="https://gdal.org/programs/gdal2tiles.html">GDAL2Tiles</a>, Copyright &copy; 2008 <a href="http://www.klokan.cz/">Klokan Petr Pridal</a>,  <a href="https://gdal.org">GDAL</a> &amp; <a href="http://www.osgeo.org/">OSGeo</a> <a href="http://code.google.com/soc/">GSoC</a>';
        var title = L.control({position: 'bottomleft'});
        title.onAdd = function(map) {
            this._div = L.DomUtil.create('div', 'ctl src');
            this.update();
            return this._div;
        };
        title.update = function(props) {
            this._div.innerHTML = src;
        };
        title.addTo(map);

        // Add base layers
        L.control.layers(basemaps, overlaymaps, {collapsed: false}).addTo(map);

        // Fit to overlay bounds (SW and NE points with (lat, lon))
        map.fitBounds([[%(south)s, %(east)s], [%(north)s, %(west)s]]);

        </script>

        </body>
        </html>

)"
#llex <- c(-85.05112877764508, 179.9999999749438, 85.05112877764508, -179.9999999749438)[c(4, 2, 1, 3)]
#llex <- c(9.80554836708451, 9.86375908266854, 51.4300214435954, 51.4663945832837)

in_files <- c("https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325565698/2022-03-10/dop20rgb_32_556_5698_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325585698/2022-03-10/dop20rgb_32_558_5698_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325565700/2022-03-10/dop20rgb_32_556_5700_2_ni_2022-03-10.tif",
              "https://dop20-rgb.opengeodata.lgln.niedersachsen.de/325585700/2022-03-10/dop20rgb_32_558_5700_2_ni_2022-03-10.tif")

gdalraster::buildVRT(vrt_filename = tf <- tempfile(fileext = ".vrt"),
                     input_rasters = sprintf("/vsicurl/%s", in_files))

## use the source to determine our longlat extent for the HTML
info <- vapour::vapour_raster_info(tf)
llex <- reproj::reproj_extent(info$extent, "EPSG:4326", source = info$projection)

zooms <- c(12:19)
tosubs <- read.table(text = "
%\\(xml_escaped_title\\)s
%\\(minzoom\\)s
%\\(maxzoom\\)s
%\\(tileformat\\)s
%\\(tms\\)s
%\\(copyright\\)s
%\\(centerlon\\)s
%\\(centerlat\\)s
%\\(beginzoom\\)s
%\\(double_quote_escaped_title\\)s
%\\(south\\)s
%\\(east\\)s
%\\(north\\)s
%\\(west\\)s")[[1]]

title <- "my tiled raster"
minzoom <- min(zooms)
maxzoom <- max(zooms)
tileformat <- "png"
tms <- 1   ## or xyz, 0 presumably
copyright <- "our stuff"
centerlon <- sprintf("%01f", mean(llex[1:2]))

centerlat <- sprintf("%01f", mean(llex[3:4]))
beginzoom <- max(zooms)
double_quote_escaped_title <- "title"
south <- llex[3]
east <- llex[2]
north <- llex[4]
west <- llex[1]

values <- c(title, minzoom, maxzoom, tileformat, tms, copyright, centerlon, centerlat,
            beginzoom, double_quote_escaped_title,
            south, east, north, west)

for (i in seq_along(tosubs)) {
  leaflet_template <- gsub(tosubs[i], values[i], leaflet_template)
}

odir <- "tiles_12_19"

options(parallelly.fork.enable = TRUE, future.rng.onMisuse = "ignore")
library(furrr)
library(filearchy)
plan(multicore)
tiles <- gdal_tiles(tf, dry_run = FALSE, output_dir = odir, zoom = zooms)
plan(sequential)
writeLines(leaflet_template, file.path(odir, "my.html"))

That generates 800Mb of tiles, but pretty quickly with the parallelization. I'm abusing parallelly options because I want to do it in RStudio for now.

mdsumner commented 6 months ago

I went to zoom 20 as well, 200 seconds on 32 cores, ends up as 1.5Gb of 40000 png.

mdsumner commented 6 months ago

This is working now, please submit new issues as needed.