njtierney / geotargets

Targets extensions for geospatial data
https://njtierney.github.io/geotargets/

Best way to use SpatRaster tiles? #69

Open Aariq opened 1 month ago

Aariq commented 1 month ago

I'm working with a large SpatRaster and running out of memory when doing pixel-wise computations on the whole thing, so I think the best approach is to split it into tiles with makeTiles(), then (ideally) work on each tile with dynamic branching and recombine the results. I'm having trouble getting this to work, though, and I'm not sure which pieces of this should be handled by geotargets. Here's what I've got so far:

library(targets)
tar_script({
    library(targets)
    library(geotargets)
    library(terra)

    make_tiles <- function(raster) {
        rast_name <- as.character(rlang::ensym(raster))
        x <- terra::rast(ncols = 2, nrows = 2) 
        ext(x) <- ext(raster)
        fs::dir_create("tiles")
        makeTiles(raster, x, filename = fs::path("tiles", fs::path_ext_set(rast_name, "tiff")), overwrite = TRUE)
    }

    list(
        tar_terra_rast(
            rast_example,
            terra::rast(system.file("ex/logo.tif", package="terra"))
        ),
        tar_target(
            rast_tiles,
            make_tiles(rast_example),
            format = "file"
        ),
        #for each tile, calculate mean for each pixel across the three layers
        tar_terra_rast(
            mean_tiles,
            mean(rast(rast_tiles)),
            pattern = map(rast_tiles)
        )
    )
})
tar_make()
#> terra 1.7.71
#> ▶ dispatched target rast_example
#> ● completed target rast_example [0.016 seconds]
#> ▶ dispatched target rast_tiles
#> ● completed target rast_tiles [0.077 seconds]
#> ▶ ended pipeline [0.294 seconds]
#> Error:
#> ! Error running targets::tar_make()
#> Error messages: targets::tar_meta(fields = error, complete_only = TRUE)
#> Debugging guide: https://books.ropensci.org/targets/debugging.html
#> How to ask for help: https://books.ropensci.org/targets/help.html
#> Last error message:
#>    Target mean_tiles tried to branch over rast_tiles, which is illegal. Patterns must only branch over explicitly 
#> declared targets in the pipeline. Stems and patterns are fine, but you cannot branch over branches or global 
#> objects. Also, if you branch over a target with format = "file", then that target must also be a pattern.
#> Last error traceback: <excluded for brevity>

tar_read(rast_tiles)
#> [1] "tiles/rast_example1.tiff" "tiles/rast_example2.tiff"
#> [3] "tiles/rast_example3.tiff" "tiles/rast_example4.tiff"

Created on 2024-05-14 with reprex v2.1.0

I'm not sure how to make rast_tiles also be a pattern here.

This is somewhat related to #53

Aariq commented 1 month ago

Ah, OK, I've gotten this to work by using tarchetypes::tar_files() for the rast_tiles target, but there's still the question of what role geotargets should play in making this all work. Is there a need for some kind of tar_terra_tiles() function, i.e. a target factory similar to tarchetypes::tar_files() but with a custom format supplied?
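
A minimal sketch of that workaround, assuming the same make_tiles() helper and example raster as in the reprex above, and that tarchetypes is installed. It is only one way to write it, not something geotargets provides. tar_files() expands into an upstream target (rast_tiles_files) that runs the command and a downstream pattern (rast_tiles) that tracks each path with format = "file", which is what allows mean_tiles to branch over the tiles.

```r
library(targets)
tar_script({
    library(targets)
    library(tarchetypes)
    library(geotargets)
    library(terra)

    # Same helper as in the reprex: write a 2 x 2 grid of tiles, return their paths
    make_tiles <- function(raster) {
        rast_name <- as.character(rlang::ensym(raster))
        x <- terra::rast(ncols = 2, nrows = 2)
        ext(x) <- ext(raster)
        fs::dir_create("tiles")
        makeTiles(raster, x, filename = fs::path("tiles", fs::path_ext_set(rast_name, "tiff")), overwrite = TRUE)
    }

    list(
        tar_terra_rast(
            rast_example,
            terra::rast(system.file("ex/logo.tif", package = "terra"))
        ),
        # tar_files() creates two targets: rast_tiles_files (runs make_tiles())
        # and rast_tiles (a pattern over the returned paths, format = "file")
        tar_files(
            rast_tiles,
            make_tiles(rast_example)
        ),
        # one branch per tile: per-pixel mean across the three layers
        tar_terra_rast(
            mean_tiles,
            mean(rast(rast_tiles)),
            pattern = map(rast_tiles)
        )
    )
})
tar_make()
```

Running targets::tar_manifest() afterwards should list both rast_tiles_files and rast_tiles alongside the other two targets.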

brownag commented 1 month ago

I like the idea of dedicated functions for working with tiling. Over the weekend I tried a targets wrapper around terra::vrt(), which can help with managing the tiles. I could definitely see something like that facilitating work with dynamically branched SpatRaster targets.
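
For reference, a small terra-only sketch of what vrt() does with tiles. This is not the targets wrapper mentioned above, which isn't shown in this thread. It writes tiles of the example raster to a temporary directory and then builds a virtual raster that presents them as a single SpatRaster. The file names and the 2 x 2 grid are illustrative assumptions.

```r
library(terra)

r <- rast(system.file("ex/logo.tif", package = "terra"))

# 2 x 2 template covering the same extent as the raster (grid size is arbitrary)
template <- rast(ncols = 2, nrows = 2)
ext(template) <- ext(r)

# Write one file per tile; makeTiles() returns the file names it created
tile_dir <- file.path(tempdir(), "tiles")
dir.create(tile_dir, showWarnings = FALSE)
tile_files <- makeTiles(
    r, template,
    filename = file.path(tile_dir, "logo_.tif"),
    overwrite = TRUE
)

# Build a VRT file that references the tiles and open it as one SpatRaster
v <- vrt(tile_files, filename = file.path(tile_dir, "logo.vrt"), overwrite = TRUE)
v
```

The VRT itself is only a small XML file pointing at the tiles, so the pixel data stay in the individual tile files.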

Aariq commented 1 month ago

Update: tar_files() is probably not the right approach because the upstream target (the one that does the work, in this case) is always re-run. So while this works, it means that the tile creation step runs every time the pipeline is run, which is definitely not ideal. So it does seem like a custom geotargets function would be nice here. I also tried using vrt() for this problem, but I couldn't use it in this case because I needed to call app() on the result downstream, and app() doesn't work on VRTs.