rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0
543 stars 90 forks source link

terra::rotate is repeating layers above #32768 #1562

Closed sruzzante closed 4 months ago

sruzzante commented 4 months ago

When terra::rotate is used on a large raster stack, sometimes the layers >32768 are incorrect in the resulting raster stack. Layer 32769 is actually layer 1 from the original raster (rotated), layer 32770 is layer 2, etc.

I was rotating 115 years of daily climate data (1900-2014), and I noticed that the rotated raster layer for 1989-09-19 was identical to 1900-01-01. This suggests that the layer index is getting stored as a 16-bit integer (-32,768 to 32,767) somewhere, resulting in the looping behaviour.

This doesn't happen with all data files. I found it happens with relatively fine resolution data (~0.7 degrees) but not with coarser resolution data (~2 degrees).

This strongly suggests that the layer index is getting stored as a 16-bit integer somewhere, resulting in the looping behaviour.

The data can be downloaded here: https://aims2.llnl.gov/search/cmip6/?project=CMIP6&activeFacets=%7B%22activity_id%22%3A%22CMIP%22%2C%22source_id%22%3A%22EC-Earth3%22%2C%22experiment_id%22%3A%22historical%22%2C%22nominal_resolution%22%3A%22100+km%22%2C%22variant_label%22%3A%22r4i1p1f1%22%2C%22grid_label%22%3A%22gr%22%2C%22table_id%22%3A%22day%22%2C%22variable_id%22%3A%22tas%22%2C%22data_node%22%3A%22esgf-data1.llnl.gov%22%7D

You only need the 1900 and 1989 files to make the MWE work. Note that this used 78.23 GB of RAM for me. I couldn't reproduce the behaviour with smaller files.

fls<-c(
  rep("tas_day_EC-Earth3_historical_r4i1p1f1_gr_19000101-19001231.nc",89),
  "tas_day_EC-Earth3_historical_r4i1p1f1_gr_19890101-19891231.nc"
) # load the 1900 file 89 times and the 1989 file once

xRast<-terra::rast(fls)

xRast.rotate<-terra::rotate(xRast)

xDiff <- xRast.rotate[[32769]]-xRast.rotate[[1]] # take difference of layer #32769 and layer #1
xDiff # see that xDiff is all zero
#> class       : SpatRaster 
#> dimensions  : 256, 512, 1  (nrow, ncol, nlyr)
#> resolution  : 0.703125, 0.7016692  (x, y)
#> extent      : -180.3516, 179.6484, -89.81366, 89.81366  (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326) 
#> source(s)   : memory
#> varname     : rotatedRast 
#> name        : tas_284 
#> min value   :       0 
#> max value   :       0 
#> time (days) : 1989-10-11 
rhijmans commented 4 months ago

I changed the unsigned variables (that should go up to at least 4294967295) in "rotate", to size_t (that can hold a much larger number on 64-bit systems). I have not tested it but I think that should fix thsi.

I think that using unsigned was not enough here, because, at 32768 layers, the offset of the values is computed as 32768 * 256 * 512 = 2^32 = 4294967296. I think it is safe to assume that size_t can hold a larger number on your system,

This explains why this does not happen on lower spatial resolution files. For example, for 115 years daily data at 1 degree spatial resolution you get (115 * 365.25 * 180 * 360) < 4294967295

The higher res data may sometimes work; when "terra" processes the file in chunks because RAM is deemed insufficient to load the entire dataset into it. You can force that to happen by setting a lower max RAM usage with terraOptions

sruzzante commented 4 months ago

Thanks for the speedy fix! This does seem to have fixed the issue.