r-spatial / stars

Spatiotemporal Arrays, Raster and Vector Data Cubes
https://r-spatial.github.io/stars/
Apache License 2.0
560 stars 94 forks source link

match climatic layers with their temporal dimension #713

Open rociobeatrizc opened 1 week ago

rociobeatrizc commented 1 week ago

Hello from Bologna, Italy! This is the first Issue I open on GitHub :)

I've been trying to solve a problem for days. I downloaded and imported 4 bioclimatic variables from CHELSA for 3 different months: January, February, and March. I created three stacks, one for each month, containing the variables. Here is how the files look:

> jan_stack
class       : SpatRaster 
dimensions  : 146, 212, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : 13.01653, 14.78319, 41.68319, 42.89986  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : CHELSA_bio_01_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_07_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_13_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_14_1981-2010_V2.1_clipped.tif  
names       : CHELSA_~10_V2.1, CHELSA_~10_V2.1, CHELSA_~10_V2.1, CHELSA_~10_V2.1 
min values  :           -0.95,            16.2,            59.8,            18.9 
max values  :           16.65,            29.5,           241.7,            92.8 

> feb_stack
class       : SpatRaster 
dimensions  : 146, 212, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : 13.01653, 14.78319, 41.68319, 42.89986  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : CHELSA_bio_01_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_07_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_13_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_14_1981-2010_V2.1_clipped.tif  
names       : CHELSA_~10_V2.1, CHELSA_~10_V2.1, CHELSA_~10_V2.1, CHELSA_~10_V2.1 
min values  :           -0.95,            16.2,            59.8,            18.9 
max values  :           16.65,            29.5,           241.7,            92.8 

> mar_stack
class       : SpatRaster 
dimensions  : 146, 212, 4  (nrow, ncol, nlyr)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : 13.01653, 14.78319, 41.68319, 42.89986  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326) 
sources     : CHELSA_bio_01_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_07_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_13_1981-2010_V2.1_clipped.tif  
              CHELSA_bio_14_1981-2010_V2.1_clipped.tif  
names       : CHELSA_~10_V2.1, CHELSA_~10_V2.1, CHELSA_~10_V2.1, CHELSA_~10_V2.1 
min values  :           -0.95,            16.2,            59.8,            18.9 
max values  :           16.65,            29.5,           241.7,            92.

My goal is to create data cubes, so I need to add the temporal dimension. In this case, since these are climate series, each layer represents a monthly average over a 30-year period. Therefore, I thought of simply assigning a representative date of the month to each stack.

But if I proceed to create a 'stars' object, I get an object with a single attribute and three dimensions.

> jan_stars <- st_as_stars(jan_stack)
> jan_stars
stars object with 3 dimensions and 1 attribute
attribute(s):
                                 Min. 1st Qu. Median     Mean 3rd Qu.  Max.
CHELSA_bio_01_1981-2010_V2....  -0.95   16.55   28.1 50.30252    71.1 241.7
dimension(s):
           from  to offset     delta refsys point                                                        values x/y
x             1 212  13.02  0.008333 WGS 84 FALSE                                                          NULL [x]
y             1 146   42.9 -0.008333 WGS 84 FALSE                                                          NULL [y]
attributes    1   4     NA        NA     NA    NA CHELSA_bio_01_1981-2010_V2.1,...,CHELSA_bio_14_1981-2010_V2.1

'split' option

At this point, I tried to convert the 'attributes' dimension into separate attributes using 'split', but I lost one dimension and couldn't find any way to add it back while keeping the four attributes in place.

> split(jan_stars)
stars object with 2 dimensions and 4 attributes
attribute(s):
                               Min. 1st Qu. Median      Mean 3rd Qu.   Max.
CHELSA_bio_01_1981-2010_V2.1  -0.95    9.35  12.35  12.03118   15.45  16.65
CHELSA_bio_07_1981-2010_V2.1  16.20   25.30  27.20  25.69690   27.80  29.50
CHELSA_bio_13_1981-2010_V2.1  59.80   84.90 111.20 119.10602  147.40 241.70
CHELSA_bio_14_1981-2010_V2.1  18.90   33.20  43.90  44.37600   53.60  92.80
dimension(s):
  from  to offset     delta refsys point x/y
x    1 212  13.02  0.008333 WGS 84 FALSE [x]
y    1 146   42.9 -0.008333 WGS 84 FALSE [y]

manually add date

However, if I manually add the date, I'm left with only one attribute and I can no longer recover the other three. I need to keep this information as it will be needed for creating an SDM.

# Random January date
my_date <- as.Date("2009-01-01")

# Since I have 4 bands, I repeated the same date
repeated_dates <- rep(my_date, 4)

jan_time <- stars::st_set_dimensions(jan_stars,
      3,
      values = repeated_dates,
      names = "date")
jan_time

stars object with 3 dimensions and 1 attribute
attribute(s):
                                 Min. 1st Qu. Median     Mean 3rd Qu.  Max.
CHELSA_bio_01_1981-2010_V2....  -0.95   16.55   28.1 50.30252    71.1 241.7
dimension(s):
     from  to offset     delta refsys point                    values x/y
x       1 212  13.02  0.008333 WGS 84 FALSE                      NULL [x]
y       1 146   42.9 -0.008333 WGS 84 FALSE                      NULL [y]
date    1   4     NA        NA   Date    NA 2009-01-01,...,2009-01-01 

I have the impression that the solution is simple, but I just can't seem to find it. I really hope for some help. Thank you so much in advance!

edzer commented 1 week ago

Since you closed the issue, did you solve it?

rociobeatrizc commented 1 week ago

Good evening. Yes, it was a very silly problem. I couldn't set the time dimension correctly, but it was just a matter of adding the dates. The initial situation, when I converted the raster stack into a stars object, was the following:

> jan_stars
stars object with 3 dimensions and 1 attribute
attribute(s):
                                 Min. 1st Qu. Median     Mean 3rd Qu.  Max.
CHELSA_bio_01_1981-2010_V2....  -0.95   16.55   28.1 50.30252    71.1 241.7
dimension(s):
           from  to offset     delta refsys point                                                        values x/y
x             1 212  13.02  0.008333 WGS 84 FALSE                                                          NULL [x]
y             1 146   42.9 -0.008333 WGS 84 FALSE                                                          NULL [y]
attributes    1   4     NA        NA     NA    NA CHELSA_bio_01_1981-2010_V2.1,...,CHELSA_bio_14_1981-2010_V2.1 

To add the temporal dimension (the same date for each attribute, which are 4) I followed the following steps, which are probably a bit 'rough' but work

# list with the stars object repeated 4 times
l = vector("list", 4)
l[] = list(jan_stars)

# temporary names
names(l) = c("w", "x", "y", "z")

# apply the list of names will result in another dimension: now we have 4 dimensions and 1 attribute
r = do.call("c", l)                                                                                                        
r = st_redimension(r)
> r
stars object with 4 dimensions and 1 attribute
attribute(s):
          Min. 1st Qu. Median     Mean 3rd Qu.  Max.
w.x.y.z  -0.95   16.55   28.1 50.30252    71.1 241.7
dimension(s):
           from  to offset     delta refsys point                                                        values x/y
x             1 212  13.02  0.008333 WGS 84 FALSE                                                          NULL [x]
y             1 146   42.9 -0.008333 WGS 84 FALSE                                                          NULL [y]
attributes    1   4     NA        NA     NA    NA CHELSA_bio_01_1981-2010_V2.1,...,CHELSA_bio_14_1981-2010_V2.1    
new_dim       1   4     NA        NA     NA    NA                                                       w,...,z    

# with split, the third dimension, which actually contains the bioclimatic variables, is turned into attributes
> r = split(r, 3)
> r
stars object with 3 dimensions and 4 attributes
attribute(s):
                               Min. 1st Qu. Median      Mean 3rd Qu.   Max.
CHELSA_bio_01_1981-2010_V2.1  -0.95    9.35  12.35  12.03118   15.45  16.65
CHELSA_bio_07_1981-2010_V2.1  16.20   25.30  27.20  25.69690   27.80  29.50
CHELSA_bio_13_1981-2010_V2.1  59.80   84.90 111.20 119.10602  147.40 241.70
CHELSA_bio_14_1981-2010_V2.1  18.90   33.20  43.90  44.37600   53.60  92.80
dimension(s):
        from  to offset     delta refsys point  values x/y
x          1 212  13.02  0.008333 WGS 84 FALSE    NULL [x]
y          1 146   42.9 -0.008333 WGS 84 FALSE    NULL [y]
new_dim    1   4     NA        NA     NA    NA w,...,z    

# Finally, let's replace the new_dim with the temporal dimension                                                                                                                                                                                               
r <- st_set_dimensions(.x = r, which = "new_dim", values = as.Date(c("2009-01-01", "2009-01-01", "2009-01-01","2009-01-01")), names = "time")
> r
stars object with 3 dimensions and 4 attributes
attribute(s):
                               Min. 1st Qu. Median      Mean 3rd Qu.   Max.
CHELSA_bio_01_1981-2010_V2.1  -0.95    9.35  12.35  12.03118   15.45  16.65
CHELSA_bio_07_1981-2010_V2.1  16.20   25.30  27.20  25.69690   27.80  29.50
CHELSA_bio_13_1981-2010_V2.1  59.80   84.90 111.20 119.10602  147.40 241.70
CHELSA_bio_14_1981-2010_V2.1  18.90   33.20  43.90  44.37600   53.60  92.80
dimension(s):
     from  to offset     delta refsys point                    values x/y
x       1 212  13.02  0.008333 WGS 84 FALSE                      NULL [x]
y       1 146   42.9 -0.008333 WGS 84 FALSE                      NULL [y]
time    1   4     NA        NA   Date    NA 2009-01-01,...,2009-01-01

I apologize if I improperly opened an issue on GitHub with this problem!

edzer commented 1 week ago

Thanks for the clarification!

rociobeatrizc commented 1 week ago

Good evening, I'm reopening the issue because, although the method I wrote above seems to work formally, when I try to select a date, I cannot retrieve the corresponding image. So, I think I might have incorrectly set the temporal dimension. I'll start from the beginning:

I used this function to download three climatic variables (precipitation, temperature, and maximum temperature): for each of these variables, I downloaded the first 4 months of the year.

# devtools required
if(!require(devtools)) install.packages("devtools")
library(devtools)
devtools::install_github("HelgeJentsch/ClimDatDownloadR")
library(ClimDatDownloadR)

# shapefile and bounding box
aoi_abruzzo <- st_read("abruzzo.shp") %>% .$geometry 

# Bounding Box 
abruzzo_bb <- st_bbox(aoi_abruzzo)

WorldClim.HistClim.download(
  # save.location = "./",
  parameter = c("prec", "temp", "tmax"),

  # 4 months, from jan to april
  month.var = c(1:4),

  # here, 10 arc-minutes are chosen as resolution 
  resolution = c("10m"),
  version.var = c("2.1"),
  clipping = TRUE,
  clip.shapefile = aoi_abruzzo,
  clip.extent = abruzzo_bb,
  # buffer = 0,
  convert.files.to.asc = FALSE,
  stacking.data = TRUE,
  keep.raw.zip = FALSE,
  delete.raw.data = FALSE,
  save.bib.file = TRUE
)

As a result, I have 4 .tif files for each variable, each representing a month. This is, for example, the precipitation variable imported as a stack:

# Import raster
# String containing the names of raster files
rastlist <- list.files(path ="my/path/prec/WorldClim_2.1_prec_30s/clipped_2024-10-09_16-35-06", pattern = "wc2.1", full.names = TRUE)

# Using the list of names, all the files are imported into a single raster package
stack_prec <- stack(rastlist)
> stack_prec
class      : RasterStack 
dimensions : 145, 212, 30740, 4  (nrow, ncol, ncell, nlayers)
resolution : 0.008333333, 0.008333333  (x, y)
extent     : 13.01667, 14.78333, 41.68333, 42.89167  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs 
names      : wc2.1_30s_prec_01, wc2.1_30s_prec_02, wc2.1_30s_prec_03, wc2.1_30s_prec_04 
min values :                37,                34,                35,                32 
max values :               102,                99,                84,               106 

I have 4 layers, each corresponding to a month. I would like to create a stars object where, by adding and associating the temporal dimension to each layer, I am able to extract the precipitation corresponding to that month using the date.

These are the steps I followed

# stars object
stars_prec <- st_as_stars(stack_prec)

> stars_prec
stars object with 3 dimensions and 1 attribute
attribute(s):
                   Min. 1st Qu. Median     Mean 3rd Qu. Max.  NA's
wc2.1_30s_prec_01    32      48     53 55.19539      60  106 24480
dimension(s):
     from  to offset     delta                       refsys                                  values x/y
x       1 212  13.02  0.008333 +proj=longlat +datum=WGS8...                                    NULL [x]
y       1 145  42.89 -0.008333 +proj=longlat +datum=WGS8...                                    NULL [y]
band    1   4     NA        NA                          NA wc2.1_30s_prec_01,...,wc2.1_30s_prec_04 

Already at this point, I can see that the problem is that it only considers the first month of precipitation (wc2.1_30s_prec_01) as an 'attribute' and does not include all 4 of them.

To add the temporal dimension, I tried the following, based on an answer here on Stack Overflow.

# add temporal dimension 
l = vector("list", 4)
l[] = list(stars_prec)
r = do.call("c", l) 
r = st_redimension(r)
# 4 random days of the 4 months
r <- st_set_dimensions(.x = r, which = "new_dim", values = as.Date(c("1980-01-01", "1980-02-01", "1980-03-01","1980-04-01")), names = "time") 
> r
stars object with 4 dimensions and 1 attribute
attribute(s):
                                Min. 1st Qu. Median     Mean 3rd Qu. Max.  NA's
wc2.1_30s_prec_01.wc2.1_30s...    32      48     53 55.19539      60  106 97920
dimension(s):
     from  to offset     delta                       refsys                                  values x/y
x       1 212  13.02  0.008333 +proj=longlat +datum=WGS8...                                    NULL [x]
y       1 145  42.89 -0.008333 +proj=longlat +datum=WGS8...                                    NULL [y]
band    1   4     NA        NA                           NA wc2.1_30s_prec_01,...,wc2.1_30s_prec_04    
time    1   4     NA        NA                         Date               1980-01-01,...,1980-04-01 

I set the temporal dimension, but the attribute still only includes the precipitation values for the first month. If I apply the split function on the third dimension, I get this:

> split(r,3)
stars object with 3 dimensions and 4 attributes
attribute(s):
                   Min. 1st Qu. Median     Mean 3rd Qu. Max.  NA's
wc2.1_30s_prec_01    37      44     48 49.30158      51  102 24480
wc2.1_30s_prec_02    34      49     52 53.25581      56   99 24480
wc2.1_30s_prec_03    35      50     53 53.35853      56   84 24480
wc2.1_30s_prec_04    32      61     66 64.86564      70  106 24480
dimension(s):
     from  to offset     delta                       refsys                    values x/y
x       1 212  13.02  0.008333 +proj=longlat +datum=WGS8...                      NULL [x]
y       1 145  42.89 -0.008333 +proj=longlat +datum=WGS8...                      NULL [y]
time    1   4     NA        NA                         Date 1980-01-01,...,1980-04-01  

Each month becomes a separate attribute! I don't know how to solve this problem; I feel like the solution is simple, but I've been stuck for days. I would like to obtain the classic data cube, like this. Many thanks in advance.