r-spatial / stars

Spatiotemporal Arrays, Raster and Vector Data Cubes
https://r-spatial.github.io/stars/
Apache License 2.0
563 stars 94 forks source link

Bulk update to support CFtime as time management (issue #674) #690

Closed pvanlaake closed 1 month ago

pvanlaake commented 5 months ago

This is a PR for integrating CFtime into stars to manage "time" dimensions of NetCDF files (and any others that provide the required metadata). This is a very comprehensive "patch" over >30 files and I am happy to discuss the details if so desired.

One of the reasons why the job got a little out of hand is that CFtime works fundamentally different from the way that other dimensions are treated in stars. In short, stars has a reference to a CFtime instance which holds the actual dimension data. A call like expand_dimensions() will return a reference to that instance rather than the actual dimension values. This necessitated additional conditional evaluation in several higher-level functions, of the kind if (methods::is(x, "CFtime") .... I have not fudged any tests or examples and wherever a test or an example referenced a "time" dimension the code works as expected.

I made CFtime an "Imports" because it is referenced basically throughout stars. (And it is the "t" in stars, after all, if the PR is accepted.)

In the end it all works fine and no errors are reported on clean() (there are plenty of messages on discrepancies of output compared to saved output in tests, but none of that seems to be related to the patch). There is a warning related to cubble, I do not know how severe this is:

   < ── Warning ('test-cubble.R:28:3'): cubble ──────────────────────────────────────
   < st_centroid assumes attributes are constant over geometries

Packages ncmeta and CFtime have both had recent upgrades and are on CRAN. This PR was synched with the main branch on 5 June so it is up-to-date as of this writing.

edzer commented 3 months ago

Here is the section of the test diffs that I get when running R CMD check stars_...:

* checking tests ...
  Running ‘aggregate.R’
  Comparing ‘aggregate.Rout’ to ‘aggregate.Rout.save’ ...139c139
< time    1  6     NA    NA     NA    NA 50,...,55    
---
> geometry    1  6     NA    NA     NA    NA 50,...,55    
  Running ‘align.R’
  Comparing ‘align.Rout’ to ‘align.Rout.save’ ... OK
  Running ‘area.R’
  Comparing ‘area.Rout’ to ‘area.Rout.save’ ... OK
  Running ‘crop.R’
  Comparing ‘crop.Rout’ to ‘crop.Rout.save’ ... OK
  Running ‘curvilinear.R’
  Comparing ‘curvilinear.Rout’ to ‘curvilinear.Rout.save’ ...68c68
< time    1   1 269481600     NA                               NULL    
---
> time    1   1 2018-07-17 UTC POSIXct                               NULL    
  Running ‘datasets.R’
  Comparing ‘datasets.Rout’ to ‘datasets.Rout.save’ ... OK
  Running ‘dimensions.R’
  Comparing ‘dimensions.Rout’ to ‘dimensions.Rout.save’ ...10,11d9
< > offsets <- CFtime::offsets(timeline)
< > 
15c13
< +     idx = offsets
---
> +     idx = st_get_dimension_values(raw, "time")
20c18
< +     values = timeline,
---
> +     values = st_get_dimension_values(raw, "time"),
29,35c27,30
<      from to offset  delta                    values x/y
< time    1 12     NA     NA 1999-01-31,...,1999-12-31    
< x       1 81    -85  0.125                      NULL [x]
< y       1 33  37.12 -0.125                      NULL [y]
< Warning message:
< In `[<-`(`*tmp*`, "values", value = values) :
<   implicit list embedding of S4 objects is deprecated
---
>      from to offset  delta  refsys                    values x/y
> time    1 12     NA     NA POSIXct 1999-01-31,...,1999-12-31    
> x       1 81    -85  0.125      NA                      NULL [x]
> y       1 33  37.12 -0.125      NA                      NULL [y]
38,39c33,34
< +   st_apply(MARGIN = c("x", "y"), FUN = foo, idx = offsets) %>%
< +   st_set_dimensions("foo", timeline, names = "time")
---
> +   st_apply(MARGIN = c("x", "y"), FUN = foo, idx = timeline) %>%
> +   st_set_dimensions("foo", st_dimensions(raw)["time"])
46,52c41,44
<      from to offset  delta                    values x/y
< time    1 12     NA     NA 1999-01-31,...,1999-12-31    
< x       1 81    -85  0.125                      NULL [x]
< y       1 33  37.12 -0.125                      NULL [y]
< Warning message:
< In `[<-`(`*tmp*`, "values", value = values) :
<   implicit list embedding of S4 objects is deprecated
---
>      from to offset  delta  refsys                    values x/y
> time    1 12     NA     NA POSIXct 1999-01-31,...,1999-12-31    
> x       1 81    -85  0.125      NA                      NULL [x]
> y       1 33  37.12 -0.125      NA                      NULL [y]
  Running ‘downsample.R’
  Comparing ‘downsample.Rout’ to ‘downsample.Rout.save’ ... OK
  Running ‘ee.R’
  Comparing ‘ee.Rout’ to ‘ee.Rout.save’ ... OK
  Running ‘extract.R’
  Comparing ‘extract.Rout’ to ‘extract.Rout.save’ ... OK
  Running ‘gridtypes.R’
  Comparing ‘gridtypes.Rout’ to ‘gridtypes.Rout.save’ ... OK
  Running ‘mdim.R’
  Comparing ‘mdim.Rout’ to ‘mdim.Rout.save’ ...33,41c33,34
<  [5] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": 'current' is not a POSIXt >"                                                
<  [6] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < Modes: list, NULL > >"                                         
<  [7] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < Lengths: 2, 0 > >"                                             
<  [8] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < names for target but not for current > >"                      
<  [9] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < current is not list-like > >"                                  
< [10] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": target is difftime, current is numeric >"                                    
< [11] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"refsys\": 1 string mismatch >"                                                        
< [12] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"point\": 'is.NA' value mismatch: 1 in current 0 in target >"                          
< [13] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"values\": target is NULL, current is CFtime >"                                        
---
> [5] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": 'tzone' attributes are inconsistent ('' and 'UTC') >"                       
> [6] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"point\": 'is.NA' value mismatch: 1 in current 0 in target >"                          
80,82c73,75
<    from to refsys point                    values
< X1    1  2 WGS 84  TRUE  POINT (0 1), POINT (3 2)
< X2    1  2 CFtime    NA 2023-02-25,...,2023-02-26
---
>    from to     offset  delta refsys point                   values
> X1    1  2         NA     NA WGS 84  TRUE POINT (0 1), POINT (3 2)
> X2    1  2 2023-02-25 1 days   Date    NA                     NULL
86,98c79
<  [1] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": Attributes: < Modes: list, NULL > >"                   
<  [2] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": Attributes: < Lengths: 1, 0 > >"                       
<  [3] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": Attributes: < names for target but not for current > >"
<  [4] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": Attributes: < current is not list-like > >"            
<  [5] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"offset\": target is Date, current is numeric >"                  
<  [6] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < Modes: list, NULL > >"                    
<  [7] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < Lengths: 2, 0 > >"                        
<  [8] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < names for target but not for current > >" 
<  [9] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": Attributes: < current is not list-like > >"             
< [10] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"delta\": target is difftime, current is numeric >"               
< [11] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"refsys\": 1 string mismatch >"                                   
< [12] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"point\": 'is.NA' value mismatch: 1 in current 0 in target >"     
< [13] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"values\": target is NULL, current is CFtime >"                   
---
> [1] "Attributes: < Component \"dimensions\": Component \"X2\": Component \"point\": 'is.NA' value mismatch: 1 in current 0 in target >"

In particular, there are instances where the refsys field is lost (but also some where it is set to CFtime), curvilinear.R gives a strange time output, there is a case where offset and delta got lost, and there's the frequent implicit list embedding of S4 objects is deprecated warning.

Are these issues that you will look into?