EvolEcolGroup / tidysdm

R package to fit species distribution models (SDMs) using the 'tidymodels' framework
https://evolecolgroup.github.io/tidysdm/
GNU Affero General Public License v3.0
28 stars 8 forks source link

issue with "sample_pseudoabs_time" #60

Closed zpmdal closed 4 weeks ago

zpmdal commented 4 weeks ago

This is not really about bug, but about data format issue. I don't have problem following the example of "Application with palaeodata". But when I suppled my own data with similar format I ran into the error below. I guess the problem is with my raster = SSTSpatRastDataset.

When I print the object print(SSTSpatRastDataset), I got the following output: class : SpatRasterDataset subdatasets : 4 dimensions : 1104, 2203 (nrow, ncol) nlyr : 4, 4, 4, 4 resolution : 0.01, 0.01 (x, y) extent : -73.025, -50.995, 40.975, 52.015 (xmin, xmax, ymin, ymax) coord. ref. : lon/lat WGS 84 (EPSG:4326) source(s) : crop_Spr20_av20i.tif, crop_Spr21_av21i.tif, crop_Spr22_av22i.tif, crop_Spr23_av23i.tif, crop_sss_m_2020.tif, crop_sss_m_2021.tif, ... names : SprSST, sssM, yrMaxSST, angws

I have 4 time points: time(SSTSpatRastDataset):

time(SSTSpatRastDataset) $SprSST [1] 2020 2021 2022 2023

$sssM [1] 2000 2021 2022 2023

$yrMaxSST [1] 2020 2021 2022 2023

$angws [1] 2020 2021 2022 2023

When I did the same for the example raster (climate_full), I got the following: class : SpatRasterDataset subdatasets : 3 dimensions : 49, 87 (nrow, ncol) nlyr : 5, 5, 5 resolution : 1, 1 (x, y) extent : -27, 60, 33, 82 (xmin, xmax, ymin, ymax) coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) source(s) : memory names : bio01, bio10, bio12

and time(climate_full): $bio01 [1] -18050 -13050 -8050 -3050 1950

$bio10 [1] -18050 -13050 -8050 -3050 1950

$bio12 [1] -18050 -13050 -8050 -3050 1950

So, my question is what is the correct way to build the SpatRasterDataset? Here is how I built the SpatRasterDataset: var1_2020 = rast(var1_2020.tif); time(var1_2020)=2020 var1_2021 = rast(var1_2021.tif); time(var1_2021) = 2021 ... var1 = c(var1_2020, va1_2021...var1_2023) var2 = c(var2_2020, var2_2021...var2_2023)

SSTsRastDataset = sds(var1, var2...)

Ascidiella_aspersa <- sample_pseudoabs_time(Ascidiella_aspersa,
+                                             n_per_presence = 3,
+                                             raster = SSTSpatRastDataset,
+                                             time_col = "Year",
+                                             lubridate_fun = pastclim::ybp2date,
+                                             method = c("dist_min", km2m(5)))
Error in Math.POSIXt(a - b) : 'abs' not defined for "POSIXt" objects
In addition: Warning message:
In out_of_range_warning(time_lub, time_steps) :
  Some dates are out of the range of the available time series.
They will be assigned to the most extreme time point available, but this
might not make sense. The potentially problematic dates are:
3970-01-013970-01-013970-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013970-01-013970-01-013970-01-013970-01-013970-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013971-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013972-01-013973-01-01
# insert the output of `utils::sessionInfo()`
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Halifax
tzcode source: internal

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.3      forcats_1.0.0        stringr_1.5.1       
 [4] tidyverse_2.0.0      raster_3.6-30        sp_2.1-4            
 [7] randomForest_4.7-1.2 earth_5.3.4          plotmo_3.6.4        
[10] plotrix_3.8-4        Formula_1.2-5        gbm_2.2.2           
[13] gam_1.22-5           foreach_1.5.2        mda_0.5-4           
[16] class_7.3-22         rpart_4.1.23         nnet_7.3-19         
[19] biomod2_4.2-6-1      mgcv_1.9-1           nlme_3.1-164        
[22] ranger_0.16.0        maxnet_0.1.4         xgboost_1.7.8.1     
[25] readr_2.1.5          rgbif_3.8.1          tidyterra_0.6.1     
[28] pastclim_2.1.0       terra_1.7-83         sf_1.0-18           
[31] tidysdm_0.9.6.9003   spatialsample_0.6.0  yardstick_1.3.1     
[34] workflowsets_1.1.0   workflows_1.1.4      tune_1.2.1          
[37] tidyr_1.3.1          tibble_3.2.1         rsample_1.2.1       
[40] recipes_1.1.0        purrr_1.0.2          parsnip_1.2.1       
[43] modeldata_1.4.0      infer_1.0.7          ggplot2_3.5.1       
[46] dplyr_1.1.4          dials_1.3.0          scales_1.3.0        
[49] broom_1.0.7          tidymodels_1.2.0    

loaded via a namespace (and not attached):
  [1] later_1.3.2            urltools_1.7.3         triebeard_0.4.1       
  [4] hardhat_1.4.0          pROC_1.18.5            lifecycle_1.0.4       
  [7] usdm_2.1-7             globals_0.16.3         lattice_0.22-6        
 [10] vroom_1.6.5            MASS_7.3-60.2          backports_1.5.0       
 [13] magrittr_2.0.3         remotes_2.5.0          httpuv_1.6.15         
 [16] sessioninfo_1.2.2      pkgbuild_1.4.4         DBI_1.2.3             
 [19] abind_1.4-8            pkgload_1.4.0          ipred_0.9-15          
 [22] lava_1.8.0             overlapping_2.1        listenv_0.9.1         
 [25] crul_1.5.0             testthat_3.2.1.1       units_0.8-5           
 [28] parallelly_1.38.0      commonmark_1.9.2       ncdf4_1.23            
 [31] PresenceAbsence_1.1.11 codetools_0.2-20       ggtext_0.1.2          
 [34] xml2_1.3.6             tidyselect_1.2.1       shape_1.4.6.1         
 [37] httpcode_0.3.0         farver_2.1.2           jsonlite_1.8.9        
 [40] e1071_1.7-16           ellipsis_0.3.2         survival_3.6-4        
 [43] iterators_1.0.14       tools_4.4.1            Rcpp_1.0.13           
 [46] glue_1.8.0             prodlim_2024.06.25     xfun_0.48             
 [49] usethis_3.0.0          withr_3.0.1            fastmap_1.2.0         
 [52] fansi_1.0.6            digest_0.6.37          timechange_0.3.0      
 [55] R6_2.5.1               mime_0.12              colorspace_2.1-1      
 [58] wk_0.9.4               markdown_1.13          utf8_1.2.4            
 [61] generics_0.1.3         data.table_1.16.2      prettyunits_1.2.0     
 [64] httr_1.4.7             htmlwidgets_1.6.4      whisker_0.4.1         
 [67] pkgconfig_2.0.3        gtable_0.3.5           timeDate_4041.110     
 [70] GPfit_1.0-8            furrr_0.3.1            brio_1.1.5            
 [73] htmltools_0.5.8.1      profvis_0.4.0          gower_1.0.1           
 [76] corrplot_0.95          rstudioapi_0.17.0      reshape2_1.4.4        
 [79] tzdb_0.4.0             curl_5.2.3             proxy_0.4-27          
 [82] cachem_1.1.0           DALEX_2.4.3            KernSmooth_2.23-24    
 [85] parallel_4.4.1         miniUI_0.1.1.1         warp_0.2.1            
 [88] s2_1.1.7               pillar_1.9.0           grid_4.4.1            
 [91] reshape_0.8.9          vctrs_0.6.5            urlchecker_1.0.1      
 [94] promises_1.3.0         xtable_1.8-4           lhs_1.2.0             
 [97] oai_0.4.0              cli_3.6.3              compiler_4.4.1        
[100] rlang_1.1.4            crayon_1.5.3           future.apply_1.11.2   
[103] labeling_0.4.3         classInt_0.4-10        plyr_1.8.9            
[106] fs_1.6.4               stringi_1.8.4          viridisLite_0.4.2     
[109] munsell_0.5.1          lazyeval_0.2.2         devtools_2.4.5        
[112] glmnet_4.1-8           Matrix_1.7-0           hms_1.1.3             
[115] patchwork_1.3.0        bit64_4.5.2            future_1.34.0         
[118] shiny_1.9.1            gridtext_0.1.5         slider_0.3.1          
[121] memoise_2.0.1          bit_4.5.0              DiceDesign_1.10   
dramanica commented 4 weeks ago

It looks like there is mismatch in dates between your presence tibble and the raster. If you could create a small reprex (cut down your raster to a smaller area and only have a few presences, so that you can share small-ish files, we can help you fix the problem).

zpmdal commented 4 weeks ago

Thanks for your quick response. I saved the SpatRastDataset as an RDS file (zipped before uploading) without any cutting down, as the area is not very big (Nova Scotian Shelf).

zpmdal commented 4 weeks ago

This is the presence dataset.

dramanica commented 4 weeks ago

Ok, so there were two issues. First, the time units of the time axis of the raster were not defined (which caused the function to crash). Second, when converting dates from the presence tibble, you need to use a different function, as you are not working with time BP (but time CE). Here is a little sample code that works on my laptop:

nscotia<-readRDS("~/Downloads/tidysdm_bug/SSTSpatRastDataset.Rds")
ncscotia <- terra::unwrap (nscotia)
time(nscotia[1], tstep="years")<-time(nscotia[1])
time(nscotia[2], tstep="years")<-time(nscotia[2])
time(nscotia[3], tstep="years")<-time(nscotia[3])
time(nscotia[4], tstep="years")<-time(nscotia[4])

pres <- read.csv("~/Downloads/tidysdm_bug/ASA_Contemp.csv")

foo <- sample_pseudoabs_time(pres,
 n_per_presence = 3,
 raster = nscotia,
time_col = "Year",
 lubridate_fun = lubridate::date_decimal,
  method = c("dist_min", km2m(5)))

Note the use of lubridate::date_decimal to convert your time in years CE to a date object.

I have also modified tidysdm to be a bit more helpful. So, if you run the dev version of sample_pseudoabs_time() on your raster without setting the time units, you would now get:

Error in sample_pseudoabs_time(pres, n_per_presence = 3, raster = nscotia,  : 
  the units of the time axis of the raster are not defined;
when using terra::time() use either POSIXct or Dates, or set tstep to 'years'

which should be a bit more informative than the ugly crash that you experienced. Let me know if this fixes your problem and we can close this issue.

zpmdal commented 4 weeks ago

Thanks! You solved my problem. Really appreciated your great work!