Closed mitchellmanware closed 3 months ago
@mitchellmanware In process_nlcd, I use terra::metags to record the year. Since the GMTED and SEDAC data were each produced in a single year, the year could be added in a similar way with the value hard-coded. If we expect future updates to these datasets, it would be good to add a year argument to the process_* functions.
https://github.com/NIEHS/amadeus/blob/037412812aa82de7443b5dde7bd3600308310807/R/process.R#L680
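The tagging approach described above could look roughly like the sketch below. It is only an illustration: tag_year is a hypothetical helper, not an amadeus function, the small in-memory raster stands in for a GMTED layer, and the default year of 2010 is an assumption taken from this discussion.

```r
library(terra)

# Hypothetical helper (not part of amadeus): attach a fixed year to a raster
# as a metadata tag. The default year is an assumption for illustration.
tag_year <- function(x, year = 2010L) {
  # `metags<-` accepts "name=value" strings
  terra::metags(x) <- paste0("year=", year)
  x
}

# Small in-memory raster standing in for a processed GMTED layer
r <- terra::rast(nrows = 2, ncols = 2, vals = 1:4)
r <- tag_year(r)

# Inspect the tags stored on the object
terra::metags(r)
```

Exposing year as an argument with a default keeps the hard-coded value in one place and leaves room for future dataset updates.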
@sigmafelix When reviewing the process_ and calc_ functions, I noticed that some of the calc_ functions you created accept only SpatVector or sf objects as locations.
For example, from calc_ecoregion:
calc_ecoregion <-
  function(
    from = NULL,
    locs,
    locs_id = "site_id",
    ...
  ) {
    if (!methods::is(locs, "SpatVector")) {
      locs <- terra::vect(locs)
    }
Is there a reason you do not use the process_conformity function to accept SpatVector, sf, and data.frame alike?
Could be:
calc_ecoregion <-
  function(
    from = NULL,
    locs,
    locs_id = "site_id",
    ...
  ) {
    if (!methods::is(locs, "SpatVector")) {
      locs <- process_conformity(locs = locs)
    }
to accept all three classes.
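To make the idea concrete, here is a minimal sketch of a conversion step that accepts all three classes. as_spatvector is a hypothetical stand-in, not the actual process_conformity implementation, and the "lon"/"lat" column names and the EPSG:4326 default are assumptions.

```r
library(terra)

# Hypothetical stand-in for process_conformity(); the real function may differ.
as_spatvector <- function(locs, crs = "EPSG:4326") {
  if (inherits(locs, "SpatVector")) {
    return(locs)
  }
  # Check sf before data.frame, because sf objects are also data.frames
  if (inherits(locs, "sf")) {
    return(terra::vect(locs))
  }
  if (is.data.frame(locs)) {
    # Assumes coordinate columns named "lon" and "lat"
    stopifnot(all(c("lon", "lat") %in% names(locs)))
    return(
      terra::vect(locs, geom = c("lon", "lat"), crs = crs, keepgeom = TRUE)
    )
  }
  stop("locs must be a SpatVector, sf, or data.frame object.")
}

site <- data.frame(site_id = "A", lon = -77.576, lat = 39.40)
site_vect <- as_spatvector(site)
class(site_vect)
```

Centralizing the conversion in one helper would let every calc_ function share the same input handling instead of repeating class checks.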
See commit 062f448623296627bf3c6b0b4b96f015ac83c8ea.
A year/range metadata tag has been added to the GMTED, groads, population, and Koppen Geiger process_* functions, and a $time column to their calc_ functions. For GMTED and SEDAC population, a single year is returned (always 2010 for GMTED, and the user-selected year for population).
For the SEDAC groads, Koppen Geiger, and ecoregions functions, I have added the year range covered, as indicated by the datasets' descriptions. For example, the SEDAC groads data cover the period from 1980 to 2010, so that range is added as a metadata tag and covariate column.
> ### sedac groads
> g <- process_sedac_groads(
+ path = "tests/testdata/groads_test.shp"
+ )
> calc_sedac_groads(
+ g,
+ l,
+ "id"
+ )
                id        time GRD_TOTAL_0_01000 GRD_DENKM_0_01000
1 3799900018810101 1980 - 2010          1.762476         0.5633273
> ### koppen geiger
> k <- process_koppen_geiger(
+ path = "tests/testdata/koppen_subset.tif"
+ )
> terra::metags(k)
         year
"1980 - 2016"
> calc_koppen_geiger(
+ k,
+ l,
+ "id"
+ )
                id        time DUM_CLRGA_0_00000 DUM_CLRGB_0_00000 DUM_CLRGC_0_00000 DUM_CLRGD_0_00000 DUM_CLRGE_0_00000
1 3799900018810101 1980 - 2016                 0                 0                 1                 0                 0
> ### ecoregions
> e <- process_ecoregion(
+ path = "tests/testdata/eco_l3_clip.gpkg"
+ )
> site_faux <-
+ data.frame(
+ site_id = "37999109988101",
+ lon = -77.576,
+ lat = 39.40,
+ date = as.Date("2022-01-01")
+ )
> site_faux <-
+ terra::vect(
+ site_faux,
+ geom = c("lon", "lat"),
+ keepgeom = TRUE,
+ crs = "EPSG:4326")
> site_faux <- terra::project(site_faux, "EPSG:5070")
> calc_ecoregion(
+ e,
+ site_faux,
+ "site_id"
+ )
         site_id        time DUM_E2083_0_00000 DUM_E3064_0_00000
1 37999109988101 1997 - 2024                 1                 1
>
Although this does not conform to the normal values in the $time column, at least it is consistent with the original dataset.
HUC and OpenLandMap are the only datasets that do not include some sort of time information.
@mitchellmanware I think the time field is supposed to work as one of the keys. In the demonstration above, the time field looks like a description of the period represented in the source dataset. An advantage of using the time field as a key is that users will be able to join multiple calc_* results on common keys. Could we move the source data description into a separate field named, for example, description?
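A small base R sketch of the join concern raised above. All three data frames are made-up stand-ins for calc_* outputs (the column names and values are hypothetical); the point is only that a range string in time stops the rows from matching on the shared keys.

```r
# Hypothetical calc_* outputs that share `id` and `time` as keys
topo <- data.frame(id = "1", time = "2010", elevation = 120.5)
pop <- data.frame(id = "1", time = "2010", population = 3500)

# Hypothetical output whose `time` holds a source-data range instead of a key
roads <- data.frame(id = "1", time = "1980 - 2010", road_km = 1.76)

# Matching keys: the single site joins cleanly (one row)
joined <- merge(topo, pop, by = c("id", "time"))
nrow(joined)

# The range string matches no key value, so the inner join drops the site
mismatch <- merge(joined, roads, by = c("id", "time"))
nrow(mismatch)

# With the range moved to a `description` column, a join on `id` keeps the row
roads_fixed <- data.frame(id = "1", description = "1980 - 2010", road_km = 1.76)
by_id <- merge(joined, roads_fixed, by = "id")
names(by_id)
```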
@sigmafelix Yes, that makes sense. I will update.
Update
> e <- process_ecoregion(
+ path = "tests/testdata/eco_l3_clip.gpkg"
+ )
> site_faux <-
+ data.frame(
+ id = "1",
+ lon = -77.576,
+ lat = 39.40,
+ date = as.Date("2022-01-01")
+ )
> site_faux <- terra::vect(site_faux, crs = "EPSG:4326")
> site_proj <- terra::project(site_faux, terra::crs(e))
> calc_ecoregion(
+ e,
+ site_proj,
+ "id"
+ )
  id description DUM_E2083_0_00000 DUM_E3064_0_00000
1  1 1997 - 2024                 1                 1
Review "static" dataset functions (e.g., GMTED, population, groads, NLCD) to ensure time information is returned. All objects should carry time information, even if the underlying data are not frequently updated. This may require hard-coding the year.