DrylandEcology / rSFSTEP2

R program that interfaces with the STEPWAT2 C code and runs in parallel for multiple sites, climate scenarios, disturbance regimes, and time periods

NA values generated for PPT_sd and NaN values generated for p_W_D within mkv_prob.in #137

Closed kpalmqui closed 5 years ago

kpalmqui commented 5 years ago

The mkv_prob.in files that were generated on the fly using rSFSTEP2 branch WeatherGeneratorInputFiles_from_rSOILWAT2 and STEPWAT2 branch Overhaul_WeatherGenerator contained periodic NA values for PPT_sd and a single NaN value for p_W_D for DOY 366.
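A quick way to locate the affected days is to scan the daily table for missing entries. The following is a minimal sketch in base R on a toy table: the column names `p_W_D` and `PPT_sd` come from this issue, while the `DOY` column, the made-up values, and the positions of the missing entries are assumptions for illustration only.

```r
# Toy stand-in for the daily Markov table; all values are made up.
mkv_doy <- data.frame(
  DOY = 1:366,
  p_W_D = rep(0.3, 366),
  PPT_sd = rep(0.5, 366)
)
mkv_doy[c(212, 277), "PPT_sd"] <- NA
mkv_doy[366, "p_W_D"] <- NaN

# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
missing_doys <- lapply(
  mkv_doy[c("p_W_D", "PPT_sd")],
  function(x) mkv_doy[["DOY"]][is.na(x)]
)
nan_doys <- mkv_doy[["DOY"]][is.nan(mkv_doy[["p_W_D"]])]

print(missing_doys)
print(nan_doys)
```

Because `is.na()` also flags `NaN`, a single check covers both kinds of missing value; `is.nan()` is only needed to tell them apart.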

I have documented this for 1 site (Site 7 of the 898 sites) for two climate scenarios (Current, RCP4.5.CanESM.50yrs). These outcomes were consistent for both climate scenarios.

mkv_prob_Site7_Current.txt

mkv_prob_Site7_RCP4.5.CanESM.50yrs.txt

These files were generated using ‘dbW_estimate_WGen_coefs’ in rSOILWAT2 branch enhancement_65_EstimateWeatherGeneratorCoefficients, thus I will open an issue there.

dschlaep commented 5 years ago

See https://github.com/DrylandEcology/STEPWAT2/commit/6d9a799f6f5d4f13831e2677b134ac1d9a5394f5 for an example of how to deal with insufficient weather data to estimate the parameters:

dt <- rSOILWAT2::getWeatherData_folders(".", "randomdata", "weath", 1980, 2010)
res <- rSOILWAT2::dbW_estimate_WGen_coefs(dt)
# Warning message:
#  Insufficient data to estimate values for n = 2 DOYs: 212, 277
res[["mkv_doy"]][212, "PPT_sd"] <- median(
  res[["mkv_doy"]][seq(212 - 5, 212 + 5), "PPT_sd"], na.rm = TRUE)
res[["mkv_doy"]][277, "PPT_sd"] <- median(
  res[["mkv_doy"]][seq(277 - 5, 277 + 5), "PPT_sd"], na.rm = TRUE)
rSOILWAT2::print_mkv_files(mkv_doy = res[["mkv_doy"]], mkv_woy = res[["mkv_woy"]], ".")
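
The example above hard-codes the two affected DOYs. A sketch of the same median-of-neighbors idea applied to every missing entry might look like the following; `fill_na_with_neighbor_median` is a hypothetical helper that is not part of rSOILWAT2, and the toy table stands in for `res[["mkv_doy"]]`.

```r
# Toy stand-in for res[["mkv_doy"]]; values are made up.
mkv_doy <- data.frame(DOY = 1:366, PPT_sd = rep(0.5, 366))
mkv_doy[c(212, 277), "PPT_sd"] <- NA

# Hypothetical helper (not part of rSOILWAT2): replace each NA/NaN with the
# median of the original values within +/- `span` rows, clamped to the ends
# of the vector. An entry stays NA if all of its neighbors are missing too.
fill_na_with_neighbor_median <- function(x, span = 5L) {
  out <- x
  for (i in which(is.na(x))) {
    ids <- seq(max(1L, i - span), min(length(x), i + span))
    out[i] <- median(x[ids], na.rm = TRUE)
  }
  out
}

mkv_doy[["PPT_sd"]] <- fill_na_with_neighbor_median(mkv_doy[["PPT_sd"]])
```

Unlike the hard-coded version, the window here is clamped at rows 1 and 366 rather than indexing past the table, so it does not wrap around the year.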

kpalmqui commented 5 years ago

@dschlaep OK. I opened an issue in rSOILWAT2 for the NA issue with PPT_sd, which you can close or delete.

dschlaep commented 5 years ago

@kpalmqui the function dbW_estimate_WGen_coefs now offers imputation, see https://github.com/DrylandEcology/rSOILWAT2/commit/f7ab247155c16f4d9503beb893362616b79aa574

You can choose between the mean of X neighbors (e.g., 5 in the example below) or LOCF (last observation carried forward); see the documentation of the function. That is, replace lines 16/17 of R_program/MarkovWeatherFileGenerator.R with

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      imputation_type = "mean5", na.rm = TRUE)

or

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      imputation_type = "locf", na.rm = TRUE)

or alternatively by a separate call to impute_df:

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      na.rm = TRUE)
if (any(sapply(res, anyNA))) {
  res <- sapply(res, rSOILWAT2::impute_df, imputation_type = "mean", span = 5L)
}
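
To illustrate how the two imputation strategies differ, here is a minimal base-R sketch on a short vector. `impute_mean` and `impute_locf` are hypothetical helpers written for this illustration; they are not the rSOILWAT2 implementation, which may handle edge cases (for example, wrapping around the year) differently.

```r
# Hypothetical helpers for illustration only; not the rSOILWAT2 code.

# Mean of the original values within +/- `span` positions of each gap.
impute_mean <- function(x, span = 5L) {
  out <- x
  for (i in which(is.na(x))) {
    ids <- seq(max(1L, i - span), min(length(x), i + span))
    out[i] <- mean(x[ids], na.rm = TRUE)
  }
  out
}

# Last observation carried forward; a leading NA has nothing to copy
# from and therefore stays NA.
impute_locf <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  x
}

x <- c(0.2, NA, 0.4, NA, NA, 0.9)
x_mean <- impute_mean(x, span = 1L)
x_locf <- impute_locf(x)
print(x_mean)
print(x_locf)
```

The mean variant draws on both sides of a gap, whereas LOCF only looks backward, so the two can give different values for runs of consecutive missing days.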

kpalmqui commented 5 years ago

@dschlaep this is great, many thanks!

kpalmqui commented 5 years ago

@dschlaep tried running within rSFSTEP2 with call:

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      imputation_type = "locf", na.rm = TRUE)

which failed:

Impute missing `mkv_prob` values for n = 10 DOYs: 196, 208, 214, 217, 220, 224, 225, 227, 248, 271
Impute missing `mkv_prob` values for n = 11 DOYs: 182, 190, 196, 208, 214, 217, 220, 225, 227, 248, 265
Impute missing `mkv_prob` values for n = 12 DOYs: 196, 208, 215, 216, 217, 225, 227, 228, 239, 248, 252, 265
Error in { : task 1 failed - "object 'imputation_span' not found"

However, the option imputation_type = "mean5" completed successfully.

dschlaep commented 5 years ago

@kpalmqui: sorry, I had to change the API again. The argument imputation_type only worked with "mean5" when the method was meant to be "mean".

--> this is now fixed with https://github.com/DrylandEcology/rSOILWAT2/pull/139/commits/c9923c1b39e867c0c09739d80862b191880ca53c, but if you want to use the mean imputation method, then you would need to use the new argument imputation_span also; that is, in case you wanted a value different than the default of 5.

For instance, mean imputation using ±5 periods:

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      imputation_type = "mean", na.rm = TRUE)

mean imputation using ±3 periods:

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      imputation_type = "mean", imputation_span = 3, na.rm = TRUE)

locf imputation: no changes:

res <- rSOILWAT2::dbW_estimate_WGen_coefs(sw_weatherList[[s]][[h]],
      imputation_type = "locf", na.rm = TRUE)

kpalmqui commented 5 years ago

@dschlaep got it! will update rSFSTEP2 accordingly. Thanks much!

kpalmqui commented 5 years ago

Testing with rSFSTEP2 for 1 site and two climate scenarios with the updates to function dbW_estimate_WGen_coefs (https://github.com/DrylandEcology/rSOILWAT2/commit/c9923c1b39e867c0c09739d80862b191880ca53c):

imputation_type = "mean" : COMPLETED SUCCESSFULLY
imputation_type = "locf" : COMPLETED SUCCESSFULLY
imputation_type = "mean", imputation_span = 8 : COMPLETED SUCCESSFULLY

@dschlaep

Will now complete additional testing on Teton with more sites.

kpalmqui commented 5 years ago

Test runs for 7 sites for two climate scenarios have all completed without issue. These sites span dry to wet conditions.

kpalmqui commented 5 years ago

This issue has been resolved by new functionality incorporated into rSOILWAT2: NA and NaN values are now handled according to the imputation_type that is specified. Resolved by https://github.com/DrylandEcology/rSOILWAT2/commit/c9923c1b39e867c0c09739d80862b191880ca53c and preceding commits.